How long is this episode of Eye on AI?

This episode is 55 minutes long. SignalCast provides an AI-generated summary so you can get the key insights in about 3 minutes.

Eye on AI

Your Child's Data Profile Starts Before They're Born | Eamonn Maguire of Proton

May 28, 2026

55 min episode · 2 min read

Eamonn Maguire

Topics

Fundraising & VC, Marketing, Artificial Intelligence

AI-Generated Summary

Published May 29, 2026

Key Takeaways

✓ Pre-birth data profiling: The moment a parent emails a gynecologist or fertility clinic using Gmail or Outlook, advertising platforms flag that household as expecting and begin building a child's profile before birth. Switching to end-to-end encrypted email such as ProtonMail at the start of a pregnancy keeps this data out of ad-targeting systems entirely.
✓ AI training data opacity: The entire English-language Wikipedia accounted for only 0.3% of GPT-2's training data. The remainder was scraped web pages, social media, and unattributed sources. Anthropic faced a $1.5 billion lawsuit for scanning thousands of purchased books and then discarding them to eliminate copyright paper trails, a pattern users should factor into trust decisions.
✓ Profile inference from minimal data: Three email sign-ups (Instagram, a political newsletter, and an AI publication) are sufficient for platforms to infer age, ideology, and interests, then expand the profile by serving targeted ads and measuring click behavior. Non-clicks on religious or political content are themselves used to fill profile gaps.
✓ Open vs. open-washed AI models: Proton's Lumo assistant deploys genuinely open models, including GLM 5.1, Qwen 3.5, and NVIDIA's Nemotron series, where training data, code, and architecture are all publicly verifiable. Models labeled open-source but with undisclosed training data, such as Meta's Llama, are described as examples of "open-washing" and carry the same trust risks as proprietary systems.
✓ Privacy-preserving AI within encrypted environments: Proton implements local indexing of Drive folders linked to Lumo projects, enabling retrieval-augmented generation without sending documents to external servers. Users can disable web search APIs entirely if their threat model requires it, and all chat history is end-to-end encrypted with user-held keys, making server-side data access structurally impossible.
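
The local-indexing idea in the last takeaway can be illustrated with a minimal sketch. This is not Proton's implementation. The names `LocalIndex` and `build_prompt`, the sample file names, and the use of plain term-count cosine similarity in place of a real embedding model are all assumptions made for illustration. The point is only that indexing and retrieval can run entirely on the user's device, so only the retrieved excerpt and the question need to reach the model.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LocalIndex:
    """Hypothetical in-memory index; nothing here leaves the local process."""

    def __init__(self) -> None:
        self.texts: dict[str, str] = {}
        self.vectors: dict[str, Counter] = {}

    def add(self, name: str, text: str) -> None:
        # Keep the raw text for prompting and term counts for ranking.
        self.texts[name] = text
        self.vectors[name] = Counter(tokenize(text))

    def search(self, query: str, k: int = 1) -> list[str]:
        """Return up to k document names with non-zero similarity, best first."""
        q = Counter(tokenize(query))
        scored = [(cosine(q, vec), name) for name, vec in self.vectors.items()]
        scored = [(s, n) for s, n in scored if s > 0]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:k]]


def build_prompt(index: LocalIndex, question: str, k: int = 1) -> str:
    """Assemble a prompt from locally retrieved excerpts plus the question."""
    names = index.search(question, k)
    context = "\n".join(f"[{n}] {index.texts[n]}" for n in names)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Sample documents (hypothetical) standing in for a linked folder.
index = LocalIndex()
index.add("lease.txt", "The lease for the apartment ends in March.")
index.add("recipe.txt", "Bake the cake for forty minutes.")
print(build_prompt(index, "How long to bake the cake?"))
```

In a real deployment the term-count vectors would likely be replaced by embeddings computed on the device, but the data flow is the same: documents are indexed locally, and only the selected excerpts are placed in the prompt.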

What It Covers

Eamonn Maguire of Proton explains how data profiling begins before a child is born, how AI models are trained on scraped data without consent, and how Proton's ecosystem — including Lumo AI, encrypted email, and the Born Private initiative — offers a structural alternative to surveillance-based platforms.

Notable Moment

Maguire describes how platforms actively probe unknown profile attributes — such as religion or political affiliation — by serving targeted ads and measuring non-clicks as data points. The absence of engagement is itself recorded, meaning passive scrolling still continuously fills gaps in a user's behavioral profile.

Books, tools, and gear mentioned in this episode

Tools

Proton Drive · Recommended by guest
by Proton
“Proton implements local indexing of Drive folders linked to Lumo projects, enabling retrieval-augmented generation without sending documents to external servers.”
Lumo AI · Recommended by guest
by Proton
“Proton's ecosystem — including Lumo AI, encrypted email, and the Born Private initiative — offers a structural alternative to surveillance-based platforms.”
ProtonMail · Recommended by guest
by Proton
“Switching to end-to-end encrypted email like ProtonMail at the start of a pregnancy prevents this data from entering ad-targeting systems entirely.”

Other

Born Private · Recommended by guest
by Proton
“Proton's ecosystem — including Lumo AI, encrypted email, and the Born Private initiative — offers a structural alternative to surveillance-based platforms.”
