Your Child's Data Profile Starts Before They're Born | Eamonn Maguire of Proton
Episode: 55 min
Read time: 2 min
Topics: Science & Discovery
AI-Generated Summary
Key Takeaways
- ✓Pre-birth data profiling: The moment a parent emails a gynecologist or fertility clinic using Gmail or Outlook, advertising platforms flag that household as expecting and begin building a child's profile before birth. Switching to end-to-end encrypted email like ProtonMail at the start of a pregnancy prevents this data from entering ad-targeting systems entirely.
- ✓AI training data opacity: The entire English-language Wikipedia accounted for only 0.3% of GPT-2's training data. The remainder was scraped web pages, social media, and unattributed sources. Maguire also points to the copyright lawsuit that Anthropic settled for $1.5 billion, which involved scanning thousands of purchased books and then discarding them, as a pattern users should factor into trust decisions.
- ✓Profile inference from minimal data: Three email sign-ups — Instagram, a political newsletter, and an AI publication — are sufficient for platforms to infer age, ideology, and interests, then expand the profile by serving targeted ads and measuring click behavior. Non-clicks on religious or political content are themselves used to fill profile gaps.
- ✓Open vs. open-washed AI models: Proton's Lumo assistant deploys genuinely open models (including GLM 5.1, Qwen 3.5, and NVIDIA's Nemotron series) where training data, code, and architecture are all publicly verifiable. Labeling a model open-source while keeping its training data undisclosed, as with Meta's Llama, is described as "open-washing"; such models carry the same trust risks as proprietary systems.
- ✓Privacy-preserving AI within encrypted environments: Proton implements local indexing of Drive folders linked to Lumo projects, enabling retrieval-augmented generation without sending documents to external servers. Users can disable web search APIs entirely if their threat model requires it, and all chat history is end-to-end encrypted with user-held keys, making server-side data access structurally impossible.
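The local-indexing idea in the last takeaway can be illustrated with a small sketch. This is not Proton's implementation: the function names, the toy documents, and the simple TF-IDF style scoring are all assumptions chosen for illustration. The point it shows is that both the index and the retrieval step can run entirely in local memory, so only the selected passage would ever need to be handed to a model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each document name to its term counts. Nothing leaves local memory."""
    return {name: Counter(tokenize(body)) for name, body in docs.items()}


def retrieve(index, query, k=1):
    """Return up to k document names ranked by a simple TF-IDF style score."""
    n_docs = len(index)
    scores = {}
    for name, counts in index.items():
        score = 0.0
        for term in set(tokenize(query)):
            # Document frequency: how many local documents contain the term.
            df = sum(1 for c in index.values() if term in c)
            if df and term in counts:
                score += counts[term] * math.log(1 + n_docs / df)
        scores[name] = score
    ranked = sorted(scores, key=lambda n: scores[n], reverse=True)
    # Documents with no matching term are never returned.
    return [n for n in ranked if scores[n] > 0][:k]


# Hypothetical folder contents, standing in for files linked to a project.
docs = {
    "lease.txt": "The apartment lease runs twelve months and rent is due monthly.",
    "recipe.txt": "Whisk the eggs, fold in flour, and bake for twenty minutes.",
}
index = build_index(docs)
context = retrieve(index, "when is rent due")
```

In a real retrieval-augmented setup the returned documents would be added to the prompt as context. A production index would more likely use embeddings than word counts, but the local-only data flow is the same.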
What It Covers
Eamonn Maguire of Proton explains how data profiling begins before a child is born, how AI models are trained on scraped data without consent, and how Proton's ecosystem — including Lumo AI, encrypted email, and the Born Private initiative — offers a structural alternative to surveillance-based platforms.
Notable Moment
Maguire describes how platforms actively probe unknown profile attributes — such as religion or political affiliation — by serving targeted ads and measuring non-clicks as data points. The absence of engagement is itself recorded, meaning passive scrolling still continuously fills gaps in a user's behavioral profile.
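Why a non-click carries information can be shown with a toy Bayesian update. This is a minimal sketch, not a description of any platform's actual system: the `update` function and both click rates are hypothetical values picked only to make the effect visible.

```python
def update(prior, clicked, p_click_if_true=0.10, p_click_if_false=0.01):
    """Bayes update of the belief that a user has some attribute.

    Both click rates are made-up assumptions: users with the attribute
    are taken to click the probing ad more often than users without it.
    """
    like_true = p_click_if_true if clicked else 1 - p_click_if_true
    like_false = p_click_if_false if clicked else 1 - p_click_if_false
    evidence = like_true * prior + like_false * (1 - prior)
    return like_true * prior / evidence


# Start with no idea whether the attribute applies, then show the
# probing ad twenty times and observe no clicks at all.
belief = 0.5
for _ in range(20):
    belief = update(belief, clicked=False)
```

Each ignored ad lowers the estimate slightly, so a run of non-clicks steadily fills the gap in the profile even though the user never interacted with anything.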