Milliseconds to Match: Criteo's AdTech AI & the Future of Commerce w/ Diarmuid Gill & Liva Ralaivola
Episode · 87 min · 3-min read · Topic: Artificial Intelligence
AI-Generated Summary
Key Takeaways
- ✓Real-time bidding architecture: Criteo pre-computes user and product embeddings offline, reducing runtime inference to a vector similarity comparison executed in milliseconds. The system ingests product data from 17,000 retailers daily—sometimes multiple times per day—ensuring pricing, stock levels, and catalog accuracy that static LLM training data cannot provide. This hybrid of offline computation and live data refresh is the core technical moat enabling millisecond-scale ad decisions across billions of daily transactions.
- ✓Foundation model strategy: Rather than building one monolithic model, Criteo operates three to four specialized foundation models that generate embeddings for products, user timelines, and contextual signals separately. These embeddings are made available company-wide as reusable inputs, allowing new product teams to warm-start models instead of training from scratch. A recent internal hackathon validated this approach, with multiple teams achieving faster performance gains by plugging into existing embedding infrastructure rather than building new feature pipelines.
- ✓Feature evolution from sparse to dense: Criteo's modeling progressed from sparse binary vectors of up to 2^20 dimensions fed into logistic regression, to dense embeddings of 200–1,000 dimensions computed automatically via their proprietary Deep KNN algorithm. This shift eliminated manual feature engineering, which became unsustainable as cookie signals and data sources changed. The AI Lab, founded in 2018 specifically to drive this transition, now publishes the methodology publicly, including training loss functions and model architectures, in academic papers and technical blogs.
- ✓LLM partnership fills a specific gap: LLMs excel at general reasoning and natural language product queries but become stale immediately after training—missing flash sales, stock outages, and price changes. Criteo's OpenAI partnership addresses this by routing product queries through Criteo's live commerce data layer via the Model Context Protocol (MCP), giving ChatGPT accurate real-time inventory context. The emerging agentic protocol standard makes this integration significantly easier than previous surface-by-surface API customization, reducing deployment complexity across chat interfaces and web surfaces simultaneously.
- ✓Privacy architecture as competitive advantage: Criteo stores no personally identifiable information—only anonymous random cookie IDs paired with behavioral signals like product views and purchase history, roughly 150 features per profile. Built under European GDPR constraints from inception, Criteo applies the same privacy-compliant tech stack globally rather than maintaining separate regional systems. This single-stack approach means US users receive the same data handling as EU users, and Criteo pioneered the AdChoices opt-out icon before regulatory mandates required it.
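The precompute-then-compare pattern from the first takeaway can be sketched in a few lines. This is a minimal illustration, not Criteo's implementation: the cache contents, IDs, and four-dimensional vectors are invented, and the point is only that bid-time work reduces to a cosine-similarity lookup over vectors computed offline.

```python
import numpy as np

# Hypothetical caches (all names and values invented): offline jobs populate
# these; at bid time the only work left is a similarity comparison.
user_embeddings = {"cookie_ab12": np.array([0.12, -0.40, 0.88, 0.05])}
product_embeddings = {
    "sku_shoe_1": np.array([0.10, -0.35, 0.90, 0.00]),
    "sku_tent_7": np.array([-0.70, 0.20, -0.10, 0.60]),
}

def best_product(user_id: str) -> str:
    """Return the cached product whose embedding is most similar to the user's."""
    u = user_embeddings[user_id]
    u = u / np.linalg.norm(u)

    def score(vec: np.ndarray) -> float:
        # Cosine similarity between the normalized user and product vectors.
        return float(u @ (vec / np.linalg.norm(vec)))

    return max(product_embeddings, key=lambda sku: score(product_embeddings[sku]))

print(best_product("cookie_ab12"))  # the shoe vector is closest here
```

In a production-scale version the `max` over a dictionary would be replaced by an approximate nearest-neighbor index, but the division of labor—heavy embedding computation offline, cheap comparison online—is the same.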
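The warm-start idea from the foundation-model takeaway—new teams consuming company-wide embeddings as ready-made inputs—can be sketched as follows. Everything here is a stand-in: the embedding "service" functions, dimensions, and the linear head are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def get_user_embedding(user_id: str) -> np.ndarray:
    """Stand-in for a shared foundation-model embedding service (invented)."""
    return rng.normal(size=128)

def get_product_embedding(sku: str) -> np.ndarray:
    """Stand-in for the shared product-embedding service (invented)."""
    return rng.normal(size=128)

# The new team's model stays tiny: a linear head over concatenated embeddings,
# rather than a full feature pipeline trained from raw logs.
w = rng.normal(scale=0.01, size=256)

def click_score(user_id: str, sku: str) -> float:
    """Sigmoid of a linear score over the two reused embeddings."""
    x = np.concatenate([get_user_embedding(user_id), get_product_embedding(sku)])
    return 1.0 / (1.0 + np.exp(-(w @ x)))

print(click_score("cookie_ab12", "sku_shoe_1"))
```

The design choice this illustrates is the reuse boundary: the expensive representation learning happens once, upstream, and each downstream task only trains the small head.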
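The sparse-to-dense evolution described above starts from the classic setup: categorical signals hashed into a binary vector of up to 2^20 dimensions, fed to logistic regression. A minimal sketch of that older sparse side, with invented feature strings and random weights, looks like this:

```python
import hashlib
import numpy as np

DIM = 2**20  # sparse binary feature space of up to 2^20 dimensions

def sparse_indices(features: list[str]) -> list[int]:
    """Hash raw categorical features into active indices (the hashing trick)."""
    def h(f: str) -> int:
        return int(hashlib.md5(f.encode()).hexdigest(), 16) % DIM
    return sorted({h(f) for f in features})

def logistic_score(weights: np.ndarray, active: list[int]) -> float:
    """Logistic regression over a binary vector: only active indices contribute."""
    z = weights[active].sum()
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w = rng.normal(scale=0.01, size=DIM)  # one weight per sparse dimension
active = sparse_indices(["viewed:sku_123", "device:mobile", "country:fr"])
print(len(active), logistic_score(w, active))
```

The dense replacement swaps the 2^20-dimensional binary vector for a learned 200–1,000-dimensional embedding, removing the need to hand-design and maintain feature strings like the ones above.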
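The live-data gap the OpenAI partnership addresses can be illustrated with a toy tool handler. This is not the real MCP SDK or Criteo's data layer—the catalog, SKUs, and function are invented—but it shows the shape of the exchange: the assistant calls a tool, and the tool answers from a continuously refreshed feed rather than from training data.

```python
import json

# Invented stand-in for a continuously refreshed retailer feed.
LIVE_CATALOG = {
    "sku_shoe_1": {"price": 59.99, "in_stock": True, "on_sale": True},
    "sku_tent_7": {"price": 240.0, "in_stock": False, "on_sale": False},
}

def lookup_product(sku: str) -> str:
    """Tool handler: return live product state as JSON for the model's context."""
    record = LIVE_CATALOG.get(sku)
    if record is None:
        return json.dumps({"error": "unknown sku"})
    return json.dumps({"sku": sku, **record})

print(lookup_product("sku_shoe_1"))
```

Because the protocol standardizes how such tools are exposed, the same handler can serve a chat interface and a web surface without per-surface API work—which is the deployment simplification the takeaway describes.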
What It Covers
Criteo CTO Diarmuid Gill and AI Lab VP Liva Ralaivola explain how their ad tech platform processes over one billion user profiles in milliseconds using cached embeddings and multiple foundation models, while exploring how their OpenAI partnership combines real-time commerce data from 17,000 retailers with LLM reasoning to power next-generation product discovery.
Key Questions Answered
- •Generative creative democratizes long-tail advertising: Historically, mid-to-long-tail advertisers were excluded from high-quality creative campaigns due to production costs. Criteo's self-service product Criteo Gold, combined with generative AI partners like Waymark, now enables smaller advertisers to produce campaign-quality creative assets. Dynamic creative optimization assembles pre-generated visual assets at runtime rather than rendering full generative outputs live—current generative video latency remains too high for real-time ad serving, but the modular assembly approach bridges the gap until on-device rendering speeds improve within an estimated two to three years.
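The runtime-assembly approach in this bullet—combining pre-generated assets instead of rendering generative output live—can be sketched as a simple slot-filling step. The asset pools, context keys, and selection rules below are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Creative:
    headline: str
    image_id: str
    cta: str

# Pre-generated asset pools (invented examples), produced offline; nothing
# here is rendered at serve time, so latency stays low.
HEADLINES = {"sale": "Flash Sale: 40% Off", "new": "Just Dropped"}
IMAGES = {"shoe": "img_shoe_v3", "tent": "img_tent_v1"}
CTAS = {"buy": "Shop Now", "browse": "See More"}

def assemble(context: dict) -> Creative:
    """Pick one pre-rendered asset per slot based on runtime context."""
    headline = HEADLINES["sale" if context.get("on_sale") else "new"]
    image = IMAGES[context["category"]]
    cta = CTAS["buy" if context.get("high_intent") else "browse"]
    return Creative(headline, image, cta)

ad = assemble({"on_sale": True, "category": "shoe", "high_intent": True})
print(ad)
```

The modular structure is what bridges the latency gap: generation is slow and happens offline, while the serve-time path is a few dictionary lookups.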
Notable Moment
Liva Ralaivola proposed a future advertising model where users actively instruct their AI assistants to evaluate a fixed number of options—say, five shoes or five travel packages—and request curated ad exposure on their own terms. This reframes advertising not as interruption but as a user-initiated, agent-mediated discovery service, collapsing the boundary between search and advertising entirely.