Milliseconds to Match: Criteo's AdTech AI & the Future of Commerce w/ Diarmuid Gill & Liva Ralaivola
Episode · 87 min · 3-min read · Topic: Artificial Intelligence
AI-Generated Summary
Key Takeaways
- ✓Real-time bidding architecture: Criteo pre-computes user and product embeddings offline, reducing runtime inference to a vector similarity comparison executed in milliseconds. The system ingests product data from 17,000 retailers daily—sometimes multiple times per day—ensuring pricing, stock levels, and catalog accuracy that static LLM training data cannot provide. This hybrid of offline computation and live data refresh is the core technical moat enabling millisecond-scale ad decisions across billions of daily transactions.
- ✓Foundation model strategy: Rather than building one monolithic model, Criteo operates three to four specialized foundation models that generate embeddings for products, user timelines, and contextual signals separately. These embeddings are made available company-wide as reusable inputs, allowing new product teams to warm-start models instead of training from scratch. A recent internal hackathon validated this approach, with multiple teams achieving faster performance gains by plugging into existing embedding infrastructure rather than building new feature pipelines.
- ✓Feature evolution from sparse to dense: Criteo's modeling progressed from sparse binary vectors of up to 2^20 dimensions fed into logistic regression, to dense embeddings of 200–1,000 dimensions computed automatically via their proprietary Deep KNN algorithm. This shift eliminated manual feature engineering, which became unsustainable as cookie signals and data sources changed. The AI Lab, founded in 2018 specifically to drive this transition, now publishes the methodology publicly, including training loss functions and model architectures, in academic papers and technical blogs.
- ✓LLM partnership fills a specific gap: LLMs excel at general reasoning and natural language product queries but become stale immediately after training—missing flash sales, stock outages, and price changes. Criteo's OpenAI partnership addresses this by routing product queries through Criteo's live commerce data layer via the Model Context Protocol (MCP), giving ChatGPT accurate real-time inventory context. The emerging agentic protocol standard makes this integration significantly easier than previous surface-by-surface API customization, reducing deployment complexity across chat interfaces and web surfaces simultaneously.
- ✓Privacy architecture as competitive advantage: Criteo stores no personally identifiable information—only anonymous random cookie IDs paired with behavioral signals like product views and purchase history, roughly 150 features per profile. Built under European GDPR constraints from inception, Criteo applies the same privacy-compliant tech stack globally rather than maintaining separate regional systems. This single-stack approach means US users receive the same data handling as EU users, and Criteo pioneered the AdChoices opt-out icon before regulatory mandates required it.
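The precompute-then-compare pattern from the first takeaway can be sketched in a few lines. This is a minimal illustration, not Criteo's implementation: the cache contents, IDs, and four-dimensional vectors are invented, and the point is only that bid-time work reduces to a cosine-similarity lookup over vectors computed offline.

```python
import numpy as np

# Hypothetical caches (all names and values invented): offline jobs populate
# these; at bid time the only work left is a similarity comparison.
user_embeddings = {"cookie_ab12": np.array([0.12, -0.40, 0.88, 0.05])}
product_embeddings = {
    "sku_shoe_1": np.array([0.10, -0.35, 0.90, 0.00]),
    "sku_tent_7": np.array([-0.70, 0.20, -0.10, 0.60]),
}

def best_product(user_id: str) -> str:
    """Return the cached product whose embedding is most similar to the user's."""
    u = user_embeddings[user_id]
    u = u / np.linalg.norm(u)

    def score(vec: np.ndarray) -> float:
        # Cosine similarity between the normalized user and product vectors.
        return float(u @ (vec / np.linalg.norm(vec)))

    return max(product_embeddings, key=lambda sku: score(product_embeddings[sku]))

print(best_product("cookie_ab12"))  # the shoe vector is closest here
```

In a production-scale version the `max` over a dictionary would be replaced by an approximate nearest-neighbor index, but the division of labor—heavy embedding computation offline, cheap comparison online—is the same.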
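The warm-start idea from the foundation-model takeaway—new teams consuming company-wide embeddings as ready-made inputs—can be sketched as follows. Everything here is a stand-in: the embedding "service" functions, dimensions, and the linear head are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def get_user_embedding(user_id: str) -> np.ndarray:
    """Stand-in for a shared foundation-model embedding service (invented)."""
    return rng.normal(size=128)

def get_product_embedding(sku: str) -> np.ndarray:
    """Stand-in for the shared product-embedding service (invented)."""
    return rng.normal(size=128)

# The new team's model stays tiny: a linear head over concatenated embeddings,
# rather than a full feature pipeline trained from raw logs.
w = rng.normal(scale=0.01, size=256)

def click_score(user_id: str, sku: str) -> float:
    """Sigmoid of a linear score over the two reused embeddings."""
    x = np.concatenate([get_user_embedding(user_id), get_product_embedding(sku)])
    return 1.0 / (1.0 + np.exp(-(w @ x)))

print(click_score("cookie_ab12", "sku_shoe_1"))
```

The design choice this illustrates is the reuse boundary: the expensive representation learning happens once, upstream, and each downstream task only trains the small head.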
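The sparse-to-dense evolution described above starts from the classic setup: categorical signals hashed into a binary vector of up to 2^20 dimensions, fed to logistic regression. A minimal sketch of that older sparse side, with invented feature strings and random weights, looks like this:

```python
import hashlib
import numpy as np

DIM = 2**20  # sparse binary feature space of up to 2^20 dimensions

def sparse_indices(features: list[str]) -> list[int]:
    """Hash raw categorical features into active indices (the hashing trick)."""
    def h(f: str) -> int:
        return int(hashlib.md5(f.encode()).hexdigest(), 16) % DIM
    return sorted({h(f) for f in features})

def logistic_score(weights: np.ndarray, active: list[int]) -> float:
    """Logistic regression over a binary vector: only active indices contribute."""
    z = weights[active].sum()
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w = rng.normal(scale=0.01, size=DIM)  # one weight per sparse dimension
active = sparse_indices(["viewed:sku_123", "device:mobile", "country:fr"])
print(len(active), logistic_score(w, active))
```

The dense replacement swaps the 2^20-dimensional binary vector for a learned 200–1,000-dimensional embedding, removing the need to hand-design and maintain feature strings like the ones above.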
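The live-data gap the OpenAI partnership addresses can be illustrated with a toy tool handler. This is not the real MCP SDK or Criteo's data layer—the catalog, SKUs, and function are invented—but it shows the shape of the exchange: the assistant calls a tool, and the tool answers from a continuously refreshed feed rather than from training data.

```python
import json

# Invented stand-in for a continuously refreshed retailer feed.
LIVE_CATALOG = {
    "sku_shoe_1": {"price": 59.99, "in_stock": True, "on_sale": True},
    "sku_tent_7": {"price": 240.0, "in_stock": False, "on_sale": False},
}

def lookup_product(sku: str) -> str:
    """Tool handler: return live product state as JSON for the model's context."""
    record = LIVE_CATALOG.get(sku)
    if record is None:
        return json.dumps({"error": "unknown sku"})
    return json.dumps({"sku": sku, **record})

print(lookup_product("sku_shoe_1"))
```

Because the protocol standardizes how such tools are exposed, the same handler can serve a chat interface and a web surface without per-surface API work—which is the deployment simplification the takeaway describes.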
What It Covers
Criteo CTO Diarmuid Gill and AI Lab VP Liva Ralaivola explain how their ad tech platform processes over one billion user profiles in milliseconds using cached embeddings and multiple foundation models, while exploring how their OpenAI partnership combines real-time commerce data from 17,000 retailers with LLM reasoning to power next-generation product discovery.
Key Questions Answered
- •Generative creative democratizes long-tail advertising: Historically, mid-to-long-tail advertisers were excluded from high-quality creative campaigns due to production costs. Criteo's self-service product Criteo Gold, combined with generative AI partners like Waymark, now enables smaller advertisers to produce campaign-quality creative assets. Dynamic creative optimization assembles pre-generated visual assets at runtime rather than rendering full generative outputs live—current generative video latency remains too high for real-time ad serving, but the modular assembly approach bridges the gap until on-device rendering speeds improve within an estimated two to three years.
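The runtime-assembly approach in this bullet—combining pre-generated assets instead of rendering generative output live—can be sketched as a simple slot-filling step. The asset pools, context keys, and selection rules below are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Creative:
    headline: str
    image_id: str
    cta: str

# Pre-generated asset pools (invented examples), produced offline; nothing
# here is rendered at serve time, so latency stays low.
HEADLINES = {"sale": "Flash Sale: 40% Off", "new": "Just Dropped"}
IMAGES = {"shoe": "img_shoe_v3", "tent": "img_tent_v1"}
CTAS = {"buy": "Shop Now", "browse": "See More"}

def assemble(context: dict) -> Creative:
    """Pick one pre-rendered asset per slot based on runtime context."""
    headline = HEADLINES["sale" if context.get("on_sale") else "new"]
    image = IMAGES[context["category"]]
    cta = CTAS["buy" if context.get("high_intent") else "browse"]
    return Creative(headline, image, cta)

ad = assemble({"on_sale": True, "category": "shoe", "high_intent": True})
print(ad)
```

The modular structure is what bridges the latency gap: generation is slow and happens offline, while the serve-time path is a few dictionary lookups.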
Notable Moment
Liva Ralaivola proposed a future advertising model where users actively instruct their AI assistants to evaluate a fixed number of options—say, five shoes or five travel packages—and request curated ad exposure on their own terms. This reframes advertising not as interruption but as a user-initiated, agent-mediated discovery service, collapsing the boundary between search and advertising entirely.