Relational Foundation Models for Enterprise Data with Jure Leskovec - #768
Episode: 66 min
Read time: 3 min
Topics: Science & Discovery
AI-Generated Summary
Key Takeaways
- Multi-table vs. Single-table ML: The real performance gap in enterprise ML is not between XGBoost and deep learning on a single table; it is the information lost when relational databases are collapsed into flat tables. Aggregating transactions into summary statistics (mean, median, count) discards signal that graph neural networks recover by attending directly over the raw multi-table data, producing double-digit accuracy improvements on tasks like fraud detection and churn prediction.
- Relational Foundation Model (Zero-Shot Prediction): Kumo's RFM-2 makes accurate predictions on unseen databases and tasks without any model training. It uses in-context learning: the system extracts labeled subgraphs from historical data, passes them alongside an unlabeled target entity through a frozen transformer in a single forward pass, and returns a prediction in under half a second, with no backpropagation, hyperparameter tuning, or feature engineering.
- Benchmark Performance Numbers: On the RelBench and SAP SALT multi-table benchmarks, Kumo's foundation model outperforms the best supervised models by approximately 5% in relative accuracy with zero training. Fine-tuning on task-specific data pushes that gain to roughly 12% over state-of-the-art supervised baselines. Gains of that size translate to tens of millions of dollars in revenue impact in production recommender and fraud systems.
- Production Deployments at Scale: Reddit's advertising models built with Kumo achieved near double-digit increases in click-through rates, gains that typically take entire ML teams a full year to achieve incrementally. DoorDash uses Kumo for restaurant recommendations and notification timing. Coinbase runs fraud detection across the entire Bitcoin blockchain network, demonstrating that the system scales to transaction graphs of that size.
- Explainability via Differentiable Attention + LLM: Because the relational model is fully differentiable, running it backward reveals which specific tables, columns, and cells received attention during prediction. An LLM then converts this saliency map into human-readable text explanations. This approach produces more granular explanations than tree-based models, which only rank engineered features, and those features may themselves encode incomplete or biased assumptions about the data.
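The explainability idea in the last takeaway can be illustrated with a small sketch. It is not Kumo's implementation: it assumes a toy logistic model in place of the relational model, uses plain gradient magnitude as the saliency score, and omits the LLM step. The table names, column names, cell values, and weights are all hypothetical.

```python
import math

# Hypothetical input cells keyed by (table, column); values are illustrative.
cells = {
    ("transactions", "amount"): 2.0,
    ("transactions", "num_refunds"): 1.0,
    ("users", "account_age_years"): 0.5,
}
# Hypothetical fixed weights standing in for a trained differentiable model.
weights = {
    ("transactions", "amount"): 0.1,
    ("transactions", "num_refunds"): 1.5,
    ("users", "account_age_years"): -0.4,
}

def predict(cells, weights):
    """Forward pass: logistic score over the weighted cell values."""
    z = sum(weights[key] * value for key, value in cells.items())
    return 1.0 / (1.0 + math.exp(-z))

def saliency(cells, weights):
    """Backward pass: |d prediction / d cell| for every input cell."""
    p = predict(cells, weights)
    # Derivative of the logistic function is p * (1 - p); chain rule adds the weight.
    return {key: abs(p * (1.0 - p) * weights[key]) for key in cells}

scores = saliency(cells, weights)
# Cells ordered from most to least influential on this prediction.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup the ranking is over raw input cells rather than engineered features, which is the distinction the takeaway draws against tree-based feature importance.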
What It Covers
Jure Leskovec, Stanford professor and Kumo AI cofounder, presents relational deep learning as a fundamental shift in enterprise ML — moving from single-table feature engineering to multi-table graph-based neural networks, culminating in a foundation model that makes accurate predictions on any database without model training.
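A minimal sketch of why single-table feature engineering can lose information: two hypothetical customers with opposite spending trends collapse to the same flat feature row once their transactions are aggregated. The data and helper names are invented for illustration.

```python
from statistics import mean, median

# Hypothetical transaction amounts in time order (illustrative data only).
transactions = {
    "customer_a": [10, 50, 90],  # steadily increasing spend
    "customer_b": [90, 50, 10],  # steadily decreasing spend
}

def flatten(amounts):
    """Collapse a customer's transaction history into one flat feature row."""
    return {"mean": mean(amounts), "median": median(amounts), "count": len(amounts)}

def trend(amounts):
    """Last-minus-first amount: an order-dependent signal the aggregates drop."""
    return amounts[-1] - amounts[0]

flat_rows = {name: flatten(amounts) for name, amounts in transactions.items()}
trends = {name: trend(amounts) for name, amounts in transactions.items()}

# Both customers get the same flat row even though their histories differ.
same_flat_row = flat_rows["customer_a"] == flat_rows["customer_b"]
```

A model that sees only the flat rows cannot tell these customers apart, while a model that reads the raw transaction rows still can.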
Key Questions Answered
- Agent-Friendly API Design: When coding agents build ML pipelines from scratch in PyTorch, they produce thousands of lines of code with subtle data science errors, such as information leakage from incorrect time boundaries. Wrapping Kumo's capabilities in a high-level API reduces the same task to roughly 50 lines of error-free code. Shorter programs mean fewer agent reasoning steps and dramatically lower failure rates in autonomous ML workflows.
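The time-boundary leakage mentioned above is easy to introduce by hand. The following sketch shows one way to keep the boundary strict when building a training row; the event log, the `build_example` helper, and the feature names are hypothetical and are not part of any Kumo API.

```python
from datetime import date

# Hypothetical event log: (customer_id, event_date, amount). Illustrative only.
events = [
    ("c1", date(2024, 1, 5), 20.0),
    ("c1", date(2024, 2, 10), 35.0),
    ("c1", date(2024, 3, 15), 50.0),
    ("c2", date(2024, 1, 20), 15.0),
]

def build_example(customer_id, cutoff, horizon_end, events):
    """Build one training row with a strict time boundary at `cutoff`.

    Features use only events strictly before `cutoff`; the label uses only
    events in [cutoff, horizon_end). Mixing the two windows leaks the future.
    """
    past = [amount for cid, day, amount in events
            if cid == customer_id and day < cutoff]
    future = [amount for cid, day, amount in events
              if cid == customer_id and cutoff <= day < horizon_end]
    features = {"past_count": len(past), "past_total": sum(past)}
    label = len(future) > 0  # did the customer transact in the label window?
    return features, label

features, label = build_example("c1", date(2024, 3, 1), date(2024, 4, 1), events)
```

If the `day < cutoff` filter were dropped, the March transaction would appear in both the features and the label, which is the kind of subtle error the summary attributes to hand-written pipelines.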
Notable Moment
When pressed on whether the zero-shot foundation model claim was plausible, Leskovec acknowledged it sounds outlandish — then revealed that on completely held-out databases with tasks the model had never encountered, it still outperformed supervised models built by data scientists over several weeks of dedicated feature engineering and tuning.
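To make the zero-shot claim more concrete, here is a toy stand-in for in-context prediction. It assumes flat feature vectors instead of subgraphs and a single distance-based softmax-attention step instead of a pretrained transformer, so it shows only the shape of the idea: labeled context examples plus an unlabeled query go through a fixed function in one forward pass, with no parameter updates. All names and data are hypothetical.

```python
import math

def in_context_predict(context, query, temperature=1.0):
    """Predict a label probability for `query` from labeled context examples.

    One softmax-attention step with fixed (frozen) parameters: the query
    attends over the context feature vectors and averages their labels.
    No gradients are computed and nothing is updated.
    """
    # Attention logits: negative squared distance from the query to each context row.
    logits = [-sum((q - x) ** 2 for q, x in zip(query, feats)) / temperature
              for feats, _ in context]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    # Attention-weighted average of the context labels.
    return sum(w / total * label for w, (_, label) in zip(exps, context))

# Hypothetical labeled examples drawn from historical data: (features, label).
context = [
    ([0.1, 0.2], 0),
    ([0.2, 0.1], 0),
    ([0.9, 0.8], 1),
    ([0.8, 0.9], 1),
]

p_near_positive = in_context_predict(context, [0.85, 0.85])
p_near_negative = in_context_predict(context, [0.15, 0.15])
```

Swapping in a different `context` list changes the task without touching the function, which mirrors how the summary describes moving to an unseen database.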