The TWIML AI Podcast

Relational Foundation Models for Enterprise Data with Jure Leskovec - #768

66 min episode · 3 min read

Topics

Science & Discovery

AI-Generated Summary

Key Takeaways

  • Multi-table vs. Single-table ML: The real performance gap in enterprise ML is not between XGBoost and deep learning on a single table — it's the information lost when collapsing relational databases into flat tables. Aggregating transactions into summary statistics (mean, median, count) discards signal that graph neural networks recover by attending directly over raw multi-table data, producing double-digit accuracy improvements on tasks like fraud detection and churn prediction.
  • Relational Foundation Model (Zero-Shot Prediction): Kumo's RFM-2 performs accurate predictions on unseen databases and tasks without any model training. It uses in-context learning: the system extracts labeled subgraphs from historical data, passes them alongside an unlabeled target entity through a frozen transformer in a single forward pass, and returns a prediction in under half a second — no backpropagation, no hyperparameter tuning, no feature engineering required.
  • Benchmark Performance Numbers: On the RelBench and SAP SALT multi-table benchmarks, Kumo's foundation model outperforms the best supervised models by approximately 5% relative accuracy with zero training. Fine-tuning on task-specific data raises that margin to roughly 12% over state-of-the-art supervised baselines. Gains of this size translate to tens of millions of dollars in revenue impact in production recommender and fraud systems.
  • Production Deployments at Scale: Reddit's advertising models built with Kumo achieved near double-digit increases in click-through rates — gains that typically take entire ML teams a full year to achieve incrementally. DoorDash uses Kumo for restaurant recommendations and notification timing. Coinbase runs fraud detection across the entire Bitcoin blockchain network, demonstrating the system scales to blockchain-sized transaction graphs.
  • Explainability via Differentiable Attention + LLM: Because the relational model is fully differentiable, running it backward reveals which specific tables, columns, and cells received attention during prediction. An LLM then converts this saliency map into human-readable text explanations. This approach produces more granular explanations than tree-based models, which only rank engineered features — features that may themselves encode incomplete or biased assumptions about the data.
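
The first takeaway, that summary statistics discard signal, can be illustrated with a minimal sketch. The data and the helper names `flatten` and `trend` are hypothetical and do not come from the episode. Two customers have the same mean, median, and count, yet their spending moves in opposite directions over time, which a flat feature table built from those aggregates cannot distinguish.

```python
from statistics import mean, median

# Hypothetical transaction amounts in time order (illustrative data only).
rising = [10, 50, 90]     # spending increases over time
declining = [90, 50, 10]  # same amounts, spending decreases over time

def flatten(amounts):
    """Collapse a transaction history into typical single-table summary features."""
    return {"mean": mean(amounts), "median": median(amounts), "count": len(amounts)}

def trend(amounts):
    """Last-minus-first amount: a crude order-aware signal the summary drops."""
    return amounts[-1] - amounts[0]

# The flattened rows are identical, so a single-table model sees no difference.
print(flatten(rising) == flatten(declining))
# The raw sequences still carry the distinguishing signal.
print(trend(rising), trend(declining))
```

A model that reads the raw transaction rows can in principle pick up the ordering; one that only sees the flattened row cannot.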

What It Covers

Jure Leskovec, Stanford professor and Kumo AI cofounder, presents relational deep learning as a fundamental shift in enterprise ML — moving from single-table feature engineering to multi-table graph-based neural networks, culminating in a foundation model that makes accurate predictions on any database without model training.
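
To make "predictions without model training" concrete, here is a toy sketch of in-context prediction. It is not Kumo's architecture: a fixed similarity-weighted average over labeled context examples stands in for the frozen transformer, plain feature vectors stand in for subgraphs, and the function name `in_context_predict` and the data are assumptions made for illustration. The point it shows is only that a prediction can be produced in one forward computation, with no parameter updates.

```python
import math

def in_context_predict(context, query, temperature=1.0):
    """Predict a label for `query` from labeled context examples, with no training.

    context: list of (feature_vector, label) pairs.
    query:   feature_vector for the unlabeled target entity.
    Weights are a softmax over negative squared distances (attention-like).
    """
    scores = [
        -sum((a - b) ** 2 for a, b in zip(features, query)) / temperature
        for features, _ in context
    ]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return sum(w * label for w, (_, label) in zip(weights, context)) / total

# Hypothetical labeled examples drawn from historical data (label 1.0 = churned).
context = [
    ([0.0, 0.0], 0.0),
    ([0.1, 0.0], 0.0),
    ([5.0, 5.0], 1.0),
    ([5.1, 5.0], 1.0),
]

# The target resembles the churned examples, so the score lands close to 1.
print(in_context_predict(context, [4.9, 5.0]))
```

Swapping in a different set of context examples changes the task without changing any model weights, which is the property the zero-shot claim relies on.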

Key Questions Answered

  • Agent-Friendly API Design: When coding agents attempt to build ML pipelines from scratch using PyTorch, they produce thousands of lines of code with subtle data science errors — such as information leakage from incorrect time boundaries. Wrapping Kumo's capabilities into a high-level API reduces the same task to roughly 50 lines of error-free code. Shorter code chains mean fewer agent reasoning steps and dramatically lower failure rates in autonomous ML workflows.
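
The time-boundary leakage mentioned above can be sketched in a few lines. The event log, the helper `features_as_of`, and the cutoff date are hypothetical, and this is not Kumo's API. It only shows the rule a pipeline has to enforce: features for a prediction made at a given time may use only events that occurred before that time.

```python
from datetime import datetime

# Hypothetical event log: (customer_id, timestamp, amount).
events = [
    ("c1", datetime(2024, 1, 5), 20.0),
    ("c1", datetime(2024, 2, 10), 35.0),
    ("c1", datetime(2024, 3, 15), 500.0),  # occurs after the prediction time
]

def features_as_of(events, customer_id, cutoff):
    """Aggregate only events strictly before `cutoff`, so no future data leaks in."""
    past = [
        amount
        for cid, ts, amount in events
        if cid == customer_id and ts < cutoff
    ]
    return {"count": len(past), "total": sum(past)}

cutoff = datetime(2024, 3, 1)

# Correct: features reflect only what was known at prediction time.
safe = features_as_of(events, "c1", cutoff)

# Leaky: ignoring the boundary lets the March event influence the features.
leaky = features_as_of(events, "c1", datetime.max)

print(safe, leaky)
```

A model trained on the leaky features would look better offline than it performs in production, which is why this kind of error is easy to miss in long generated pipelines.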

Notable Moment

When pressed on whether the zero-shot foundation model claim was plausible, Leskovec acknowledged it sounds outlandish — then revealed that on completely held-out databases with tasks the model had never encountered, it still outperformed supervised models built by data scientists over several weeks of dedicated feature engineering and tuning.
