Some thoughts on the Sutton interview
Episode: 11 min
Read time: 2 min
Topics: Productivity, Startups, Software Development
AI-Generated Summary
Key Takeaways
- Compute efficiency critique: LLMs spend most of their compute during deployment, where they learn nothing. They learn only during training, and do so inefficiently, on data equivalent to tens of thousands of years of human experience.
- Imitation learning as foundation: Pretrained LLMs serve as essential priors for reinforcement learning, much as AlphaGo learned from human games before AlphaZero bootstrapped from scratch to superhuman performance.
- Continual learning gap: During RL, current LLMs learn approximately one bit per episode, and an episode runs to tens of thousands of tokens. Animals, by contrast, extract maximum signal continuously from environmental observations.
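To put the "one bit per episode" figure in perspective, here is a minimal arithmetic sketch. The summary gives only "approximately one bit" and "tens of thousands of tokens", so the episode lengths below and the helper name bits_per_token are illustrative assumptions, not figures from the episode.

```python
# Rough arithmetic sketch. The episode lengths are hypothetical stand-ins
# for "tens of thousands of tokens"; the one-bit figure comes from the summary.

def bits_per_token(bits_per_episode: float, tokens_per_episode: int) -> float:
    """Average learning signal per generated token in one RL episode."""
    if tokens_per_episode <= 0:
        raise ValueError("tokens_per_episode must be positive")
    return bits_per_episode / tokens_per_episode


if __name__ == "__main__":
    for tokens in (10_000, 30_000, 90_000):
        signal = bits_per_token(1.0, tokens)
        print(f"{tokens:>6} tokens/episode -> {signal:.2e} bits/token")
```

Under these assumed lengths, the signal works out to roughly one ten-thousandth of a bit per token or less, which is the gap the takeaway points at.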
What It Covers
Dwarkesh reflects on Richard Sutton's view that current LLMs waste compute during deployment because they do not learn from it, and that continual learning and true intelligence will require new architectures.
Notable Moment
Dwarkesh compares pretraining data to fossil fuels: non-renewable, but an essential intermediary. He argues that civilization needed fossil fuels to reach solar panels, even though they were never the final solution.