Andrej Karpathy — AGI is still a decade away
Episode: 145 min
Read time: 2 min
Topics: Investing, Fundraising & VC, Artificial Intelligence
AI-Generated Summary
Key Takeaways
- Reinforcement Learning Limitations: Current RL assigns credit uniformly across an entire solution trajectory based only on the final outcome, so every token is upweighted when the answer is correct, including tokens on wrong paths. The result is a high-variance estimator that wastes computation: minutes of rollout are supervised by a single reward signal, which Karpathy likens to sucking supervision through a straw.
- Model Collapse Problem: LLMs sample from collapsed, low-entropy distributions; asked repeatedly for a joke, a model produces only about three distinct ones. Training on synthetic, self-generated data deepens the collapse because the model cannot maintain the diversity and entropy that robust learning needs, much as humans grow more rigid with age and fall into repetitive thought patterns.
- Cognitive Core Size: A cognitive core could work at one billion parameters or fewer, compared with today's trillion-parameter models. Most of a model's size stores memorized internet facts rather than cognitive algorithms. Future systems should separate knowledge retrieval from reasoning, with models that know what they don't know and look up information as needed.
- Pre-training Data Quality: The average pre-training example is garbage: not a Wall Street Journal article but stock tickers and internet slop. Compression ratios show that Llama 3 stores about 0.07 bits per token across its 15 trillion training tokens, while the context window stores about 320 kilobytes per token. That is roughly a 35-million-fold difference, which helps explain why in-context learning feels more intelligent than pre-trained knowledge.
- Automation Progression Path: Call center work will automate before radiology because call center tasks are repetitive, roughly ten-minute interactions against closed databases rather than messy, multi-surface jobs. Expect autonomy sliders, where AIs handle 80% of the volume and delegate the remaining 20% to human supervisors who each manage a team of five AI agents, rather than instant full replacement across knowledge work.
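The fold difference quoted in the pre-training takeaway can be checked with a few lines of arithmetic. This is a rough sketch using only the two figures stated above; the variable names are illustrative, and treating one kilobyte as 1,000 bytes is an assumption, since the summary does not say which convention was meant.

```python
# Back-of-the-envelope check of the figures quoted in the summary.
BITS_PER_BYTE = 8

# Quoted figure: bits retained in the weights per pre-training token.
pretrain_bits_per_token = 0.07

# Quoted figure: 320 kilobytes of context state per token.
# Assumption: 1 kilobyte = 1,000 bytes.
context_bytes_per_token = 320 * 1000

context_bits_per_token = context_bytes_per_token * BITS_PER_BYTE
ratio = context_bits_per_token / pretrain_bits_per_token

print(f"context bits per token: {context_bits_per_token:,}")
print(f"fold difference: {ratio / 1e6:.1f} million")
```

Under these assumptions the ratio comes out near 37 million, the same order of magnitude as the roughly 35 million quoted in the summary; the gap is consistent with rounding in the spoken figures.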
What It Covers
Andrej Karpathy explains why he expects AGI to take a decade rather than a year. He discusses current limitations in continual learning, the fundamental flaws of reinforcement learning, model collapse, and why coding automation succeeds while automation of other knowledge work struggles despite similar text-based interfaces.
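The credit-assignment flaw described in the reinforcement learning takeaway can be illustrated with a toy example. This is a minimal sketch, not Karpathy's code or any library's API: the function `outcome_token_weights`, the rollout steps, and the reward value are all hypothetical, chosen only to show how a single final reward gets broadcast to every step.

```python
def outcome_token_weights(num_tokens, final_reward, baseline=0.0):
    """Give every step of a rollout the same weight, derived only from the final reward."""
    advantage = final_reward - baseline
    return [advantage] * num_tokens


# Hypothetical rollout: the second step is a wrong path, but the final answer is correct.
rollout = ["try x=3", "dead end", "try x=5", "verify", "answer: 5"]
weights = outcome_token_weights(len(rollout), final_reward=1.0)

for step, weight in zip(rollout, weights):
    print(f"{weight:+.1f}  {step}")
```

Every step, including the dead end, receives the same positive weight because the only signal available is the outcome. A scheme that scored individual steps would need extra supervision that outcome-based reward does not supply.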
Notable Moment
Karpathy recounts that while he was building nanochat, coding assistants constantly misunderstood his custom implementations and tried to force in deprecated APIs and production boilerplate. The models kept assuming he was using standard PyTorch containers when he had written his own gradient synchronization. The episode shows how current AI, despite appearing capable, struggles with novel code patterns that fall outside typical internet examples.