#330 Sebastian Risi: Why AI Should Be Grown, Not Trained
Podcast: Eye on AI
Episode length: 59 min
Read time: 2 min
Topics: Health & Wellness, Artificial Intelligence, Software Development
AI-Generated Summary
Key Takeaways
- Hebbian Plasticity over Fixed Weights: Networks governed by local Hebbian learning rules, in which a connection's strength changes based on how often the two connected neurons fire together, can adapt in real time in ways that fixed-weight networks cannot. A quadrupedal robot controlled by such a network keeps functioning after losing a leg, even though it never encountered that scenario during training, because its weights keep updating throughout operation.
- Evolutionary Model Merging: Rather than training a new model from scratch, an evolutionary search can identify which layers of existing pretrained models to combine. Sakana AI demonstrated this by merging a Japanese-language model with a math-specialized model, producing a single model competent in both domains. The approach expands capabilities without a full retraining cycle.
- LLMs as Mutation Operators: Evolutionary search becomes considerably more powerful when a language model replaces hand-coded mutation functions. In circle-packing optimization, an LLM proposes solution variants, fitness scores rank them, and the process iterates. This lets the search explore discrete or code-based representations, where gradient descent does not apply because no differentiable objective exists.
- Quality Diversity over Single-Objective Optimization: Optimizing purely for fitness score can trap an evolutionary search. In one example, a network learning a T-maze reward task performed worse than random chance because it consistently chose the smaller reward. Quality diversity algorithms reward both exploration breadth and solution quality, which helps prevent premature convergence on locally decent but globally poor strategies.
- Co-evolving Agent and Environment: Training agents against static environments produces brittle specialists. The POET algorithm and its successors evolve the terrain alongside the agent, starting simple and progressively increasing difficulty. This curriculum lets bipedal robots eventually navigate obstacle courses they could never learn directly, and the same principle can now be extended by having LLMs write code that generates Unity environments.
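To make the Hebbian takeaway concrete, here is a minimal sketch of a plain Hebbian update, assuming a single layer with `tanh` activations. The function name `hebbian_step`, the learning rate `eta`, and the toy weights are illustrative assumptions, not details from the episode; the networks Risi describes may use a richer rule.

```python
import math


def hebbian_step(weights, pre, eta=0.1):
    """Run one forward pass, then apply a plain Hebbian update.

    weights[j][i] is the connection from input neuron i to output neuron j.
    Each weight grows in proportion to the product of the activity on
    both sides of the connection (pre * post).
    """
    # Forward pass: output activity given the current weights.
    post = [math.tanh(sum(w * x for w, x in zip(row, pre))) for row in weights]
    # Local update: uses only the two activities each connection can "see".
    updated = [
        [w + eta * x * y for w, x in zip(row, pre)]
        for row, y in zip(weights, post)
    ]
    return post, updated


# Weights keep changing during operation, not only during training.
weights = [[0.5, 0.0]]
for step in range(3):
    post, weights = hebbian_step(weights, pre=[1.0, 0.0])
    print(step, [round(w, 4) for w in weights[0]])
```

Only the connection whose input is active gets strengthened; the other stays at zero. A plain Hebbian rule like this lets weights grow without bound, so practical variants usually add some form of normalization or decay.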
What It Covers
Sebastian Risi, researcher at Sakana AI, explains neuroevolution — using evolutionary algorithms instead of gradient descent to optimize neural networks — and explores biologically inspired approaches including plastic networks, growing architectures, and combining large language models with evolutionary search to advance AI capabilities.
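As a rough illustration of neuroevolution as defined above, the sketch below evolves the weights of a tiny fixed-topology network on XOR using selection and Gaussian mutation, with no gradients involved. The task, network size, population settings, and all names (`forward`, `fitness`, `evolve`) are assumptions chosen for brevity, not anything described in the episode.

```python
import math
import random

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
N_HIDDEN = 3
# Each hidden unit has 2 weights + 1 bias; the output has N_HIDDEN weights + 1 bias.
N_PARAMS = N_HIDDEN * 3 + N_HIDDEN + 1


def forward(params, x):
    """Evaluate a 2-input, one-hidden-layer network stored as a flat list."""
    hidden = []
    for h in range(N_HIDDEN):
        w1, w2, b = params[3 * h: 3 * h + 3]
        hidden.append(math.tanh(w1 * x[0] + w2 * x[1] + b))
    out_w = params[3 * N_HIDDEN: 3 * N_HIDDEN + N_HIDDEN]
    z = sum(w * h for w, h in zip(out_w, hidden)) + params[-1]
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-z))


def fitness(params):
    """Negative squared error on XOR; higher is better, 0 is perfect."""
    return -sum((forward(params, x) - y) ** 2 for x, y in XOR)


def evolve(generations=200, pop_size=40, sigma=0.3, seed=0):
    rng = random.Random(seed)
    population = [[rng.gauss(0, 1) for _ in range(N_PARAMS)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elites = population[: pop_size // 4]  # survivors, kept unchanged
        children = [
            [w + rng.gauss(0, sigma) for w in rng.choice(elites)]
            for _ in range(pop_size - len(elites))
        ]
        population = elites + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
print("first-generation best:", round(history[0], 4))
print("final best:", round(history[-1], 4))
```

Because the elites are carried over unchanged, the best fitness never decreases from one generation to the next. The search only ever needs a fitness score, which is why the same loop applies when the objective is not differentiable.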
Notable Moment
Risi describes a counterintuitive failure mode in plasticity research: a network trained to adapt in a T-maze consistently learned the worst possible strategy, always choosing the smaller reward, and so scored below random chance. Even so, it appeared closer to the correct solution than a network that ignored rewards entirely.
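The T-maze result shows how a raw fitness score can mislead a search, which is the motivation for quality diversity. The sketch below uses MAP-Elites, one well-known quality diversity algorithm; the summary does not say which algorithm was discussed, and the toy objective, the behaviour descriptor, and every name here are hypothetical choices for illustration.

```python
import math
import random

N_BINS = 10


def fitness(x):
    # Hypothetical multi-peak objective on [0, 1]^2, for illustration only.
    return math.sin(6 * math.pi * x[0]) * x[1]


def descriptor(x):
    # Behaviour descriptor: which tenth of the [0, 1] range x[0] falls into.
    return min(int(x[0] * N_BINS), N_BINS - 1)


def map_elites(iterations=2000, sigma=0.1, seed=0):
    """Keep the best solution found so far in every descriptor cell."""
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Mutate a randomly chosen elite, clipping to the valid range.
            _, parent = rng.choice(list(archive.values()))
            child = [min(1.0, max(0.0, v + rng.gauss(0, sigma))) for v in parent]
        else:
            # Occasionally sample a fresh random solution.
            child = [rng.random(), rng.random()]
        cell, score = descriptor(child), fitness(child)
        # A child only competes with the current occupant of its own cell.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, child)
    return archive


archive = map_elites()
for cell in sorted(archive):
    score, solution = archive[cell]
    print(cell, round(score, 3))
```

Solutions in low-scoring cells are retained rather than discarded, so they remain available as stepping stones toward regions that a pure fitness-driven search would abandon early.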