When AI Discovers The Next Transformer - Robert Lange (Sakana)
Episode: 78 min
Read time: 3 min
Topics: Productivity, Investing, Fundraising & VC
AI-Generated Summary
Key Takeaways
- ✓Sample Efficiency via Model Ensembling: Shinka Evolve reduces LLM query costs by running multiple frontier models (GPT, Gemini, Grok) simultaneously and using an Upper Confidence Bound (UCB) bandit algorithm to adaptively route each program mutation to the best-performing model. This approach achieves competitive circle packing results in fewer than 200 evaluations, compared to the thousands typically required by similar systems like AlphaEvolve.
- ✓The "Problem-Problem" Bottleneck: Current evolutionary LLM systems treat the problem as fixed, but breakthroughs often require first inventing a surrogate or reformulated problem. Shinka Evolve illustrated this on circle packing: using a slightly relaxed circle-overlap constraint as a proxy problem accelerated convergence. Building systems that automatically generate and evolve problem formulations alongside solutions represents the next critical frontier for AI-driven discovery.
- ✓Stepping Stones Over Direct Optimization: Drawing from Kenneth Stanley's open-endedness research, Lange argues that starting from an impoverished or minimal initial solution generates more diversity and ultimately better results than starting from a highly optimized one. Systems that accumulate diverse intermediate solutions — even seemingly unproductive ones — build the combinatorial foundation needed for genuine breakthroughs, mirroring how biological evolution produces complexity through non-directed exploration.
- ✓Crossover and Full-Rewrite Mutations Add Diversity: Beyond diff-based patches used in AlphaEvolve, Shinka Evolve introduces two additional mutation operators: complete program rewrites and crossover between two parent programs. Sampling two parent programs and prompting the LLM to produce a complementary improvement proved especially useful on structured problems. A global "meta scratch pad" summarizes discoveries across the program tree and injects shared insights into subsequent system prompts.
- ✓AI Scientist v2 Implements Falsificationist Loop: Unlike v1's linear template-based execution, AI Scientist v2 runs a parallelizable agentic tree search where the LLM drafts its own experimental setup, executes code, receives numerical feedback, and iteratively refines hypotheses — mirroring Karl Popper's falsificationism. A workshop-level paper produced by the system passed the acceptance threshold at an ICLR workshop, marking the first fully autonomous compute-to-scientific-output pipeline.
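The adaptive model routing described in the first takeaway can be illustrated with a minimal UCB1 sketch. This is not Shinka Evolve's actual code: the class name `UCB1ModelSelector`, the placeholder model names, the binary reward (1 if a mutation improved fitness, 0 otherwise), and the success probabilities in `simulate` are all assumptions made for illustration.

```python
import math
import random


class UCB1ModelSelector:
    """UCB1 bandit over a fixed set of model names (hypothetical helper)."""

    def __init__(self, models, c=math.sqrt(2)):
        self.models = list(models)
        self.c = c  # exploration weight; sqrt(2) gives the textbook UCB1 bonus
        self.counts = {m: 0 for m in self.models}
        self.totals = {m: 0.0 for m in self.models}

    def select(self):
        # Try every model once before trusting the confidence bounds.
        for m in self.models:
            if self.counts[m] == 0:
                return m
        n = sum(self.counts.values())

        def score(m):
            mean = self.totals[m] / self.counts[m]
            bonus = self.c * math.sqrt(math.log(n) / self.counts[m])
            return mean + bonus

        return max(self.models, key=score)

    def update(self, model, reward):
        # Record the observed reward for the model that produced the mutation.
        self.counts[model] += 1
        self.totals[model] += reward


def simulate(rounds=200, seed=0):
    """Route `rounds` mutations; the success probabilities are invented."""
    rng = random.Random(seed)
    success_prob = {"model_a": 0.2, "model_b": 0.5, "model_c": 0.3}
    selector = UCB1ModelSelector(success_prob)
    for _ in range(rounds):
        model = selector.select()
        # Assumed reward: 1.0 if the mutation improved fitness, else 0.0.
        reward = 1.0 if rng.random() < success_prob[model] else 0.0
        selector.update(model, reward)
    return selector.counts
```

Calling `simulate()` returns how many of the 200 mutations were routed to each placeholder model. The exploration bonus shrinks as a model accumulates trials, so the budget tends to concentrate on whichever model has yielded improvements most often while the others are still occasionally retried.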
What It Covers
Robert Lange from Sakana AI discusses Shinka Evolve, an open-source evolutionary framework that uses multiple LLMs in parallel to discover novel algorithms and scientific solutions. The system improves on AlphaEvolve's approach through model ensembling, UCB-based adaptive model selection, and crossover mutations, achieving state-of-the-art circle packing results in under 200 LLM evaluations.
Key Questions Answered
- •Verification Remains the Hard Constraint: Generating candidate solutions is computationally easier than rigorously verifying them. LLMs can perform soft verification by latently tracing code execution, but this remains inexact and susceptible to reward hacking. Lange identifies automatic verifier design — systems that both formulate problems and construct their own correctness checkers — as the most critical unsolved challenge before AI-driven science can operate reliably without human oversight.
Notable Moment
Lange describes running Shinka Evolve with a slightly relaxed circle-overlap constraint as a proxy problem, which accelerated convergence. When the system was rerun with exact constraints, it took noticeably longer to reach the same solution quality — demonstrating that surrogate problem design, typically a human insight, could itself become an automated discovery target.
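A minimal sketch of what an exact versus relaxed feasibility check might look like for circle packing. The episode summary does not specify the problem setup, so the unit-square container, the `(x, y, r)` representation, the function name `is_valid_packing`, and the `overlap_tol` parameter are all assumptions for illustration, not Shinka Evolve's evaluator.

```python
import math


def is_valid_packing(circles, overlap_tol=0.0):
    """Check circles given as (x, y, r) tuples inside the unit square.

    overlap_tol=0.0 is the exact constraint; a small positive value is a
    relaxed proxy that tolerates pairwise overlap up to that depth.
    Containment in the square is always checked exactly (an assumption).
    """
    for i, (x, y, r) in enumerate(circles):
        # Each circle must have positive radius and lie inside [0, 1]^2.
        if r <= 0 or x - r < 0 or y - r < 0 or x + r > 1 or y + r > 1:
            return False
        for x2, y2, r2 in circles[i + 1:]:
            # Overlap depth: positive when the two circles intersect.
            overlap = (r + r2) - math.hypot(x - x2, y - y2)
            if overlap > overlap_tol:
                return False
    return True


# Two tangent circles: feasible under the exact constraint.
tangent = [(0.25, 0.25, 0.25), (0.75, 0.25, 0.25)]
# Two circles overlapping by roughly 0.01: feasible only under the proxy.
slightly_overlapping = [(0.25, 0.25, 0.25), (0.74, 0.25, 0.25)]

exact_ok = is_valid_packing(slightly_overlapping)
relaxed_ok = is_valid_packing(slightly_overlapping, overlap_tol=0.02)
```

Under these assumptions, a search run against the relaxed check accepts candidates that the exact check rejects. Any solution found that way still has to pass the exact check (`overlap_tol=0.0`) before it counts as valid.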