Mathematical Superintelligence: Harmonic's Vlad Tenev & Tudor Achim on IMO Gold & Theories of Everything
Episode: 91 min
Read time: 3 min
Topics: Remote Work, Startups, Leadership
AI-Generated Summary
Key Takeaways
- ✓ Formal Verification as Trust Layer: Aristotle outputs proofs in Lean, a programming language and proof assistant whose kernel checks every logical step against three minimal axioms: propositional extensionality, quotient soundness, and the axiom of choice. Correctness is therefore certified by computation rather than by human peer review. As AI generates increasingly long proofs (potentially thousands of pages), checking informal output by hand becomes practically impossible at frontier capability levels, which leaves formal verification as the only scalable trust mechanism.
- ✓ Lean's Three-Axiom Foundation: According to the founders, all of mathematics, along with the models used in computer science, physics, economics, and statistics, can be derived in Lean from just three axioms. The axiom of choice, for example, states that an element can be selected from any non-empty set. Lean's kernel is intentionally small and thoroughly vetted, so the surface area requiring human trust is minimal. Mathlib, the open-source library built on Lean, functions as a computationally certified, searchable encyclopedia of formalized mathematical knowledge.
- ✓ Aristotle's Three-Component Architecture: The system combines three parts. The first is a Monte Carlo tree search driven by language models, which explores high-level proof paths rather than just grinding through small steps. The second is an informal lemma-guessing module that proposes candidate waypoints between the start and end of a proof; it functions as context management rather than reliable reasoning. The third is a geometry module modeled on DeepMind's AlphaGeometry. The informal module makes many errors: its value lies in generating diverse candidates, not in accuracy.
- ✓ Formal vs. Informal: The Debate Is Settled: DeepMind used formal methods (AlphaProof) for its 2024 silver medal result, then shifted to informal methods for 2025. OpenAI used informal methods for its 2025 gold result. Harmonic used formal methods for gold. The founders argue that as AI-generated proofs grow longer, informal outputs become unverifiable by humans or other systems. In their view, the cost of verification must not scale linearly with proof complexity, and only formal methods solve this.
- ✓ Software Verification as the Next Frontier: The same formal reasoning pipeline that proves mathematical theorems applies directly to software correctness. Aristotle API users are already checking cryptography implementations for collision vulnerabilities and verifying the stability of autopilot controllers. AI is beginning to generate million-line codebases autonomously (Cursor reportedly generated a Chromium-compatible browser of 1.5 million lines). At that scale, the founders argue, languages such as Python and Java become inadequate because they optimize for human readability rather than machine-verifiable correctness.
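To make the three-axiom claim concrete, here is a minimal Lean 4 sketch. It assumes a standard Lean 4 toolchain; the theorem name `add_zero_example` is made up for illustration and does not come from the episode. The `#print axioms` command asks Lean to list which axioms a given proof depends on.

```lean
-- A tiny theorem. The kernel accepts it only if the proof term type-checks.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl

-- Ask Lean which axioms this proof relies on.
-- This proof holds by computation alone, so no axioms should be reported.
#print axioms add_zero_example

-- The law of excluded middle is a theorem in Lean's core library, not an axiom.
-- Its dependencies are the three axioms named above:
-- propext (propositional extensionality), Quot.sound (quotient soundness),
-- and Classical.choice (the axiom of choice).
#print axioms Classical.em
```

The same command works on any theorem, however long its proof, which is what makes it practical to audit exactly what a machine-generated proof assumes.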
What It Covers
Harmonic co-founders Vlad Tenev and Tudor Achim explain how their AI system Aristotle achieved IMO gold medal performance in 2025 using formally verified proofs in Lean, why formal verification beats informal reasoning at scale, and how mathematical superintelligence could eliminate intellectual bottlenecks across science, software, and engineering by 2030.
Key Questions Answered
- • Reinforcement Learning Without Human Taste Panels: In two years of operation, Harmonic has run no mathematician-led A/B testing of proof elegance. The optimization target is the net present value of the computational cost of future proofs, which naturally penalizes brute-force grinding because short-term grinding degrades long-term capability. Entropy and hallucination are treated as essential features, not bugs: exploring false paths is what enables the discovery of correct ones. The "bitter lesson" principle steers architecture decisions toward scale over hand-crafted priors.
- • Open API as Taste Delegation: Rather than directing Aristotle toward specific unsolved problems internally, Harmonic opened the API to the public and lets revealed user preference determine research priorities. This produced unexpected results: users solved Erdős problems that had been open for 30 to 40 years, pursued formalizations in computational learning theory, and applied the system to graph theory conjectures. The founders argue that a closed lab model, in which one team selects all the problems, is strategically inferior and produces a less desirable world than distributed, community-driven mathematical discovery.
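The idea that an error-prone guesser is still useful when paired with an exact checker can be illustrated with a toy Python sketch. This is not Harmonic's system: the names `verify`, `noisy_proposer`, and `search` are hypothetical, and integer factoring stands in for proof search only because a candidate answer is cheap to check exactly.

```python
import random
from typing import Optional, Tuple

Candidate = Tuple[int, int]


def verify(n: int, candidate: Candidate) -> bool:
    """Trusted checker: cheap, exact, and never fooled by a wrong guess."""
    a, b = candidate
    return a > 1 and b > 1 and a * b == n


def noisy_proposer(n: int, rng: random.Random) -> Candidate:
    """Unreliable guesser: picks a random would-be divisor; most guesses are wrong."""
    a = rng.randint(2, n - 1)
    return a, n // a


def search(n: int, budget: int = 10_000, seed: int = 0) -> Tuple[Optional[Candidate], int]:
    """Sample candidates until one passes the checker or the budget runs out.

    Returns the accepted candidate (or None) and the number of rejected guesses.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(budget):
        candidate = noisy_proposer(n, rng)
        if verify(n, candidate):
            return candidate, rejected
        rejected += 1
    return None, rejected


if __name__ == "__main__":
    found, wasted = search(91)  # 91 = 7 * 13
    print(found, "after", wasted, "rejected guesses")
    print(search(97, budget=1_000))  # 97 is prime, so nothing is accepted
```

Because only checked candidates are ever returned, the proposer's error rate affects how long the search takes but not whether the result can be trusted.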
Notable Moment
Tudor described a 2030 scenario where mathematical superintelligence produces multiple self-consistent, competing grand unified theories reconciling quantum mechanics and general relativity — but humanity then faces a data bottleneck, needing exotic high-energy collider experiments to distinguish between them. The host described being momentarily speechless at the concept of theoretical abundance replacing intellectual scarcity as the binding constraint on scientific progress.