Mathematical Superintelligence: Harmonic's Vlad Tenev & Tudor Achim on IMO Gold & Theories of Everything
Cognitive Revolution · Episode · 91 min · Read time: 3 min
AI-Generated Summary
Key Takeaways
- Formal Verification as Trust Layer: Aristotle outputs proofs in Lean, a programming language whose kernel checks every logical step against three minimal axioms: propositional extensionality, quotient soundness, and the axiom of choice. Correctness is therefore computationally certified without human peer review. As AI generates increasingly long proofs (potentially thousands of pages), informal checking becomes practically impossible, and formal verification is the only trust mechanism that scales.
- Lean's Three-Axiom Foundation: All of mathematics, along with the formal underpinnings of computer science, physics modeling, economics, and statistics, can be derived from just three axioms in Lean. The axiom of choice, in Lean's formulation, simply states that an element can be selected from any non-empty type. Lean's kernel is intentionally small and thoroughly vetted, so the surface area requiring human trust is minimal. Mathlib, the open-source library built on Lean, functions as a computationally certified, searchable encyclopedia of formalized mathematical knowledge.
- Aristotle's Three-Component Architecture: The system combines a Monte Carlo tree search driven by language models (exploring high-level proof paths, not just grinding through small steps), an informal lemma-guessing module that proposes candidate waypoints between a proof's start and end points (functioning as context management rather than reliable reasoning), and a geometry module modeled on DeepMind's AlphaGeometry. The informal module makes many errors; its value lies in generating diverse candidates, not in accuracy.
- Formal vs. Informal: The Debate Is Settled: DeepMind's AlphaProof used formal methods for its 2024 silver medal result, then shifted to informal methods for 2025. OpenAI used informal methods for its 2025 gold result; Harmonic used formal methods for its gold. The founders argue that as AI-generated proofs grow longer, informal outputs become unverifiable by humans or other systems. Verification must remain mechanical and cheap as proof complexity grows, and only formal methods guarantee that.
- Software Verification as the Next Frontier: The same formal reasoning pipeline that proves mathematical theorems applies directly to software correctness. Aristotle API users are already checking cryptography implementations for collision vulnerabilities and verifying autopilot controller stability. As AI generates million-line codebases autonomously (Cursor reportedly generated a Chromium-compatible browser at 1.5 million lines), Python and Java become inadequate because they optimize for human readability rather than machine-verifiable correctness.
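To make the "three-axiom foundation" concrete, here is a minimal Lean 4 sketch (assuming a standard Lean toolchain). The three axioms named in the episode correspond to Lean's `propext`, `Quot.sound`, and `Classical.choice`, and the `#print axioms` command reports which of them any given theorem depends on:

```lean
-- Lean 4: the kernel admits exactly three axioms.
#check @propext          -- propositional extensionality: (a ↔ b) → a = b
#check @Quot.sound       -- quotient soundness
#check @Classical.choice -- the axiom of choice: Nonempty α → α

-- `#print axioms` certifies which axioms a proof depends on.
theorem excluded_middle (p : Prop) : p ∨ ¬p := Classical.em p
#print axioms excluded_middle
-- `Classical.em` is derived from all three axioms (via Diaconescu's
-- theorem), so the report lists propext, Classical.choice, Quot.sound.
```

This is what "computationally certified" means in practice: trusting a proof reduces to trusting the small kernel plus these three declarations.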
What It Covers
Harmonic co-founders Vlad Tenev and Tudor Achim explain how their AI system Aristotle achieved IMO gold medal performance in 2025 using formally verified proofs in Lean, why formal verification beats informal reasoning at scale, and how mathematical superintelligence could eliminate intellectual bottlenecks across science, software, and engineering by 2030.
Key Questions Answered
- Reinforcement Learning Without Human Taste Panels: Harmonic has run zero mathematician-led A/B testing on proof elegance in two years of operation. The optimization target is the net present value of future proof-search compute, which naturally penalizes brute-force grinding because short-term grinding degrades long-term capability. Entropy and hallucination are treated as essential features, not bugs: exploring false paths is what enables discovery of correct ones. The "bitter lesson" principle guides architecture decisions toward scale over hand-crafted priors.
- Open API as Taste Delegation: Rather than directing Aristotle toward specific unsolved problems internally, Harmonic opened the API publicly and lets revealed user preference set research priorities. This produced unexpected results: users solved Erdős problems that had been open for 30-40 years, pursued computational learning theory formalizations, and applied the system to graph theory conjectures. The founders argue that a closed-lab model, where one team selects every problem, is both strategically inferior and produces a less desirable world than distributed, community-driven mathematical discovery.
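The Monte Carlo tree search at the heart of Aristotle's architecture can be illustrated with a toy sketch. This is not Harmonic's system: it is a generic UCT-style tree search over a stand-in problem (reaching a target number with two "tactic" moves), with the language-model policy replaced by uniform random rollouts. The names (`Node`, `mcts`, the moves) are all hypothetical.

```python
import math
import random

# Stand-in for proof search: states are numbers, "tactics" are moves,
# and the goal state plays the role of a completed proof.
GOAL = 24
MOVES = [("+3", lambda x: x + 3), ("*2", lambda x: x * 2)]

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Upper confidence bound: balance exploitation and exploration.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def rollout(state, depth=6):
    # Random playout standing in for an LM-guided continuation:
    # reward 1.0 if the goal is reached within the depth budget.
    for _ in range(depth):
        if state == GOAL:
            return 1.0
        state = random.choice(MOVES)[1](state)
    return 1.0 if state == GOAL else 0.0

def mcts(root_state, iters=2000):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # Selection: descend by UCB until reaching a leaf.
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: add one child per move (goal nodes stay terminal).
        if node.visits > 0 and node.state != GOAL:
            node.children = [Node(f(node.state), node, m) for m, f in MOVES]
            node = random.choice(node.children)
        # Simulation + backpropagation.
        reward = rollout(node.state)
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Extract the most-visited path as the found "proof".
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
        path.append(node.move)
        if node.state == GOAL:
            break
    return path, node.state

random.seed(0)
path, final = mcts(3)
print(path, final)
```

The point of the sketch is the shape of the loop (select, expand, simulate, backpropagate), which is shared with systems like AlphaProof and, per the episode, Aristotle; in the real systems the rollout and move priors come from language models and the reward from the Lean kernel.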
Notable Moment
Tudor described a 2030 scenario where mathematical superintelligence produces multiple self-consistent, competing grand unified theories reconciling quantum mechanics and general relativity — but humanity then faces a data bottleneck, needing exotic high-energy collider experiments to distinguish between them. The host described being momentarily speechless at the concept of theoretical abundance replacing intellectual scarcity as the binding constraint on scientific progress.