Mathematical Superintelligence: Harmonic's Vlad Tenev & Tudor Achim on IMO Gold & Theories of Everything
Cognitive Revolution · Episode · 91 min · Read time: 3 min
AI-Generated Summary
Key Takeaways
- Formal Verification as Trust Layer: Aristotle outputs proofs in Lean, a programming language whose kernel checks every logical step against three minimal axioms: propositional extensionality, quotient soundness, and the axiom of choice. Correctness is therefore computationally certified without human peer review. As AI generates increasingly long proofs (potentially thousands of pages), informal checking becomes practically impossible, and formal verification is the only trust mechanism that scales.
- Lean's Three-Axiom Foundation: All of mathematics, along with the formal underpinnings of computer science, physics modeling, economics, and statistics, can be derived from just three axioms in Lean. The axiom of choice, in Lean's formulation, simply states that an element can be selected from any non-empty type. Lean's kernel is intentionally small and thoroughly vetted, so the surface area requiring human trust is minimal. Mathlib, the open-source library built on Lean, functions as a computationally certified, searchable encyclopedia of formalized mathematical knowledge.
- Aristotle's Three-Component Architecture: The system combines a Monte Carlo tree search driven by language models (exploring high-level proof paths, not just grinding through small steps), an informal lemma-guessing module that proposes candidate waypoints between a proof's start and end points (functioning as context management rather than reliable reasoning), and a geometry module modeled on DeepMind's AlphaGeometry. The informal module makes many errors; its value lies in generating diverse candidates, not in accuracy.
- Formal vs. Informal: The Debate Is Settled: DeepMind's AlphaProof used formal methods for its 2024 silver medal result, then shifted to informal methods for 2025. OpenAI used informal methods for its 2025 gold result; Harmonic used formal methods for its gold. The founders argue that as AI-generated proofs grow longer, informal outputs become unverifiable by humans or other systems. Verification must remain mechanical and cheap as proof complexity grows, and only formal methods guarantee that.
- Software Verification as the Next Frontier: The same formal reasoning pipeline that proves mathematical theorems applies directly to software correctness. Aristotle API users are already checking cryptography implementations for collision vulnerabilities and verifying autopilot controller stability. As AI generates million-line codebases autonomously (Cursor reportedly generated a Chromium-compatible browser at 1.5 million lines), Python and Java become inadequate because they optimize for human readability rather than machine-verifiable correctness.
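To make the "three-axiom foundation" concrete, here is a minimal Lean 4 sketch (assuming a standard Lean toolchain). The three axioms named in the episode correspond to Lean's `propext`, `Quot.sound`, and `Classical.choice`, and the `#print axioms` command reports which of them any given theorem depends on:

```lean
-- Lean 4: the kernel admits exactly three axioms.
#check @propext          -- propositional extensionality: (a ↔ b) → a = b
#check @Quot.sound       -- quotient soundness
#check @Classical.choice -- the axiom of choice: Nonempty α → α

-- `#print axioms` certifies which axioms a proof depends on.
theorem excluded_middle (p : Prop) : p ∨ ¬p := Classical.em p
#print axioms excluded_middle
-- `Classical.em` is derived from all three axioms (via Diaconescu's
-- theorem), so the report lists propext, Classical.choice, Quot.sound.
```

This is what "computationally certified" means in practice: trusting a proof reduces to trusting the small kernel plus these three declarations.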
What It Covers
Harmonic co-founders Vlad Tenev and Tudor Achim explain how their AI system Aristotle achieved IMO gold medal performance in 2025 using formally verified proofs in Lean, why formal verification beats informal reasoning at scale, and how mathematical superintelligence could eliminate intellectual bottlenecks across science, software, and engineering by 2030.
Key Questions Answered
- Reinforcement Learning Without Human Taste Panels: Harmonic has run zero mathematician-led A/B testing on proof elegance in two years of operation. The optimization target is the net present value of future proof-search compute, which naturally penalizes brute-force grinding because short-term grinding degrades long-term capability. Entropy and hallucination are treated as essential features, not bugs: exploring false paths is what enables discovery of correct ones. The "bitter lesson" principle guides architecture decisions toward scale over hand-crafted priors.
- Open API as Taste Delegation: Rather than directing Aristotle toward specific unsolved problems internally, Harmonic opened the API publicly and lets revealed user preference set research priorities. This produced unexpected results: users solved Erdős problems that had been open for 30-40 years, pursued computational learning theory formalizations, and applied the system to graph theory conjectures. The founders argue that a closed-lab model, where one team selects every problem, is both strategically inferior and produces a less desirable world than distributed, community-driven mathematical discovery.
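The Monte Carlo tree search at the heart of Aristotle's architecture can be illustrated with a toy sketch. This is not Harmonic's system: it is a generic UCT-style tree search over a stand-in problem (reaching a target number with two "tactic" moves), with the language-model policy replaced by uniform random rollouts. The names (`Node`, `mcts`, the moves) are all hypothetical.

```python
import math
import random

# Stand-in for proof search: states are numbers, "tactics" are moves,
# and the goal state plays the role of a completed proof.
GOAL = 24
MOVES = [("+3", lambda x: x + 3), ("*2", lambda x: x * 2)]

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Upper confidence bound: balance exploitation and exploration.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def rollout(state, depth=6):
    # Random playout standing in for an LM-guided continuation:
    # reward 1.0 if the goal is reached within the depth budget.
    for _ in range(depth):
        if state == GOAL:
            return 1.0
        state = random.choice(MOVES)[1](state)
    return 1.0 if state == GOAL else 0.0

def mcts(root_state, iters=2000):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # Selection: descend by UCB until reaching a leaf.
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: add one child per move (goal nodes stay terminal).
        if node.visits > 0 and node.state != GOAL:
            node.children = [Node(f(node.state), node, m) for m, f in MOVES]
            node = random.choice(node.children)
        # Simulation + backpropagation.
        reward = rollout(node.state)
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Extract the most-visited path as the found "proof".
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
        path.append(node.move)
        if node.state == GOAL:
            break
    return path, node.state

random.seed(0)
path, final = mcts(3)
print(path, final)
```

The point of the sketch is the shape of the loop (select, expand, simulate, backpropagate), which is shared with systems like AlphaProof and, per the episode, Aristotle; in the real systems the rollout and move priors come from language models and the reward from the Lean kernel.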
Notable Moment
Tudor described a 2030 scenario where mathematical superintelligence produces multiple self-consistent, competing grand unified theories reconciling quantum mechanics and general relativity — but humanity then faces a data bottleneck, needing exotic high-energy collider experiments to distinguish between them. The host described being momentarily speechless at the concept of theoretical abundance replacing intellectual scarcity as the binding constraint on scientific progress.