Cognitive Revolution

Mathematical Superintelligence: Harmonic's Vlad Tenev & Tudor Achim on IMO Gold & Theories of Everything

91 min episode · 3 min read

AI-Generated Summary

Key Takeaways

  • Formal Verification as Trust Layer: Aristotle outputs proofs in Lean, a programming language and proof assistant whose small kernel checks every logical step. Beyond its core type theory, Lean relies on three axioms: propositional extensionality, quotient soundness, and the axiom of choice. Correctness is therefore certified computationally, without human peer review. As AI generates increasingly long proofs (potentially thousands of pages), the founders argue that formal verification becomes the only scalable trust mechanism, because checking informal output by hand is practically impossible at frontier capability levels.
  • Lean's Three-Axiom Foundation: According to the founders, mathematics, computer science, physics modeling, economics, and statistics can all be built up in Lean from just three axioms. The axiom of choice, for example, states that an element can be selected from any non-empty set (in Lean, from any non-empty type). Lean's kernel is intentionally small and thoroughly vetted, so the surface area requiring human trust is minimal. Mathlib, the open-source library built on Lean, functions as a computationally certified, searchable encyclopedia of formalized mathematical knowledge.
  • Aristotle's Three-Component Architecture: The system combines a Monte Carlo tree search driven by language models (exploring high-level proof paths, not just grinding small steps), an informal lemma-guessing module that proposes candidate waypoints between the start and end of a proof (functioning as context management rather than reliable reasoning), and a geometry module modeled on DeepMind's AlphaGeometry. The informal module makes many errors; its value lies in generating diverse candidates, not in accuracy.
  • Formal vs. Informal: The Debate Is Settled: DeepMind used formal methods (AlphaProof) for its 2024 silver medal result, then shifted to informal methods for 2025. OpenAI used informal methods for its 2025 gold result. Harmonic used formal methods for its 2025 gold. The founders argue that as AI-generated proofs grow longer, informal outputs become unverifiable by humans or other systems. In their view, the computational cost of verification must not scale linearly with proof complexity, and only formal methods solve this.
  • Software Verification as the Next Frontier: The same formal reasoning pipeline that proves mathematical theorems applies directly to software correctness. Aristotle API users are already checking cryptography implementations for collision vulnerabilities and verifying autopilot controller stability. As AI generates million-line codebases autonomously (Cursor reportedly generated a Chromium-compatible browser at 1.5 million lines), the founders argue that Python and Java become inadequate because they optimize for human readability rather than machine-verifiable correctness.
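
The three axioms mentioned in the takeaways can be inspected directly in Lean 4. The snippet below is a minimal sketch, not taken from the episode: the theorem names `two_add_two` and `em_example` are hypothetical and chosen only for illustration. It uses the core-library names `propext`, `Quot.sound`, and `Classical.choice`, plus the `#print axioms` command, which reports the axioms a given proof depends on.

```lean
-- The three axioms named above, under their Lean 4 core-library names.
#check @propext           -- propositional extensionality
#check @Quot.sound        -- quotient soundness
#check @Classical.choice  -- axiom of choice

-- Illustrative theorem (hypothetical name) proved by computation alone.
theorem two_add_two : 2 + 2 = 4 := rfl

-- Illustrative theorem (hypothetical name) that uses classical reasoning.
theorem em_example (p : Prop) : p ∨ ¬p := Classical.em p

-- `#print axioms` reports which axioms each proof ultimately depends on.
#print axioms two_add_two
#print axioms em_example
```

Running `#print axioms` on a finished proof is one way to see how small the trusted base is: anything outside the kernel and the listed axioms has been checked mechanically.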

What It Covers

Harmonic co-founders Vlad Tenev and Tudor Achim explain how their AI system Aristotle achieved IMO gold medal performance in 2025 using formally verified proofs in Lean, why formal verification beats informal reasoning at scale, and how mathematical superintelligence could eliminate intellectual bottlenecks across science, software, and engineering by 2030.

Key Questions Answered

  • Reinforcement Learning Without Human Taste Panels: Harmonic has run no mathematician-led A/B testing on proof elegance in two years of operation. The optimization target is the net present value of the computational cost of future proofs, which naturally penalizes brute-force grinding because short-term grinding degrades long-term capability. Entropy and hallucination are treated as essential features, not bugs: exploring false paths is what enables the discovery of correct ones. The "bitter lesson" principle guides architecture decisions toward scale over hand-crafted priors.
  • Open API as Taste Delegation: Rather than directing Aristotle toward specific unsolved problems internally, Harmonic opened the API publicly and lets revealed user preference determine research priorities. This produced unexpected results: users solved Erdős problems that had been open for 30 to 40 years, pursued computational learning theory formalizations, and applied the system to graph theory conjectures. The founders argue that a closed lab model, where one team selects all problems, is strategically inferior and produces a less desirable world than distributed, community-driven mathematical discovery.
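
The "net present value" objective described above can be illustrated with a toy calculation. This is a sketch under stated assumptions, not Harmonic's training objective: the function name `discounted_cost`, the discount factor of 0.9, and both cost streams are hypothetical and chosen only to show why a discounted sum of future costs can favor a strategy that is more expensive now but cheaper later.

```python
from typing import Sequence


def discounted_cost(costs: Sequence[float], discount: float = 0.9) -> float:
    """Present value of a stream of per-proof compute costs.

    costs[t] is the (hypothetical) cost of the t-th future proof;
    later costs are weighted by discount ** t.
    """
    return sum(cost * discount ** t for t, cost in enumerate(costs))


# Hypothetical cost streams in arbitrary units, not measurements.
# "Grinding": cheap on the first proof, but later proofs never get cheaper.
grinding = [1.0, 1.0, 1.0, 1.0, 1.0]
# "Learning": the first proof costs more, but later proofs get cheaper.
learning = [2.0, 1.0, 0.5, 0.25, 0.125]

if __name__ == "__main__":
    print("grinding:", discounted_cost(grinding))
    print("learning:", discounted_cost(learning))
```

With these made-up numbers the learning stream has the lower discounted cost even though its first proof is twice as expensive, which is the qualitative effect the summary attributes to the objective.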

Notable Moment

Tudor Achim described a 2030 scenario where mathematical superintelligence produces multiple self-consistent, competing grand unified theories reconciling quantum mechanics and general relativity. Humanity would then face a data bottleneck, needing exotic high-energy collider experiments to distinguish between them. The host described being momentarily speechless at the concept of theoretical abundance replacing intellectual scarcity as the binding constraint on scientific progress.
