The Thermodynamic AI Computing Chip - Thomas Ahle
Podcast: Machine Learning Street Talk
Episode: 62 min
Read time: 3 min
Topics: Productivity, Fundraising & VC, Leadership
AI-Generated Summary
Key Takeaways
- Thermodynamic Computing Architecture: Normal Computing's CN-101 chip uses arrays of capacitors with programmable resistances to run stochastic differential equations natively. Rather than suppressing thermal noise, the chip harnesses it to compute matrix inverses probabilistically, an operation that is expensive on conventional hardware. This makes it particularly suited to Bayesian inference and other probabilistic workloads where uncertainty quantification matters, though algorithms must be redesigned to fully exploit the architecture.
- AI Agent Chip Design at Scale: Ahle ran approximately 20 GPT agents continuously for six months to build a Verilog simulator from scratch, generating over 500,000 lines of code in 43 days. The motivation was cost: commercial EDA simulation tools cost roughly $10,000 per CPU core, making large-scale agentic hardware workflows economically infeasible with proprietary software. Unlike in software, open-source alternatives are nearly nonexistent in hardware.
- Benchmark Deception in Hardware AI: Passing 80% of tests on hardware benchmarks does not mean a design is correct. If a program fails even one test, the implementation is likely wrong. Ahle spoke directly with creators of hardware benchmarks, who confirmed this distinction. Practitioners should evaluate AI-generated chip designs on full test-suite pass rates, not partial scores, and treat percentage-based benchmark reporting as a misleading proxy for correctness.
- Auto-Formalization for Chip Verification: Ahle applies a technique analogous to AlphaProof's IMO approach to hardware: language models generate formal specifications in Lean-style languages, then attempt to prove or disprove properties of chip designs. A key training trick borrowed from AlphaProof, asking models to prove or disprove rather than just prove, removes the need for correct formalizations during training, enabling scalable synthetic data generation for reinforcement learning on verification tasks.
- Understanding Debt in Agentic Codebases: Large agentic coding projects accumulate what Ahle calls understanding debt: code that passes tests but that no engineer has read or structurally understands. In 500,000-line AI-generated codebases, architectural decisions become opaque, blocking future evolution. The practical mitigation is to identify which subsystems require deep human understanding and which are routine implementations, and to deliberately preserve comprehension of the former even when delegating the latter to agents.
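The gap described under Benchmark Deception can be made concrete with a small sketch. The design names and test results below are hypothetical, made up purely for illustration; the two scoring functions are one plausible way to express "partial score" versus "full test-suite pass rate", not a metric taken from any specific benchmark.

```python
from typing import Dict, List


def pooled_test_pass_rate(results: Dict[str, List[bool]]) -> float:
    """Fraction of individual tests passed, pooled over all designs."""
    total = sum(len(tests) for tests in results.values())
    passed = sum(sum(tests) for tests in results.values())
    return passed / total


def full_suite_pass_rate(results: Dict[str, List[bool]]) -> float:
    """Fraction of designs that pass every test in their suite."""
    fully_passing = sum(all(tests) for tests in results.values())
    return fully_passing / len(results)


# Hypothetical per-design test outcomes (True = test passed).
results = {
    "adder": [True] * 10,
    "fifo": [True] * 9 + [False],
    "uart": [True] * 8 + [False] * 2,
    "alu": [True] * 7 + [False] * 3,
    "arbiter": [True] * 6 + [False] * 4,
}

print(pooled_test_pass_rate(results))  # 0.8
print(full_suite_pass_rate(results))   # 0.2
```

With these made-up numbers, 80% of tests pass, yet only one design in five passes its whole suite, which is the figure the takeaway recommends reporting.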
What It Covers
Thomas Ahle, a researcher at Normal Computing, discusses thermodynamic computing chips that use electrical noise as a computational resource rather than eliminating it, AI-assisted chip design using a Verilog simulator built by swarms of agents, formal verification challenges in hardware, and the broader risk of AI-generated code eroding human understanding across engineering teams.
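To give a feel for how noise can do the work of a matrix inverse, here is a minimal software simulation. It assumes the overdamped Langevin (Ornstein-Uhlenbeck) equation dx = -A x dt + sqrt(2) dW, whose stationary covariance is the inverse of A when A is symmetric positive definite. This is only an illustrative sketch: the summary does not specify the CN-101's circuit equations or Normal Computing's algorithms, and the function name and parameters are hypothetical.

```python
import numpy as np


def estimate_inverse_by_noise(A, n_chains=4000, dt=0.01, n_steps=1500, seed=0):
    """Estimate inv(A) for symmetric positive definite A from noisy dynamics.

    Simulates many independent copies of dx = -A x dt + sqrt(2) dW with the
    Euler-Maruyama scheme and returns the sample covariance of the final
    states, which approximates inv(A) up to sampling and step-size error.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros((n_chains, n))        # one row per independent chain
    noise_scale = np.sqrt(2.0 * dt)
    for _ in range(n_steps):
        drift = -(x @ A.T)             # each row becomes -A x_i
        x = x + dt * drift + noise_scale * rng.standard_normal((n_chains, n))
    # The stationary mean is zero, so the covariance is the mean outer product.
    return x.T @ x / n_chains


A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
print(estimate_inverse_by_noise(A))
print(np.linalg.inv(A))  # reference value for comparison
```

The estimate is statistical, so it agrees with the reference only to a couple of decimal places; more chains reduce the sampling error, and a smaller step size reduces the discretization bias.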
Key Questions Answered
- Continual Learning Trade-offs in Hardware: Analog neuromorphic substrates using capacitor-based memory require constant refreshing, making perpetual on-chip learning a practical necessity rather than a design choice. This mirrors biological synaptic plasticity. However, Anthropic and others treat real-time weight updates as a safety risk because live learning can drift models away from alignment checkpoints. Hybrid approaches that pair a shared base model with per-customer LoRA adapters are one partial solution currently being explored.
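The shared-base-plus-adapter idea can be sketched in a few lines. The class below is hypothetical and assumes the standard LoRA formulation, in which a frozen weight matrix W is augmented per customer by a low-rank update (alpha / r) * B @ A, with B initialized to zero. It is not a description of any vendor's deployment.

```python
import numpy as np


class LoRALinear:
    """Linear layer with a frozen shared weight and per-customer adapters."""

    def __init__(self, base_weight, rank=4, alpha=8.0, seed=0):
        self.base_weight = base_weight      # shared by everyone, never updated
        self.rank = rank
        self.alpha = alpha
        self.rng = np.random.default_rng(seed)
        self.adapters = {}                  # customer_id -> (A, B)

    def add_customer(self, customer_id):
        d_out, d_in = self.base_weight.shape
        A = self.rng.standard_normal((self.rank, d_in)) * 0.01
        B = np.zeros((d_out, self.rank))    # zero init: adapter starts as a no-op
        self.adapters[customer_id] = (A, B)

    def forward(self, x, customer_id=None):
        y = x @ self.base_weight.T
        if customer_id is not None:
            A, B = self.adapters[customer_id]
            y = y + (self.alpha / self.rank) * (x @ A.T) @ B.T
        return y


layer = LoRALinear(np.eye(3), rank=2)
layer.add_customer("customer_a")
layer.add_customer("customer_b")
x = np.ones((1, 3))
print(layer.forward(x, "customer_a"))  # equals the base output until B is trained
```

Only the small A and B matrices would change during per-customer learning, so the base weights stay at the checkpoint that was evaluated.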
Notable Moment
Ahle describes a paradox in AI tool adoption: language models can genuinely accelerate knowledge acquisition for curious, diligent users, yet on average they erode understanding across teams. The mechanism is structural—engineers optimize for passing tests rather than comprehension, and AI enables starting so many parallel projects that deep understanding of any single one becomes impossible.