Machine Learning Street Talk

The Thermodynamic AI Computing Chip - Thomas Ahle

62 min episode · 3 min read · Thomas Ahle

Topics

Productivity, Fundraising & VC, Leadership

AI-Generated Summary

Key Takeaways

  • Thermodynamic Computing Architecture: Normal Computing's CN-101 chip uses arrays of capacitors with programmable resistances to run stochastic differential equations natively. Rather than suppressing thermal noise, the chip harnesses it to compute matrix inverses probabilistically, an operation that is expensive on conventional hardware. This makes it particularly suited to Bayesian inference and other probabilistic workloads where uncertainty quantification matters, though algorithms must be redesigned to fully exploit the architecture.
  • AI Agent Chip Design at Scale: Ahle ran approximately 20 GPT agents continuously for six months to build a Verilog simulator from scratch, generating over 500,000 lines of code in 43 days. The motivation was cost: commercial EDA simulation tools run roughly $10,000 per CPU core, making large-scale agentic hardware workflows economically impossible with proprietary software. Open-source alternatives are nearly nonexistent in hardware, unlike software ecosystems.
  • Benchmark Deception in Hardware AI: Passing 80% of the tests on a hardware benchmark does not mean a design is correct: if a program fails even one test, the implementation is likely wrong. Ahle spoke directly with creators of hardware benchmarks, who confirmed this distinction. Practitioners should judge AI-generated chip designs by whether they pass the full test suite, not by partial scores, and should treat percentage-based benchmark reporting as a misleading proxy for correctness.
  • Auto-Formalization for Chip Verification: Ahle applies a technique analogous to AlphaProof's IMO approach to hardware: language models generate formal specifications in Lean-style languages, then attempt to prove or disprove properties of chip designs. A key training trick borrowed from AlphaProof is to ask models to prove or disprove a statement rather than only prove it. This removes the need for correct formalizations during training and enables scalable synthetic data generation for reinforcement learning on verification tasks.
  • Understanding Debt in Agentic Codebases: Large agentic coding projects accumulate what Ahle calls understanding debt: code that passes tests but that no engineer has read or understands structurally. In 500,000-line AI-generated codebases, architectural decisions become opaque, which blocks future evolution. The practical mitigation is to identify which subsystems require deep human understanding and which are routine implementations, then deliberately preserve comprehension of the former even when delegating the latter to agents.
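
The thermodynamic computing takeaway above can be made concrete with a small software simulation. This is a minimal sketch, not a description of how the CN-101 is programmed: it assumes the relevant dynamics are an Ornstein-Uhlenbeck process dx = -A x dt + sqrt(2) dW, whose stationary covariance equals the inverse of A when A is symmetric positive definite. The function name, the example matrix, and all parameters are hypothetical choices made for illustration.

```python
import numpy as np

def estimate_inverse_via_noise(A, n_chains=20000, n_steps=2000, dt=0.01, seed=0):
    """Estimate inv(A) from the stationary covariance of a noisy linear system.

    Assumes A is symmetric positive definite. Simulates
    dx = -A x dt + sqrt(2) dW with Euler-Maruyama steps, running many
    independent chains in parallel.
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    x = np.zeros((n_chains, d))  # one row per independent chain
    for _ in range(n_steps):
        noise = rng.standard_normal((n_chains, d))
        # Drift pulls each chain toward zero; noise keeps it fluctuating.
        x = x - dt * (x @ A.T) + np.sqrt(2.0 * dt) * noise
    # The chains have zero mean, so the second moment is the covariance.
    return (x.T @ x) / n_chains

# Hypothetical 2x2 example matrix (symmetric positive definite).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

estimate = estimate_inverse_via_noise(A)
exact = np.linalg.inv(A)
max_abs_error = float(np.max(np.abs(estimate - exact)))
print("estimated inverse:\n", estimate)
print("exact inverse:\n", exact)
print("max abs error:", max_abs_error)
```

The estimate is noisy by construction: its accuracy depends on the number of chains, the step size, and how long the system is allowed to settle. On a digital computer this loop is far slower than calling `np.linalg.inv`; the point of the sketch is only to show how noise plus linear dynamics can encode a matrix inverse.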

What It Covers

Thomas Ahle, researcher at Normal Computing, discusses thermodynamic computing chips that use electrical noise as computation rather than eliminating it, AI-assisted chip design using Verilog simulators built by swarms of agents, formal verification challenges in hardware, and the broader risks of AI-generated code eroding human understanding across engineering teams.

Key Questions Answered

  • Continual Learning Trade-offs in Hardware: Analog neuromorphic substrates using capacitor-based memory require constant refreshing, making perpetual on-chip learning a practical necessity rather than a design choice. This mirrors biological synaptic plasticity. However, Anthropic and others treat real-time weight updates as a safety risk because live learning can drift models away from alignment checkpoints. Hybrid approaches using shared base models with per-customer LoRA adapters represent one partial solution currently being explored.
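
The hybrid approach mentioned in the last point can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any vendor's implementation: one frozen base weight matrix is shared by everyone, and each customer owns a small low-rank pair (A, B) that is added on top. The dimensions, customer names, and the `forward` helper are hypothetical; initializing B to zero follows the common LoRA convention so that a fresh adapter leaves the base model unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 4, 2

# Shared base weights: frozen, identical for every customer.
W_base = rng.standard_normal((d_out, d_in))

def make_adapter(rng, d_in, d_out, rank):
    """Create a low-rank adapter. B starts at zero, so the update B @ A is zero."""
    return {
        "A": rng.standard_normal((rank, d_in)),
        "B": np.zeros((d_out, rank)),
    }

def forward(x, W_base, adapter, scale=1.0):
    """Compute x @ (W_base + scale * B @ A).T without forming the full matrix."""
    low_rank = (x @ adapter["A"].T) @ adapter["B"].T
    return x @ W_base.T + scale * low_rank

# Hypothetical customers, each with a private adapter.
adapters = {
    "customer_a": make_adapter(rng, d_in, d_out, rank),
    "customer_b": make_adapter(rng, d_in, d_out, rank),
}

# Simulate a per-customer update: only customer_a's adapter changes.
adapters["customer_a"]["B"] += 0.1

x = rng.standard_normal((3, d_in))
y_a = forward(x, W_base, adapters["customer_a"])
y_b = forward(x, W_base, adapters["customer_b"])
base_only = x @ W_base.T
```

Because learning touches only the adapter, the shared base weights stay at their original checkpoint, and an adapter that drifts can be discarded without affecting other customers.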

Notable Moment

Ahle describes a paradox in AI tool adoption: language models can genuinely accelerate knowledge acquisition for curious, diligent users, yet on average they erode understanding across teams. The mechanism is structural: engineers optimize for passing tests rather than for comprehension, and AI makes it possible to start so many parallel projects that deep understanding of any single one becomes impossible.

Part of this week's recap (Jun 22 – Jun 28)
