Anjney Midha's Plan to Radically Lower the Price of Compute
Episode: 50 min
Read time: 2 min
Topics: Productivity, Investing, Startups
AI-Generated Summary
Key Takeaways
- ✓ Compute Utilization Gap: Most independent data centers run below 70% node utilization, and model FLOPs utilization (MFU, the share of a chip's peak compute actually used during a workload) can fall below 11%. Elon Musk's Colossus 2 cluster in Memphis ran at under 60% node utilization. Researchers should measure output efficiency, not chip count, when evaluating AI infrastructure investments.
- ✓ True Cost of Leased Compute: Long-term GPU leases appear priced at $2.50–3.00 per hour, but because research demand is spiky and teams over-provision for peak loads, the effective cost balloons to $25–28 per hour. AMP's grid reallocates idle capacity to other users, bringing the price actually paid back toward the marketed rate.
- ✓ Verifiable Feedback Drives Model Progress: AI models improve fastest where task outcomes can be objectively verified, such as software that passes unit tests and pull request reviews, or materials science predictions confirmed by X-ray diffraction. Subjective feedback like "that answer was wrong" produces minimal improvement; structured verification loops are what separate fast-progressing domains from stagnant ones.
- ✓ Multiple Frontiers, Not One Winner: The AI landscape contains at least 17 distinct frontiers, including software engineering, consumer chat, video generation, and scientific discovery, each with different leaders. Anthropic leads coding with under 5,000 employees while Google's 60,000-person team remains close but behind. Corporate AI buyers will increasingly route queries to whichever model is cheapest for a given task, abstracting away brand entirely.
- ✓ Model-Harness Co-Design: Breakthroughs like Claude Code result from simultaneous development of model capabilities and the surrounding tooling harness, not harness innovation alone. Teams build the harness to anticipate specific model improvements three months out, then remove third-party tool dependencies once the model internalizes those capabilities, which cuts task completion time by one to two minutes per operation.
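The per-task routing idea in the "Multiple Frontiers" takeaway can be illustrated with a minimal sketch. Everything here is hypothetical: the `ModelOffer` type, the `cheapest_model` function, the model names, the prices, and the idea of a boolean quality bar are placeholders for illustration, not anything described in the episode or offered by a real vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOffer:
    """A hypothetical price quote for one model on one kind of task."""
    name: str
    task: str
    price_per_million_tokens: float  # placeholder unit and value
    passes_quality_bar: bool         # assumed to come from the buyer's own evals


def cheapest_model(offers: list[ModelOffer], task: str) -> ModelOffer:
    """Return the cheapest offer for `task` among those that meet the quality bar."""
    eligible = [o for o in offers if o.task == task and o.passes_quality_bar]
    if not eligible:
        raise LookupError(f"no eligible model for task {task!r}")
    return min(eligible, key=lambda o: o.price_per_million_tokens)


# Made-up offers: names and prices are illustrative only.
OFFERS = [
    ModelOffer("model-a", "coding", 12.0, True),
    ModelOffer("model-b", "coding", 8.0, True),
    ModelOffer("model-c", "coding", 3.0, False),  # cheapest, but fails the bar
    ModelOffer("model-a", "chat", 5.0, True),
]

if __name__ == "__main__":
    choice = cheapest_model(OFFERS, "coding")
    print(f"route coding queries to {choice.name}")
```

The router never looks at which company built the model, only at task, quality, and price, which is the sense in which brand gets abstracted away.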
What It Covers
Anjney Midha, founder of AMP PBC and early Anthropic backer, explains how software-based compute orchestration can bring effective GPU costs down from $25–28 per hour toward the marketed rate of roughly $2.50–3.00 per hour, by standardizing fragmented chip infrastructure into a unified grid modeled on electricity distribution.
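One way to read the gap between the marketed and effective prices is as a utilization effect. The sketch below assumes that effective cost equals the lease rate divided by the fraction of paid GPU-hours that do useful work. That formula is an assumption for illustration, not one stated in the episode, and the function names are hypothetical.

```python
def effective_hourly_cost(lease_rate: float, useful_fraction: float) -> float:
    """Cost per hour of useful work when only part of the paid time is used.

    Assumes effective cost = lease rate / useful fraction (illustrative model).
    """
    if not 0.0 < useful_fraction <= 1.0:
        raise ValueError("useful_fraction must be in (0, 1]")
    return lease_rate / useful_fraction


def implied_useful_fraction(lease_rate: float, effective_cost: float) -> float:
    """Useful fraction implied by a marketed lease rate and an effective cost."""
    if lease_rate <= 0 or effective_cost <= 0:
        raise ValueError("rates must be positive")
    return lease_rate / effective_cost


if __name__ == "__main__":
    # Marketed and effective figures quoted in the summary (USD per GPU-hour).
    for lease, effective in [(2.50, 25.0), (3.00, 28.0)]:
        fraction = implied_useful_fraction(lease, effective)
        print(f"${lease:.2f}/h marketed, ${effective:.2f}/h effective "
              f"-> {fraction:.1%} of paid hours useful")
```

Under this assumption, the quoted prices imply that only about 10% to 11% of paid GPU-hours do useful work, and raising that fraction toward 100% moves the effective cost back toward the marketed rate.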
Notable Moment
Midha reveals that Google's internal compute orchestration system, called Borg, achieved 99% chip utilization, up from 62% when his co-founder Sebastian Lobo joined. AMP is rebuilding that same software layer for the broader research ecosystem, where the industry average remains below 70%.
This summary covers an episode of the Odd Lots podcast.