Dylan Patel — Deep dive on the 3 big bottlenecks to scaling AI compute
Episode: 151 min · Read time: 3 min
Topics: Startups, Artificial Intelligence
AI-Generated Summary
Key Takeaways
- ✓ CapEx-to-compute timeline: Hyperscaler CapEx announcements (Google's $180B, combined big-four $600B) do not represent compute coming online this year. A significant portion funds turbine deposits for 2028–2029, data center construction for 2027, and power purchasing agreements years out. Roughly 20 incremental gigawatts deploy in the US this year. Investors and analysts should model CapEx as a multi-year pipeline, not a current-year capacity signal, to avoid misreading AI infrastructure buildout pace.
- ✓ Commitment timing determines compute margins: AI labs that signed five-year compute contracts early (OpenAI with Microsoft, CoreWeave, Oracle) locked in pricing at roughly $1.40/hour for H100s. Anthropic's conservative approach forced it to acquire last-minute capacity at spot rates as high as $2.40/hour for two-to-three year Hopper deals. The margin difference between early commitment and late acquisition is roughly 70%, making compute procurement timing one of the highest-leverage strategic decisions an AI lab makes.
- ✓ GPU value appreciates with model capability: Contrary to Michael Burry's thesis that GPU depreciation cycles are two to three years, H100s are worth more today than at launch. GPT-5.4 runs on H100s at higher token throughput than GPT-4 did, while producing higher-quality output. As models improve efficiency and capability simultaneously, older hardware serving newer models extracts more economic value per chip. This inverts standard depreciation logic and supports longer five-year contract structures for cloud providers.
- ✓ ASML EUV tools are the hard ceiling on AI compute: ASML produces roughly 70 EUV tools annually, scaling to approximately 100 by 2030. Each gigawatt of AI data center capacity requires 3.5 EUV tools worth of wafer passes across logic and memory. With 700 total EUV tools deployed by decade's end and assuming 25% AI allocation, the realistic ceiling is around 50 gigawatts per year — consistent with Sam Altman's targets but leaving no room for Elon Musk's 100-gigawatt ambitions without crowding out all consumer semiconductor production.
- ✓ HBM memory creates a 4x demand destruction multiplier on consumer devices: HBM requires three to four times more DRAM wafer area per bit than standard DRAM. Every byte of HBM capacity allocated to AI effectively destroys four bytes of consumer device capacity. This is already cutting mid-range and low-end smartphone volumes — Xiaomi and Oppo are reportedly halving production. iPhone memory costs are projected to rise by $100–$150 per unit. Investors in consumer electronics should model significant margin compression and volume decline through 2026.
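The pricing spread in the commitment-timing takeaway can be sanity-checked in a few lines. This is a minimal sketch using the episode's approximate figures ($1.40/hour early, $2.40/hour spot); the variable names are illustrative, not market data:

```python
# Back-of-envelope check on the early-vs-late compute pricing spread.
# Rates are the episode's rough numbers, not actual market quotes.
early_rate = 1.40  # $/GPU-hour for early five-year H100 commitments
late_rate = 2.40   # $/GPU-hour for last-minute Hopper capacity

premium = (late_rate - early_rate) / early_rate
print(f"Late movers pay roughly {premium:.0%} more per GPU-hour")
```

The result, about 71%, matches the "roughly 70%" margin difference cited above.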
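The 50-gigawatt ceiling in the ASML takeaway follows directly from three of the episode's estimates. A minimal sketch of that arithmetic, with all inputs taken as the episode's rough numbers rather than confirmed industry figures:

```python
# Sketch of the EUV tooling ceiling arithmetic described above.
total_euv_tools = 700      # EUV tools deployed worldwide by ~2030 (episode estimate)
ai_allocation = 0.25       # assumed share of wafer passes serving AI
tools_per_gigawatt = 3.5   # EUV tools' worth of wafer passes per GW of AI capacity

ceiling_gw_per_year = total_euv_tools * ai_allocation / tools_per_gigawatt
print(f"Realistic AI buildout ceiling: {ceiling_gw_per_year:.0f} GW/year")
```

Under these assumptions the ceiling works out to 50 GW/year, which is why the episode treats Altman-scale targets as feasible and 100-gigawatt ambitions as requiring consumer-semiconductor crowd-out.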
What It Covers
Dylan Patel, CEO of SemiAnalysis, breaks down the three compounding bottlenecks constraining AI compute scaling through 2030: semiconductor manufacturing capacity (logic wafers, HBM memory, EUV tooling), power and data center infrastructure, and capital deployment timing. The conversation quantifies how $600B in hyperscaler CapEx translates to actual gigawatts, why Anthropic undershot compute commitments, and why ASML's output of roughly 70 EUV machines per year caps the entire AI buildout.
Key Questions Answered
- Inference performance gap between Hopper and Blackwell is 20x, not 3x: Raw FLOP comparisons between GPU generations understate real-world performance differences. Running DeepSeek or Kimi K2.5 on Hopper versus Blackwell yields roughly 20x throughput difference at 100 tokens per second, driven by memory bandwidth, NVLink interconnect speed, and architectural improvements rather than transistor count alone. This means going back to older process nodes (seven nanometer DUV) to bypass EUV constraints would sacrifice far more performance than FLOP-per-dollar calculations suggest.
- Fast AI timelines favor the US; slow timelines favor China: China currently operates entirely on ASML DUV tools and lacks indigenized EUV capability, though working tools are plausible by 2030 with mass production lagging several years behind. If AI revenue compounds fast enough that US labs reach 10+ gigawatt scale by end of 2026, the economic gap widens before China can close the semiconductor supply chain. If capability timelines extend to 2035, China's vertically integrated domestic supply chain and engineering scale become structural advantages.
Notable Moment
Patel reveals that Carl Zeiss — the German optics supplier whose precision mirrors are the bottleneck inside ASML's EUV machines, which are themselves the bottleneck for all advanced AI chips — has a market capitalization of roughly $2.5 billion. A company worth less than a mid-size tech startup effectively sets the hard ceiling on global AI compute expansion through the end of the decade.