Rob Walken

Etched - Building AI Hardware to Make Inference Faster and Cheaper - [Invest Like the Best, EP.480]

Invest Like the Best with Patrick O'Shaughnessy

Jun 30, 2026 · 87 min · Co-founder of Etched

AI Summary

→ WHAT IT COVERS Etched founders Gavin Huberti and Rob Walken explain how they built a transformer-specific AI inference chip on the first tape-out attempt and raised $800M, with over $1B in customer demand. They detail their low-voltage inference architecture, cluster-scale memory interconnects, and the vertical integration strategy behind their full rack-scale inference product launched in 2023.

→ KEY INSIGHTS
- **Low-Voltage Inference Architecture:** GPUs thermal-throttle because power scales quadratically with voltage: doubling voltage quadruples power draw. Etched runs at under half the voltage of any competing AI chip by redesigning power delivery planes entirely. This unlocks significantly higher flop density without thermal throttling, enabling more compute per watt. Bitcoin miners already proved sub-quarter-voltage operation was physically possible; the question was whether transformer workloads could be restructured to support it.
- **Cluster-Scale Memory as Decode Advantage:** The correct metric for decode performance is not single-chip memory bandwidth but full cluster memory bandwidth. NVIDIA Blackwell chip-to-chip latency runs approximately 4,000 nanoseconds point-to-point, meaning 8-chip tensor-parallel setups deliver far less than 8x throughput gains. Etched built a fully custom interconnect stack above Layer 2, cutting latency by more than 5x and enabling the entire cluster's SRAM and HBM to function as a unified memory pool for token generation.
- **Prefetch Everything Before Silicon Returns:** Etched compressed post-silicon bring-up from the industry benchmark of 10 months down to 40 days by completing all parallel work before chips arrived. This included deploying 700 FPGAs running full inference stacks, shipping racks to customer data centers pre-chip for software validation, building thermal mock chips to validate cold plates, and standing up full production lines. Every workstream that did not require physical silicon was finished in advance.
- **Project-Based Legend Recruiting:** Map every hard technical problem in the target domain, identify who specifically did the zero-to-one work (not who managed it), then pursue those individuals across 20+ conversations over months. Etched recruited Brian Leuler, who built NVIDIA's HGX and DGX rack systems representing the majority of NVIDIA's revenue, by identifying him as one of three people globally who fit the exact profile needed, then converting the other two candidates into investors.
- **Bimodal Talent Philosophy (Legends Plus First-Principles Thinkers):** Pair domain legends who know what scaled success looks like with young engineers who have no inherited constraints. Legends prevent billion-dollar mistakes; first-principles thinkers take aggressive risks legends would avoid. Etched pairs figures like Leuler with robotics world-record holders like Sanford, who built a functional cold plate prototype in one week (a task conventional thermal engineers would estimate at months) by simply not knowing it was considered impossible.
- **Vertical Integration Bounded by Economies of Scale:** Integrate vertically only where doing so adds token capacity or removes a binding constraint, not as a default strategy. Etched builds chips, boards, cold plates, interconnects, and production lines in-house because each was a bottleneck. They do not build data centers because customers are already moving power infrastructure to accommodate Etched hardware. The natural integration boundaries sit at chip fabrication on one end and model architecture on the other, with full-stack ownership between.

→ NOTABLE MOMENT Rob Walken described uploading a pre-diagnosis photo of his back tumor (taken before his stage-four bone cancer diagnosis at age 16) to GPT-4V, which immediately flagged it as a potential tumor requiring urgent MRI. A process that took six months of medical evaluation in 2015 took seconds in 2023, motivating his decision to build inference infrastructure.
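
The power argument in the low-voltage insight can be checked with a few lines of arithmetic. The sketch below assumes the standard first-order CMOS dynamic-power relation P = C · V² · f, with capacitance and clock frequency held fixed. The function name and the unit values are illustrative assumptions and do not come from the episode; in real silicon, lowering voltage usually also limits achievable clock frequency, which this toy model ignores.

```python
def dynamic_power(voltage: float, capacitance: float = 1.0, frequency: float = 1.0) -> float:
    """First-order dynamic switching power, P = C * V^2 * f, in arbitrary units.

    Hypothetical helper for illustration only; capacitance and frequency
    default to 1.0 so that only the voltage dependence is visible.
    """
    return capacitance * voltage ** 2 * frequency


baseline = dynamic_power(1.0)

# Doubling voltage at fixed C and f quadruples power.
ratio_doubled = dynamic_power(2.0) / baseline

# Halving voltage at fixed C and f cuts power to one quarter.
ratio_halved = dynamic_power(0.5) / baseline

print(f"2x voltage   -> {ratio_doubled:.2f}x power")
print(f"0.5x voltage -> {ratio_halved:.2f}x power")
```

Under these assumptions, running at half the voltage leaves roughly four times the thermal headroom per switching element, which is the intuition behind the higher flop density claimed in the summary.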
💼 SPONSORS Ramp (https://ramp.com/invest), WorkOS (https://workos.com), Rogo (Felix) (https://rogo.ai/felix), Vanta (https://vanta.com/invest), Ridgeline (https://ridgeline.ai)
🏷️ AI Inference Hardware, Semiconductor Architecture, Venture Capital, Vertical Integration, Transformer Chips, AI Infrastructure


Featured On 1 Podcast

Invest Like the Best with Patrick O'Shaughnessy

Top resources Rob Walken mentions

GPT-4V
