Andrew Feldman

3 episodes
3 podcasts

Featured On 3 Podcasts

All Appearances

3 episodes

AI Summary

→ WHAT IT COVERS
Cerebras founder and CEO Andrew Feldman discusses the company's path from a contrarian wafer-scale chip architecture to a $63 billion public company, covering the 2017–2019 technical breakthrough period, the G42 billion-dollar bridge deal, the $20 billion OpenAI agreement, and why inference speed becomes the defining competitive advantage once AI reaches daily utility.

→ KEY INSIGHTS
- **Radical differentiation threshold:** Achieving 15–20x performance improvement over GPUs requires fundamentally different architecture, not incremental modification. Cerebras built a 46,000 square millimeter wafer-scale chip, the size of a dinner plate, versus competitors' postage-stamp chips. Hardware founders targeting radical gains should design from first principles rather than optimizing existing architectures.
- **Market timing for hardware:** Speed advantages have zero commercial value until the underlying technology reaches daily utility. Cerebras was 15–20x faster than GPUs from 2019 onward but generated minimal sales until 2025, when AI models became useful enough for daily work. Hardware founders should plan financially for a 3–5 year gap between technical readiness and market readiness.
- **Bridge customer strategy:** To cross the chasm between niche early adopters and mainstream enterprise customers, Cerebras secured a $1 billion order from sovereign partner G42. This single deal funded supply chain transformation, enabled large-scale cluster deployment for battle-testing, and built the operational capacity needed to fulfill the subsequent $20 billion OpenAI agreement.
- **Accountability against the sunk-cost trap:** Founders should pre-define specific, falsifiable hypotheses about what conditions must be true to continue. Trusted former CEOs or seasoned operators serve as external accountability partners who can remind founders of their own stated exit criteria, preventing the sequential "one more test" rationalization that extends failing ventures indefinitely.
- **AI coding productivity distribution:** Cerebras increased per-engineer token spend from near zero to $25,000–$30,000 monthly within eight months. Productivity gains are highly uneven: engineers who restructure their workflow around governing multiple parallel agents simultaneously (including dedicated QA agents) move from 10x to 100x output, while others see marginal gains.

→ NOTABLE MOMENT
During the $20 billion OpenAI deal negotiation, Cerebras and OpenAI executed a term sheet the night before Thanksgiving and signed a full master agreement on December 24, a four-and-a-half-week close on one of Silicon Valley's largest contracts, achieved by working seven days a week with multiple law firms simultaneously.

💼 SPONSORS
None detected

🏷️ AI Hardware, Inference Speed, Semiconductor Architecture, IPO Strategy, Founder Psychology

AI Summary

→ WHAT IT COVERS
Cerebras CEO Andrew Feldman explains how his company built a chip 58 times larger than any competitor, achieving inference speeds 15 times faster than leading GPUs. The episode covers wafer-scale engineering breakthroughs, inference economics, CUDA's declining relevance, open vs. closed source AI models, and semiconductor supply chain constraints.

→ KEY INSIGHTS
- **Wafer-Scale Memory Architecture:** Cerebras achieves 15x faster inference than GPUs (and up to 1,000x faster on specific workloads) by using fast SRAM instead of slow HBM memory. The tradeoff is lower storage density per square millimeter, solved by building a chip covering an entire silicon wafer, roughly dinner-plate sized, stuffed with high-speed memory.
- **Speed Premium Pricing:** Anthropic's 2x-faster inference tier sold out at 6x the standard price, demonstrating that enterprise buyers pay significant premiums for speed. Cerebras operates at 15x faster than that tier, suggesting substantial pricing power. Slow tokens cost less to produce on GPUs, but GPU cost-per-token rises sharply as speed requirements increase.
- **Supply Chain Differentiation:** Cerebras avoids three major AI chip bottlenecks simultaneously: HBM memory shortages, TSMC's constrained CoWoS packaging process, and TSMC's oversubscribed 3nm node. By using 5nm fabrication and on-chip SRAM, Cerebras sidesteps constraints choking NVIDIA and other GPU vendors, leaving data center availability as the primary growth limiter.
- **CUDA Moat Erosion:** CUDA has zero role in inference workloads: migrating a model from GPU to Cerebras requires roughly 10 configuration changes. In training, two of three leading frontier models (Gemini on TPUs, Claude on Trainium) now train without CUDA, representing a 70% market share loss for NVIDIA's software ecosystem compared to three years ago.
- **Open vs. Closed Source Economics:** Open source models like Kimi K2 (1 trillion parameters) run on Cerebras today at a cost reflecting only compute and power, not training amortization. Closed source models outperform open source by roughly 4–5% on quality benchmarks but cost significantly more per token, creating a cost-versus-capability tradeoff enterprises must actively evaluate.

→ NOTABLE MOMENT
Feldman reveals that despite solving a 75-year-old engineering problem long considered unsolvable and building the world's fastest inference chip, Cerebras' primary growth constraint today is not manufacturing capacity or software; it is simply the availability of powered data center buildings, a limitation expected to persist for at least 15–18 months.

💼 SPONSORS
VanEck (https://vaneck.com/raaxpod), IBM (https://ibm.com), Adobe Acrobat (https://adobe.com), Public (https://public.com/market)

🏷️ AI Inference, Semiconductor Architecture, Chip Manufacturing, CUDA Alternatives, AI Economics
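The size and pricing figures quoted in these summaries can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes the 46,000 square millimeter wafer area from the first summary and the "58 times larger" ratio from the second refer to the same chip, and it treats the 2x-speed, 6x-price tier as a simple ratio. The function and constant names are hypothetical and do not come from the episodes.

```python
# Figures as quoted in the summaries on this page (not independently verified).
WAFER_AREA_MM2 = 46_000          # wafer-scale chip area, first summary
SIZE_RATIO = 58                  # "58 times larger than any competitor", second summary
FAST_TIER_SPEEDUP = 2.0          # "2x-faster inference tier"
FAST_TIER_PRICE_MULTIPLE = 6.0   # "6x the standard price"


def implied_competitor_area_mm2(wafer_area_mm2: float, size_ratio: float) -> float:
    """Die area a competitor chip would have if the wafer is size_ratio times larger."""
    return wafer_area_mm2 / size_ratio


def price_multiple_per_unit_speed(price_multiple: float, speedup: float) -> float:
    """How much more buyers paid per unit of speed than at the standard tier."""
    return price_multiple / speedup


competitor_area = implied_competitor_area_mm2(WAFER_AREA_MM2, SIZE_RATIO)
premium = price_multiple_per_unit_speed(FAST_TIER_PRICE_MULTIPLE, FAST_TIER_SPEEDUP)

print(f"Implied competitor die area: {competitor_area:.0f} mm^2")
print(f"Price paid per unit of speed vs. standard tier: {premium:.1f}x")
```

Under these assumptions, the implied competitor die comes out just under 800 square millimeters, and buyers of the faster tier paid about three times as much per unit of speed as standard-tier buyers.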

AI Summary

→ WHAT IT COVERS
Cerebras CEO Andrew Feldman discusses the company's $1.1 billion Series G raise at an $8.1 billion valuation, NVIDIA's competitive position, AI infrastructure bottlenecks, energy requirements for AI deployment, and the concentration of market value in seven technology companies.

→ KEY INSIGHTS
- **Pre-IPO Capital Strategy:** Cerebras raised $1.1 billion from Fidelity and Tiger Global before going public to secure manufacturing capacity and data center expansion without IPO distraction. Getting Fidelity specifically signals Wall Street confidence and validates late-stage valuations for public market readiness.
- **Chip Depreciation Reality:** Chip depreciation depends on performance improvement between generations, not arbitrary timelines. Current generation-over-generation gains deliver 2–2.5x actual performance when comparing apples-to-apples metrics like memory bandwidth, not just theoretical FLOPS. System bottlenecks matter more than individual chip speed improvements for real-world applications.
- **US Power Infrastructure Myth:** The US has sufficient power for AI expansion, but in the wrong locations. Abundant natural gas in West Texas and hydro in Upstate New York exist where people, buildings, and fiber optic infrastructure are absent. The challenge is geographic mismatch, not total capacity shortage.
- **AI Talent Bottleneck:** A fundamental shortage of AI practitioners and data scientists limits industry growth more than hardware does. Universities produce too few graduates, while immigration policy restricts the H-1B and J-1 visa pathways that historically brought in top global talent. Companies must pay extraordinary compensation for irreplaceable expertise that no team size can replicate.
- **Data Pipeline Investment Gap:** Unsexy infrastructure like data cleaning, tokenization, and pipeline management causes more AI project failures than the AI technology itself. These roles receive minimal investment and recognition despite being critical success factors. Many billion-dollar AI initiatives fail on data preparation, not model performance.

→ NOTABLE MOMENT
Feldman reveals that after 15 months burning $6–7 million monthly while unable to manufacture a single working wafer-scale chip, the founding team stood watching their first successful unit run for 30 minutes, having solved a 75-year problem that defeated IBM, Texas Instruments, and Gene Amdahl.

💼 SPONSORS
Coda (https://coda.io/20vc), Vanta (https://vanta.com/20vc)

🏷️ AI Infrastructure, Chip Architecture, Energy Requirements, Venture Capital, Talent Acquisition
