The CEO Behind the Fastest-Growing AI Inference Company | Tuhin Srivastava
Episode: 59 min
Read time: 2 min
Topics: Productivity, Startups, Leadership
AI-Generated Summary
Key Takeaways
- Company pivoting: Stay lean during market shifts. Baseten remained at 18 people from 2019 to 2023, so it could pivot quickly, without organizational weight, when ChatGPT and Stable Diffusion created new opportunities.
- Inference differentiation: Focus on dedicated capacity over shared endpoints. 99% of Baseten's business serves custom models on dedicated infrastructure, which keeps it out of the commoditized market for shared model serving.
- Technical optimization: Modern LLM inference requires both infrastructure scaling across thousands of GPUs and runtime optimization with frameworks such as vLLM, TensorRT-LLM, and SGLang to improve performance.
- Market positioning: Open-source adoption follows a predictable pattern. Companies start with Anthropic or OpenAI, then switch to open-source models for cost control, reliability, and data privacy.
What It Covers
Baseten CEO Tuhin Srivastava explains how his AI inference company pivoted from serving data scientists with small models to becoming the fastest-growing inference provider for production applications.
Notable Moment
Srivastava reveals that Baseten killed three of its four products in 2022, including an application builder that had consumed two dozen employees for 2.5 years of development work.