Gradient Dissent

The CEO Behind the Fastest-Growing AI Inference Company | Tuhin Srivastava

59 min episode · 2 min read

Topics

Leadership, Artificial Intelligence

AI-Generated Summary

Key Takeaways

  • Company pivoting: Stay lean during market shifts. Baseten stayed at 18 people from 2019 to 2023, so it carried little organizational weight and could pivot quickly when ChatGPT and Stable Diffusion created new opportunities.
  • Inference differentiation: Focus on dedicated capacity over shared endpoints. 99% of Baseten's business serves custom models on dedicated infrastructure, which keeps it out of the commoditized market for shared model endpoints.
  • Technical optimization: Modern LLM inference requires both infrastructure scaling across thousands of GPUs and runtime optimization with frameworks such as vLLM, TensorRT-LLM, and SGLang.
  • Market positioning: Open-source adoption follows a predictable pattern. Companies start with Anthropic or OpenAI, then switch to open-source models for cost control, reliability, and data privacy.

What It Covers

Baseten CEO Tuhin Srivastava explains how his AI inference company pivoted from serving data scientists with small models to becoming the fastest-growing inference provider for production applications.

Notable Moment

Srivastava reveals that Baseten killed three of its four products in 2022, including an application builder that had occupied two dozen employees for 2.5 years of development work.
