Deep Questions with Cal Newport

Ep 386: Was 2025 a Great or Terrible Year for AI? (w/ Ed Zitron)

143 min episode · 2 min read

Topics

Artificial Intelligence

AI-Generated Summary

Key Takeaways

  • DeepSeek's efficiency challenge: Chinese startup DeepSeek trained its R1 model for roughly $5.3 million, versus the $50-100 million spent on comparable American models, demonstrating that frontier AI doesn't require massive data centers. This threatened the industry narrative justifying enormous capital raises, so companies memory-holed the story rather than optimizing their own spending.
  • AI agents marketing shift: Companies pivoted from AGI superintelligence messaging to workplace agents in early 2025 because chatbot capabilities had plateaued. The agent narrative promised digital labor replacing workers, but required multi-step LLM queries that increased costs without delivering reliable autonomous task completion beyond simple prototypes.
  • GPT-5 router inefficiency: OpenAI's router, which automatically picks a model for each query, actually increased inference costs by defeating system-prompt caching. Every model switch forced the entire system prompt to be reprocessed on GPUs, an overhead that infrastructure teams questioned internally even as the company publicly claimed efficiency gains.
  • Anthropic's hidden burn rate: Despite positioning itself as more efficient than OpenAI, Anthropic spent $2.66 billion on AWS alone in the first three quarters of 2025, and likely a similar amount on Google Cloud. The company raised $16.5 billion versus OpenAI's $18.3 billion, revealing nearly identical rates of capital consumption despite its public image of fiscal discipline.
  • OpenAI's revenue-cost mismatch: Through September 2025, OpenAI generated approximately $4.5 billion in revenue while spending $8.67 billion on inference costs alone just to run existing models. Because inference costs scale directly with usage, revenue growth brings proportional cost growth, demonstrating the fundamental unprofitability of deploying large language models at scale.
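The caching point above can be made concrete with back-of-the-envelope arithmetic. The sketch below uses hypothetical token prices and prompt sizes (none of these numbers are from the episode) to show why rerouting a conversation to a different model, which invalidates the cached system prompt, raises per-request inference cost.

```python
# Back-of-the-envelope model of prompt-caching economics.
# All prices and sizes are illustrative assumptions, not figures from the episode.

CACHED_PRICE = 0.30    # $ per million input tokens when the prefix is cached
UNCACHED_PRICE = 3.00  # $ per million input tokens processed from scratch

def request_cost(system_tokens: int, user_tokens: int, cache_hit: bool) -> float:
    """Cost of one request, in dollars.

    On a cache hit, the long system prompt is billed at the cached rate;
    a model switch invalidates the cache, so everything is reprocessed at full price.
    """
    sys_rate = CACHED_PRICE if cache_hit else UNCACHED_PRICE
    return (system_tokens * sys_rate + user_tokens * UNCACHED_PRICE) / 1_000_000

# A 20k-token system prompt with a short 200-token user turn:
hit = request_cost(20_000, 200, cache_hit=True)
miss = request_cost(20_000, 200, cache_hit=False)
print(f"cache hit:  ${hit:.4f} per request")
print(f"cache miss: ${miss:.4f} per request ({miss / hit:.1f}x)")
```

With these assumed prices, a single cache miss makes the request roughly nine times more expensive, which is the mechanism behind the internal complaint described above: every router-driven model switch pays that penalty.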
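The revenue-cost mismatch in the last bullet comes down to a single ratio; a quick check using only the figures quoted above:

```python
# Figures quoted in the summary (through September 2025).
revenue_b = 4.5          # approximate revenue, $B
inference_cost_b = 8.67  # inference spend alone, $B

# Inference dollars spent per dollar of revenue earned.
ratio = inference_cost_b / revenue_b
print(f"${ratio:.2f} of inference cost per $1 of revenue")  # ~$1.93
```

Since inference costs grow roughly in step with usage, adding customers does not shrink this ratio, which is the unprofitability point the bullet is making.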

What It Covers

Cal Newport and AI commentator Ed Zitron analyze twelve major AI stories from 2025, examining whether the year represented progress or failure for artificial intelligence through technical analysis, financial reporting, and industry insider information about OpenAI, Anthropic, and NVIDIA.

Notable Moment

Jensen Huang announced at NVIDIA's GTC conference in March that the AI industry had moved from the pre-training scaling era into post-training and inference, essentially telling shareholders that massive ongoing GPU purchases would be required just to run models, not to improve them, a shift that benefits NVIDIA while permanently raising AI companies' operating costs.
