Turbopuffer with Simon Hørup Eskildsen
Episode: 50 min
Read time: 2 min
Topics: Fundraising & VC, Artificial Intelligence, Software Development
AI-Generated Summary
Key Takeaways
- Storage architecture economics: Turbopuffer stores data in S3 object storage at 2¢ per gigabyte, versus $2-5 per gigabyte for traditional in-memory vector databases. That is roughly a 100x cost reduction, and caching layers keep query latency under one second.
- Cluster-based indexing for disk: On S3, each jump in a graph-based vector index costs hundreds of milliseconds, which makes graph indexes impractical. A cluster-based index fetches centroids and clusters in just three round trips, so even cold queries on object storage finish in under one second.
- Production recall monitoring: Turbopuffer samples 1% of production queries and compares their results against exact results, maintaining 90-95% recall across real-world datasets. This catches edge cases that academic benchmarks miss and keeps search quality consistent at scale.
- Namespace sharding primitive: Turbopuffer maps each namespace to one shard with its own S3 prefix and supports over 100 million namespaces. Each namespace can use a customer-managed encryption key, giving isolation equivalent to separate buckets without coordination overhead.
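The cluster-based indexing and recall-monitoring takeaways can be illustrated with a minimal, in-memory sketch. It is not Turbopuffer's implementation: the function names, the single-pass centroid assignment, and the `n_probe` parameter are all assumptions made for illustration, and the episode summary does not say how the three round trips are divided up.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_index(vectors, n_clusters, seed=0):
    """Sample centroids from the data and assign each vector to its nearest one.

    A real index would refine the centroids (for example with k-means);
    this sketch does a single assignment pass.
    """
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_clusters)
    clusters = [[] for _ in range(n_clusters)]
    for vid, vec in enumerate(vectors):
        nearest = min(range(n_clusters), key=lambda c: dist2(vec, centroids[c]))
        clusters[nearest].append(vid)
    return centroids, clusters

def ann_search(query, vectors, centroids, clusters, k, n_probe):
    """Approximate search: rank centroids, then scan only the n_probe closest clusters.

    On object storage, the centroid list and the selected clusters would be
    separate fetches, so the number of round trips stays small and fixed.
    """
    ranked = sorted(range(len(centroids)), key=lambda c: dist2(query, centroids[c]))
    candidates = [vid for c in ranked[:n_probe] for vid in clusters[c]]
    return sorted(candidates, key=lambda vid: dist2(query, vectors[vid]))[:k]

def exact_search(query, vectors, k):
    """Brute-force search over every vector, used as ground truth."""
    return sorted(range(len(vectors)), key=lambda vid: dist2(query, vectors[vid]))[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate search also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

def sampled_recall(queries, vectors, centroids, clusters, k, n_probe, sample_rate, seed=0):
    """Measure recall on a random sample of queries; returns None if none were sampled."""
    rng = random.Random(seed)
    scores = []
    for query in queries:
        if rng.random() < sample_rate:
            approx = ann_search(query, vectors, centroids, clusters, k, n_probe)
            scores.append(recall_at_k(approx, exact_search(query, vectors, k)))
    return sum(scores) / len(scores) if scores else None

if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.random() for _ in range(8)] for _ in range(2000)]
    queries = [[rng.random() for _ in range(8)] for _ in range(200)]
    centroids, clusters = build_index(data, n_clusters=32)
    # The summary cites a 1% sample in production; 10% is used here only
    # because this toy query set is small.
    print("sampled recall:", sampled_recall(
        queries, data, centroids, clusters, k=10, n_probe=4, sample_rate=0.1))
```

Raising `n_probe` scans more clusters per query, trading latency for recall. Probing every cluster reduces to the exact search.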
What It Covers
Simon Eskildsen explains how Turbopuffer cuts vector database costs by 95% by storing data in object storage at 2¢ per gigabyte instead of in memory, letting companies like Cursor and Notion scale AI search economically.
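The per-gigabyte prices quoted above can be checked with a few lines of arithmetic. The sketch below compares raw storage prices only. The 1,000 GB dataset size and the function names are hypothetical, and the summary does not explain how the 95% overall figure is derived.

```python
# Prices per gigabyte as quoted in the summary.
OBJECT_STORAGE_PER_GB = 0.02   # S3 object storage
IN_MEMORY_PER_GB_LOW = 2.00    # low end for in-memory vector databases
IN_MEMORY_PER_GB_HIGH = 5.00   # high end for in-memory vector databases

def storage_cost(gigabytes, dollars_per_gb):
    """Storage cost in dollars for a dataset of the given size."""
    return gigabytes * dollars_per_gb

def price_ratio(expensive_per_gb, cheap_per_gb):
    """How many times more expensive one per-gigabyte price is than another."""
    return expensive_per_gb / cheap_per_gb

if __name__ == "__main__":
    dataset_gb = 1_000  # hypothetical dataset size, not from the episode
    print("object storage:", storage_cost(dataset_gb, OBJECT_STORAGE_PER_GB))
    print("in-memory (low):", storage_cost(dataset_gb, IN_MEMORY_PER_GB_LOW))
    print("in-memory (high):", storage_cost(dataset_gb, IN_MEMORY_PER_GB_HIGH))
    print("ratio (low):", price_ratio(IN_MEMORY_PER_GB_LOW, OBJECT_STORAGE_PER_GB))
    print("ratio (high):", price_ratio(IN_MEMORY_PER_GB_HIGH, OBJECT_STORAGE_PER_GB))
```

At the quoted prices, the 100x figure matches the low end of the in-memory range, and the high end works out to about 250x.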
Notable Moment
Eskildsen discovered the vector database cost problem when calculating that storing Readwise article embeddings would cost $30,000 monthly versus $3,000 for their entire Postgres database, revealing a 10x cost amplification blocking AI feature adoption.