Snap’s Secret to Processing 10 Petabytes a Day: GPU-Accelerated Spark | NVIDIA AI Podcast Ep. 298
Episode: 23 min
Read time: 2 min
Topics: Investing, Fundraising & VC, Artificial Intelligence
AI-Generated Summary
Key Takeaways
- ✓ GPU workload benchmarking by job type: Before committing to GPU acceleration, benchmark each distinct Spark job category separately. Snap found that join-heavy jobs achieved a 3x+ speedup and union jobs reached 2x, while aggregation jobs reached only 1.5x because CPUs already handle aggregations efficiently. Matching GPU investment to job type prevents overspending on workloads that won't benefit proportionally.
- ✓ Zero-code migration via NVIDIA Spark Rapids: NVIDIA Spark Rapids integrates into existing PySpark workloads without any code changes; it requires only environment and container image configuration. For teams managing large Spark pipelines, this means GPU acceleration can be evaluated and deployed without rewriting jobs, which reduces migration risk and engineering time.
- ✓ Repurpose idle inference GPUs for batch workloads: Snap identified that online serving GPUs sat idle between 1 AM and 5 AM while major markets slept. By migrating batch Spark jobs onto Kubernetes-managed GKE clusters already hosting inference workloads, teams can reclaim unused GPU capacity at near-zero incremental cost, provided preemption logic returns resources immediately when live traffic spikes.
- ✓ Build graceful fallback chains for production reliability: Snap engineered a three-tier fallback: GPU-accelerated Spark on GKE → CPU-based Spark on GKE → Dataproc clusters.
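As a rough illustration of the per-job-type benchmarking takeaway, the sketch below turns the quoted speedups into a relative cost per job type. The price ratio between a GPU node-hour and a CPU node-hour is a hypothetical number chosen for the example, not a figure from the episode, and the function names are illustrative. The only relationship used is that cost is runtime times hourly rate, so a GPU run is cheaper only when the speedup exceeds the price ratio.

```python
# Speedups quoted in the summary; "3x+" is treated as a lower bound of 3.0.
SPEEDUP_BY_JOB_TYPE = {"join": 3.0, "union": 2.0, "aggregation": 1.5}

# Hypothetical assumption: one GPU node-hour costs 1.8x one CPU node-hour.
GPU_TO_CPU_PRICE_RATIO = 1.8


def relative_gpu_cost(speedup: float, price_ratio: float) -> float:
    """Cost of the GPU run as a fraction of the CPU run (below 1.0 is cheaper)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return price_ratio / speedup


def jobs_worth_moving(speedups: dict, price_ratio: float) -> list:
    """Return job types whose GPU run is cheaper than the CPU run, cheapest first."""
    costs = {job: relative_gpu_cost(s, price_ratio) for job, s in speedups.items()}
    cheaper = [job for job, cost in costs.items() if cost < 1.0]
    return sorted(cheaper, key=costs.get)


if __name__ == "__main__":
    for job, speedup in SPEEDUP_BY_JOB_TYPE.items():
        cost = relative_gpu_cost(speedup, GPU_TO_CPU_PRICE_RATIO)
        print(f"{job}: relative GPU cost {cost:.2f}")
    print(jobs_worth_moving(SPEEDUP_BY_JOB_TYPE, GPU_TO_CPU_PRICE_RATIO))
```

Under the assumed price ratio, a 1.5x aggregation speedup would not pay for itself, which is the kind of result that per-category benchmarking is meant to surface before hardware is committed.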
What It Covers
Snap's head of engineering platforms, Pruevi Vatala, details how the company migrated its 10-petabyte-per-day A/B testing experimentation pipeline to GPU-accelerated Apache Spark using NVIDIA Spark Rapids on Google Cloud, achieving 76% cost reduction while serving nearly one billion monthly active users.
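To show what a configuration-only migration can look like, here is a minimal sketch that builds a `spark-submit` command for the same PySpark script with and without GPU settings. The `spark.plugins`, `spark.rapids.sql.enabled`, and GPU resource keys are settings documented for the RAPIDS Accelerator for Apache Spark; the jar path, the values, the script name, and the helper function are hypothetical, and Snap's actual configuration is not described in the summary. A real Kubernetes deployment would likely need further settings (for example GPU discovery) that are omitted here.

```python
# Settings documented for the RAPIDS Accelerator; the values are illustrative.
GPU_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
    "spark.executor.resource.gpu.amount": "1",
    "spark.task.resource.gpu.amount": "0.25",
}

# Hypothetical location of the plugin jar inside the container image.
DEFAULT_RAPIDS_JAR = "/opt/sparkRapidsPlugin/rapids-4-spark.jar"


def build_submit_command(script, base_conf, use_gpu, rapids_jar=DEFAULT_RAPIDS_JAR):
    """Build a spark-submit argument list; the job script itself is unchanged."""
    conf = dict(base_conf)
    command = ["spark-submit"]
    if use_gpu:
        conf.update(GPU_CONF)
        command += ["--jars", rapids_jar]
    for key in sorted(conf):
        command += ["--conf", f"{key}={conf[key]}"]
    command.append(script)
    return command


if __name__ == "__main__":
    base = {"spark.executor.memory": "8g"}
    print(" ".join(build_submit_command("ab_test_metrics.py", base, use_gpu=False)))
    print(" ".join(build_submit_command("ab_test_metrics.py", base, use_gpu=True)))
```

Because the GPU and CPU commands differ only in configuration, the same script can be submitted either way, which is what makes side-by-side evaluation cheap.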
Notable Moment
Snap discovered that GPU capacity for its data pipelines already existed inside the company — sitting completely unused overnight on inference servers. Recognizing that a social platform's usage follows a daily cycle turned an infrastructure bottleneck into a solved problem without purchasing additional hardware.
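The daily-cycle idea can be sketched as a small capacity rule: batch jobs may borrow GPUs only inside the overnight window, and live serving demand always takes priority. The 1 AM to 5 AM window comes from the summary; the time zone handling, GPU counts, and function names are assumptions made for illustration and do not describe Snap's scheduler.

```python
from datetime import time

# Overnight idle window quoted in the summary (time zone is an assumption).
IDLE_START = time(1, 0)
IDLE_END = time(5, 0)


def in_idle_window(now: time) -> bool:
    """True when `now` falls inside the overnight idle window."""
    return IDLE_START <= now < IDLE_END


def gpus_for_batch(now: time, total_gpus: int, serving_demand: int) -> int:
    """GPUs that batch jobs may hold right now; serving demand is met first."""
    if not in_idle_window(now):
        return 0
    return max(0, total_gpus - serving_demand)


def gpus_to_preempt(batch_held: int, now: time, total_gpus: int, serving_demand: int) -> int:
    """GPUs batch jobs must hand back immediately, e.g. after a traffic spike."""
    allowed = gpus_for_batch(now, total_gpus, serving_demand)
    return max(0, batch_held - allowed)


if __name__ == "__main__":
    # Quiet night: serving needs 20 of 100 GPUs, so batch may use the rest.
    print(gpus_for_batch(time(2, 30), total_gpus=100, serving_demand=20))
    # Traffic spike at 3 AM: serving now needs 50, batch holds 80.
    print(gpus_to_preempt(80, time(3, 0), total_gpus=100, serving_demand=50))
```

The preemption function is the part that keeps the approach safe: borrowed capacity is only useful if it can be returned as soon as live traffic needs it.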
Books, tools, and gear mentioned in this episode
Tools
by NVIDIA
“GPU-accelerated Apache Spark using NVIDIA Spark Rapids on Google Cloud, achieving 76% cost reduction”
by Google
“Snap engineered a three-tier fallback: GPU-accelerated Spark on GKE → CPU-based Spark on GKE → Dataproc clusters”
by NVIDIA
“NVIDIA Ether assisted by auto-tuning Spark parameters across environments, keeping performance consistent”
“migrated its 10-petabyte-per-day A/B testing experimentation pipeline to GPU-accelerated Apache Spark using NVIDIA Spark Rapids on Google Cloud”
by Google
“By migrating batch Spark jobs onto Kubernetes-managed GKE clusters already hosting inference workloads”
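The three-tier fallback quoted above (GPU-accelerated Spark on GKE, then CPU-based Spark on GKE, then Dataproc clusters) follows a common pattern: try each tier in order and move on when one fails. The sketch below is a generic version of that pattern under stated assumptions; the tier labels, the stand-in submit functions, and the error type are hypothetical and are not Snap's implementation.

```python
from typing import Callable, Sequence, Tuple

# A tier is a label plus a function that submits a job and returns a run id.
Tier = Tuple[str, Callable[[str], str]]


class AllTiersFailed(RuntimeError):
    """Raised when every tier in the chain has failed."""


def run_with_fallback(job_id: str, tiers: Sequence[Tier]) -> Tuple[str, str]:
    """Try each tier in order; return (tier name, run id) for the first success."""
    errors = []
    for name, submit in tiers:
        try:
            return name, submit(job_id)
        except Exception as exc:  # any failure moves on to the next tier
            errors.append(f"{name}: {exc}")
    raise AllTiersFailed("; ".join(errors))


# Stand-in submitters for illustration only; real ones would call a cluster API.
def submit_gpu_gke(job_id: str) -> str:
    raise RuntimeError("no GPU capacity available")


def submit_cpu_gke(job_id: str) -> str:
    return f"cpu-gke-run-{job_id}"


def submit_dataproc(job_id: str) -> str:
    return f"dataproc-run-{job_id}"


TIERS = [
    ("gpu-spark-gke", submit_gpu_gke),
    ("cpu-spark-gke", submit_cpu_gke),
    ("dataproc", submit_dataproc),
]

if __name__ == "__main__":
    print(run_with_fallback("daily-ab-metrics", TIERS))
```

Collecting the per-tier errors before raising keeps the failure of the whole chain diagnosable, since the final exception names what went wrong at each step.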