Practical AI

Open Source Self-Driving with Comma AI

April 16, 2026

46 min episode · 2 min read

Harald Schäfer

Topics

Fundraising & VC, Design & UX, Artificial Intelligence

AI-Generated Summary

Published Apr 16, 2026

Key Takeaways

✓ End-to-End Training Architecture: Comma AI trains its models directly on hundreds of millions of miles of human driving data, with no intermediate detection layers such as lane-line segmentation or traffic-light classifiers. The model takes raw camera video as input and outputs two values: longitudinal acceleration and road curvature. This minimal output design keeps the on-device model small enough to run on a phone-grade chip.
✓ Diffusion Simulator as Training Environment: Instead of a classical depth-reprojection simulator, Comma AI now trains its driving policy inside a learned video simulator built on diffusion models. The critical differentiator is input-response accuracy: if the simulator is told the car turns left 10 degrees, the generated video must reflect exactly that turn, not merely look photorealistic. That fidelity is what makes the simulator viable for robotics training.
✓ Compute Gap and Its Practical Ceiling: Comma AI's current device has roughly 1/100th the compute of a Tesla FSD computer. Highway performance is nonetheless comparable, because each marginal real-world improvement requires an exponential increase in compute. A planned external GPU add-on targeting 100x more compute is projected to roughly double detection reliability in nuanced situations such as ambiguous traffic lights.
✓ Continual Learning as an Unsolved Requirement: OpenPilot currently uses classical optimization to learn vehicle-specific parameters, such as tire stiffness and friction coefficients, live during each drive. Inflating the tires or driving in rain changes the vehicle dynamics, and the system must adapt in real time. Standard neural network approaches cannot yet handle this live adaptation, which makes continual learning one of three critical unsolved problems for production autonomy.
✓ Open Source as a Functional Requirement, Not Just Philosophy: Supporting hundreds of car models depends on community contributors reverse-engineering each vehicle's CAN bus signals, an ecosystem that could not scale around a closed-source stack. Comma AI therefore treats open-sourcing the car-interface layer as a structural necessity. It also holds the philosophical position that device owners should have full visibility into, and control over, the software running on hardware they purchase.
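
To make the continual-learning takeaway concrete, here is a minimal sketch of live parameter estimation using scalar recursive least squares with a forgetting factor. It is an illustration only, not OpenPilot's actual estimator: the class name `ScalarRLS`, the `simulate` helper, the single-gain vehicle model, and all numeric values are assumptions made for this example. The idea it shows is that a classical estimator can track a vehicle parameter that shifts mid-drive (say, after the road gets wet) because older samples are gradually discounted.

```python
from dataclasses import dataclass


@dataclass
class ScalarRLS:
    """Recursive least squares for y ~= theta * x, with exponential forgetting."""
    theta: float = 0.0        # current parameter estimate
    p: float = 1000.0         # estimate variance (large means "very uncertain")
    forgetting: float = 0.95  # values below 1 let old samples fade out

    def update(self, x: float, y: float) -> float:
        # Gain: how strongly the new sample should move the estimate.
        k = self.p * x / (self.forgetting + x * self.p * x)
        # Correct the estimate by the prediction error on this sample.
        self.theta += k * (y - self.theta * x)
        # Reduce the variance for the information gained, then inflate it
        # again through the forgetting factor so the estimator stays adaptive.
        self.p = (self.p - k * x * self.p) / self.forgetting
        return self.theta


def simulate(true_gains, samples_per_phase=200):
    """Feed noiseless samples while the true gain changes between phases."""
    estimator = ScalarRLS()
    history = []
    for gain in true_gains:
        for i in range(samples_per_phase):
            x = 0.5 + 0.01 * (i % 50)  # hypothetical input signal
            y = gain * x               # hypothetical measured response
            history.append(estimator.update(x, y))
    return history


if __name__ == "__main__":
    estimates = simulate([2.0, 1.5])  # the gain drops mid-drive
    print(f"end of phase 1: {estimates[199]:.3f}")
    print(f"end of phase 2: {estimates[-1]:.3f}")
```

A forgetting factor closer to 1 gives smoother estimates but slower adaptation; the value used here is chosen only to make the tracking behaviour visible within a few hundred samples.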

What It Covers

Harald Schäfer, CTO at Comma AI, explains how OpenPilot, the most popular open source robotics project on GitHub, uses end-to-end machine learning and a diffusion-based world model simulator to deliver highway autonomy across supported vehicles. He also outlines three unsolved problems blocking full autonomous driving: controls, reinforcement learning, and continual learning.
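
As a rough illustration of what a two-value policy output means and of the input-response property a training simulator has to satisfy, the sketch below rolls an (acceleration, curvature) command sequence through a simple kinematic point model. Everything in it is an assumption made for illustration: the `State`, `step`, `rollout`, and `curvature_for_turn` names, the kinematic model itself, and the speeds and durations. Comma AI's simulator generates video with diffusion models; this toy only shows the geometric relationship that such a simulator would need to respect, namely that yaw rate equals speed times curvature.

```python
import math
from dataclasses import dataclass


@dataclass
class State:
    x: float = 0.0        # metres, forward
    y: float = 0.0        # metres, positive to the left
    heading: float = 0.0  # radians, counter-clockwise positive
    speed: float = 0.0    # metres per second


def step(s: State, accel: float, curvature: float, dt: float) -> State:
    """Advance one Euler step of a kinematic point model."""
    # Yaw rate is speed times curvature (curvature = 1 / turn radius).
    heading = s.heading + s.speed * curvature * dt
    return State(
        x=s.x + s.speed * math.cos(s.heading) * dt,
        y=s.y + s.speed * math.sin(s.heading) * dt,
        heading=heading,
        speed=max(0.0, s.speed + accel * dt),  # no reversing in this toy
    )


def rollout(s: State, commands, dt: float = 0.01) -> State:
    """Apply a sequence of (acceleration, curvature) commands."""
    for accel, curvature in commands:
        s = step(s, accel, curvature, dt)
    return s


def curvature_for_turn(angle_deg: float, speed: float, duration: float) -> float:
    """Constant curvature giving the requested heading change at constant speed."""
    return math.radians(angle_deg) / (speed * duration)


if __name__ == "__main__":
    # A 10-degree left turn at a constant 20 m/s over 2 seconds.
    kappa = curvature_for_turn(10.0, speed=20.0, duration=2.0)
    final = rollout(State(speed=20.0), [(0.0, kappa)] * 200, dt=0.01)
    print(f"heading change: {math.degrees(final.heading):.3f} degrees")
```

A check like this, comparing the commanded turn against the motion actually produced, is the kind of input-response test the summary describes, applied here to plain geometry instead of generated video.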

Notable Moment

Harald Schäfer referenced the early Waymo TED talk in which a founder predicted his young children would never need driver's licenses; those children now have licenses. He used this to show how far autonomous driving has come while setting realistic expectations about how far it still has to go.
