Latent Space

🔬Beyond AlphaFold: How Boltz is Open-Sourcing the Future of Drug Discovery

81 min episode · 2 min read

Topics

Science & Discovery

AI-Generated Summary

Key Takeaways

  • Training Under Constraints: Boltz-1 was trained only once due to compute limitations, requiring live debugging during training runs. The team stopped training mid-run to fix bugs, then resumed without restarting from scratch. They used Department of Energy cluster resources with two-day training windows followed by week-long queue waits, eventually completing training with help from Genesis compute resources.
  • Coevolution as Structural Hints: AlphaFold-family models decode evolutionary patterns: amino acid positions that mutate together across species tend to sit close together in the 3D structure. This coevolutionary data acts like a database lookup, providing strong priors that guide models toward the approximate solution space before physics-based refinement finds low-energy states. Without this evolutionary signal, models struggle on novel proteins.
  • Generative Modeling Over Regression: AlphaFold3 shifted from predicting single structures to sampling posterior distributions of possible conformations. This generative approach handles uncertainty better than regression, which averages conflicting predictions into incorrect structures. The architecture uses diffusion models with cubic computational complexity from pairwise operations, requiring fewer parameters but more compute than language models.
  • Validation Through Distributed Testing: Boltz coordinated 25 academic and industry labs to test designs across diverse applications, reporting results from 8-10 labs in their paper. For nanobodies targeting 14 novel proteins with no known interactions in training data, they achieved nanomolar binders on two-thirds of targets using just 15 designs per target, demonstrating true generalization beyond training distribution.
  • Atomic-Level Sequence Prediction: BoltzGen predicts both protein structure and sequence simultaneously by encoding amino acids through atomic composition. The model receives blank tokens for designed proteins and predicts atomic positions, which implicitly determine amino acid identity since different residues have unique atomic arrangements. This unified supervision signal scales better than separate discrete and continuous objectives.
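The coevolution signal described above can be made concrete with a toy calculation. The sketch below (an illustration, not Boltz's actual pipeline) computes mutual information between columns of a tiny made-up multiple sequence alignment: columns that covary strongly are the ones structure predictors treat as likely 3D contacts.

```python
import math
from collections import Counter

def column_mi(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High MI means residues at the two positions tend to mutate together
    across sequences -- the coevolution signal that structure predictors
    exploit as a proxy for spatial contact.
    """
    n = len(msa)
    pi = Counter(seq[i] for seq in msa)          # marginal of column i
    pj = Counter(seq[j] for seq in msa)          # marginal of column j
    pij = Counter((seq[i], seq[j]) for seq in msa)  # joint distribution
    mi = 0.0
    for (a, b), count in pij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((pi[a] / n) * (pj[b] / n)))
    return mi

# Toy alignment: positions 0 and 2 covary perfectly (A<->V, S<->T),
# while position 1 is fully conserved and carries no pairwise signal.
msa = ["AGV", "AGV", "SGT", "SGT", "AGV", "SGT"]
print(column_mi(msa, 0, 2))  # 1.0 bit: perfectly coupled columns
print(column_mi(msa, 0, 1))  # 0.0: a conserved column gives no signal
```

Real pipelines use corrected statistics (e.g. average-product correction) and far deeper alignments, but the intuition is the same: coupled columns hint at contacts.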
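The regression-averaging failure that motivated AlphaFold3's generative turn can also be shown with a one-dimensional toy (not Boltz's model, just an illustration). A squared-error regressor trained on a bimodal target converges to the mean of the modes, a "structure" that never actually occurs, while sampling preserves the valid conformations.

```python
import random

# Toy 1-D "conformation": a flexible region sits at either x = -1.0 or
# x = +1.0 with equal probability -- two distinct valid structures.
random.seed(0)
samples = [random.choice([-1.0, 1.0]) for _ in range(10_000)]

# A squared-error regressor converges to the conditional mean, which
# here is ~0.0 -- an averaged structure that never occurs in the data.
mse_prediction = sum(samples) / len(samples)

# A generative model instead learns to *sample* the posterior; drawing
# from the empirical distribution recovers the two real conformations.
draws = [random.choice(samples) for _ in range(5)]

print(round(mse_prediction, 2))           # near 0.0: invalid average
print(all(d in (-1.0, 1.0) for d in draws))  # True: draws are real modes
```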

What It Covers

Gabriele Corso and Jeremy Wohlwend from Boltz explain how they open-sourced protein structure prediction after AlphaFold3 remained proprietary. They trained their model once with limited compute, fixing bugs mid-training, and built BoltzLab to democratize drug discovery through accessible AI tools for designing proteins and small molecules that bind therapeutic targets.

Key Questions Answered

  • Infrastructure Cost Advantage: Running Boltz models on their platform costs significantly less than self-hosting open-source versions. Their small molecule screening pipeline runs 10x faster than open-source implementations through optimization. Platform users can parallelize 100,000 candidate designs across GPU fleets, completing in minutes what would take weeks serially, amortizing compute costs across customers.
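The screening workload described above is embarrassingly parallel: each candidate design is scored independently, so throughput scales with the number of workers. The sketch below uses Python's standard library to fan a batch of candidates across a worker pool; `score_design` is a hypothetical stand-in for a real structure or affinity prediction job, not a Boltz API.

```python
from concurrent.futures import ThreadPoolExecutor

def score_design(design_id: int) -> tuple[int, float]:
    """Hypothetical stand-in for one scoring job; on a real platform
    this would be a structure/affinity prediction running on a GPU."""
    score = (design_id * 2654435761 % 1000) / 1000.0  # deterministic placeholder
    return design_id, score

def screen(n_designs: int, top_k: int = 5) -> list[tuple[int, float]]:
    """Fan n independent candidate evaluations out across workers and
    keep the best-scoring designs -- the same embarrassingly parallel
    shape as spreading 100,000 designs over a GPU fleet."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(score_design, range(n_designs)))
    return sorted(results, key=lambda r: r[1], reverse=True)[:top_k]

if __name__ == "__main__":
    print(screen(1_000))  # top 5 candidates by placeholder score
```

With independent jobs like this, wall-clock time divides by the worker count (minus scheduling overhead), which is why a fleet can finish in minutes what a single machine would grind through for weeks.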

Notable Moment

The team revealed their flagship model went through an unrepeatable training curriculum because they could only afford one training run. While the model trained, they discovered and fixed bugs on the fly, stopping and restarting training multiple times without returning to the beginning. This improvised approach somehow produced a working model that matched AlphaFold3 performance despite the chaotic development process.
