The Long Run with Luke Timmerman

Ep189: Marc Tessier-Lavigne on Reinventing Drug Discovery with AI

72 min episode · 3 min read

Topics

Artificial Intelligence, Science & Discovery

AI-Generated Summary

Key Takeaways

  • Drugging the Undruggable: Xaira's near-term AI advantage targets multipass membrane proteins with minimal extracellular surface area — targets where traditional antibody screening yields few hits. Because generative AI designs directly to a specified epitope rather than screening a library, it bypasses the physical constraints that make these proteins difficult to drug, opening a category of targets previously considered inaccessible to antibody therapeutics.
  • Generative Protein Design Loop: Xaira runs large-scale design-make-test cycles using RF Diffusion and RF Antibody models, generating millions of AI designs, synthesizing hundreds of thousands in the wet lab, and feeding results back to retrain models. This iterative loop, built on David Baker's 2023–2024 breakthroughs, progressively compresses the path from initial hit to optimized lead candidate, with the ambition of eventually producing near-development-candidate molecules computationally.
  • Causal Biology vs. Descriptive Biology: Existing single-cell foundation models like scGPT excel at descriptive tasks — classifying cell types, identifying gene expression differences — but fail at causal questions such as which genetic perturbation returns a diseased cell to a healthy state. Xaira's strategy is to generate large-scale perturbation datasets (perturbing one gene at a time across 20,000 genes) to train models that understand biological causality, not just correlation.
  • Perturb-seq as a Public Scaffold: In June 2024, Xaira published the largest Perturb-seq dataset to date, covering genome-scale perturbations across cell lines, and made it freely available. The dataset generated 64,000 downloads in three months. The rationale: building causal models of biology requires a broad data scaffold across many cell types, which no single organization can generate alone, making open contribution a strategic accelerant rather than a competitive loss.
  • Platform-Product Synergy: Xaira structures itself as both a platform and a product company deliberately. Therapeutic programs generate proprietary wet-lab data that trains and validates AI models, while the models guide which targets to pursue. Without pipeline programs, AI development risks drifting toward academically interesting but therapeutically irrelevant directions. This dual structure also justifies the $1 billion capital raise, since both platform infrastructure and clinical-stage programs require sustained, parallel investment.

What It Covers

Marc Tessier-Lavigne, CEO of Xaira Therapeutics, outlines how the South San Francisco startup deploys AI across all three stages of drug discovery — target identification, molecular design, and patient matching — backed by $1 billion in committed capital and Nobel laureate David Baker's protein design technology, with the goal of halving drug development timelines from 13 to 6.5 years.

Key Questions Answered

  • R&D Efficiency Targets for 2035: Current industry benchmarks sit at roughly 13 years from target identification to FDA approval and a 10% clinical success rate. Tessier-Lavigne sets a concrete decade-long aspiration: cut timelines to 6.5 years and double or triple success rates to 20–40%. He frames this not as an achievement of AI alone but as AI acting as a primary contributor alongside other advances, arguing that the industry's current attrition rate is economically unsustainable.
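The efficiency targets above imply a large multiplicative gain. A back-of-envelope calculation (my arithmetic, not figures quoted in the interview) shows why attrition and timeline compound:

```python
# Expected number of clinical programs needed per approved drug is the
# reciprocal of the success rate; multiplying by timeline gives a rough
# "program-years per approval" figure.
def programs_per_approval(success_rate: float) -> float:
    return 1.0 / success_rate

baseline = programs_per_approval(0.10)   # ~10 programs per approval today
target_lo = programs_per_approval(0.20)  # 5 programs at a 20% success rate
target_hi = programs_per_approval(0.40)  # 2.5 programs at a 40% success rate

# Combine with halving the timeline (13 -> 6.5 years).
baseline_effort = baseline * 13                              # ~130 program-years
target_effort = (target_hi * 6.5, target_lo * 6.5)           # ~16 to ~33
print(baseline_effort, target_effort)
```

Hitting both targets would cut the program-years of clinical effort per approval by roughly 4x to 8x, which is the economic argument behind the "unsustainable attrition" framing.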

Notable Moment

Tessier-Lavigne describes how his decision to study physiology and philosophy at Oxford — reportedly the first such combination in five years — directly shaped his systems-level thinking about biology. The pairing that seemed academically illogical at the time became the cognitive foundation for approaching drug discovery as a causal reasoning problem decades later.
