The Long Run with Luke Timmerman

Ep202: Becky Pferdehirt on Reimagining Science for the AI Era

68 min episode · 3 min read

Topics

Artificial Intelligence, Science & Discovery

AI-Generated Summary

Key Takeaways

  • Publishing policy as leverage point: Radial prohibits researchers from spending time or money on traditional journal articles. Instead, all outputs must be shared openly and immediately under open licenses in FAIR-compliant venues. The rationale: journal publications are the "gravitational center of dysfunction," driving data hoarding, delayed release of methods, and narrative-driven science that distorts what research actually gets done.
  • Diffuse scattering as untapped AI training data: X-ray crystallography generates a secondary signal called diffuse scattering that crystallographers have discarded as background noise for decades. Radial's Diffuse Project hypothesizes this signal encodes protein conformational dynamics — the ensemble of structures a protein adopts — and is now building a pipeline from sample prep through PDB-compatible databases to test whether this data can train next-generation protein dynamics models beyond AlphaFold.
  • Industry platforms as public-good data generators: Rather than defaulting to academic labs for open-source data generation, Radial funds biotech companies with existing platforms to produce public datasets. The ADME data project with Octant demonstrates this: companies share data that isn't their competitive moat (composition of matter remains proprietary) while philanthropy funds the generation cost, producing universally accessible predictive models for absorption, distribution, metabolism, and excretion.
  • Funding criteria for philanthropic capital deployment: Radial applies a three-question filter before committing resources: Is this something only Radial could do, or only Radial would do? If funded, would the work be done differently and better? If the answer is simply "more funding solves it," a grant suffices. If coordination and full-time talent are required, Radial builds an internal hub with external collaborators. If no one is doing it, Radial hires directly.
  • Data-driven science selection over fundability: Too much research is designed around what grant committees will approve rather than what information would most advance a field. Radial's approach asks: if maximizing informational delta were the goal, what experiments would you run? This reframes funding decisions toward filling gaps in a knowledge graph of biology rather than adding incremental omics datasets to already-crowded areas.

What It Covers

Becky Pferdehirt, CEO of Radial at the Astera Institute, explains how a $500M philanthropic commitment aims to fix structural failures across academia, industry, and venture capital by building open-source data infrastructure, rethinking publication incentives, and creating the data conditions necessary for AI to advance biological understanding.

Key Questions Answered

  • Industry postdocs as underutilized career paths: Scientists wanting translational exposure without permanently leaving research should consider industry postdocs at companies like Genentech or Amgen. These positions offer publishable basic research, drug discovery mentorship, team science experience, and exposure to multiple career paths — all within a structured safety net. The hybrid format keeps academic and industry doors open simultaneously, unlike a traditional postdoc or entry-level industry role.

Notable Moment

Pferdehirt describes a scenario where a researcher publishes raw experimental data — method, code, results — immediately at week's end without writing a narrative paper. That data then gets ingested by an LLM, improves model fine-tuning, and connects with another lab's dataset to demonstrate reproducibility, all without a single journal submission.
