Axial Podcast

Computational Protein Design with Costas Maranas

49 min episode · 2 min read

AI-Generated Summary

Key Takeaways

  • Inverse Protein Design: The core unsolved challenge in computational protein engineering is the inverse folding problem — given a desired protein structure and function, determine which amino acid sequence produces it. Current biophysical force fields like AMBER and CHARMM carry significant uncertainty, meaning even powerful search algorithms succeed only a fraction of the time, far below the 20–40% hit rate that would make wet lab screening practical.
  • Negative Data Generation: Training machine learning models on protein function requires balanced datasets of both working and non-working variants, yet most published datasets contain only positive results. Maranas advocates for a moonshot-style initiative: systematically engineer hundreds of enzymes spanning diverse EC classifications across prokaryotic, eukaryotic, and archaeal organisms, generating unbiased positive and negative variant data to properly train protein language models.
  • Mathematical Reframing Over Raw Compute: When computational limits block biological design problems, reframing them using established mathematical structures, such as mixed-integer linear programming borrowed from airline scheduling and warehouse logistics, can unlock solutions. Maranas used this approach to design microbial strains requiring up to 10 simultaneous gene knockouts, a set of edits once considered impossible to build that CRISPR can now carry out in an afternoon.
  • Computational-Experimental Collaboration Protocol: Productive wet lab partnerships require computational researchers to become genuine domain experts in the experimental partner's organism and methods, not vice versa. Maranas estimates it takes multiple back-and-forth cycles — where computational suggestions are rejected, models are updated, and new suggestions are made — before a collaboration becomes reliably productive. Selecting collaborators for personal compatibility, not just scientific overlap, is equally critical.
  • Top-Down Genome Streamlining: Rather than building minimal cells from scratch, a pragmatic near-term strategy is stripping 10–20% of dispensable DNA from proven production strains like E. coli or yeast. Removing these dispensable segments reduces replication burden and eliminates metabolic pathways that could accidentally activate and divert carbon flux away from the target product, improving both predictability and yield in bioreactor deployments.
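
The knockout-design idea in the "Mathematical Reframing" takeaway can be illustrated with a toy sketch. This is not Maranas's formulation or data: the routes, genes, yields, and limits below are all hypothetical, and exhaustive enumeration stands in for a real MILP solver. Each gene gets a binary keep-or-knock-out decision, the cell responds by maximizing its own growth, and the designer picks the knockout set that forces the most product while keeping growth above a floor.

```python
from itertools import combinations

# Hypothetical toy network (illustrative numbers only).
# route -> (genes the route needs, biomass yield, product yield) per unit substrate
ROUTES = {
    "r1": (("g1",), 0.10, 0.00),
    "r2": (("g2",), 0.08, 0.10),
    "r3": (("g3",), 0.06, 0.50),
    "r4": (("g4",), 0.05, 0.90),
    "r5": (("g2", "g5"), 0.02, 1.00),
}
GENES = ("g1", "g2", "g3", "g4", "g5")
UPTAKE = 10.0       # substrate available to the cell (assumed)
MIN_BIOMASS = 0.4   # growth floor the designed strain must still reach (assumed)


def cell_response(knockouts):
    """Inner problem: the cell sends all substrate through the surviving
    route with the highest biomass yield (the optimum of a one-constraint LP)."""
    removed = set(knockouts)
    active = [(b, p) for genes, b, p in ROUTES.values() if not removed & set(genes)]
    if not active:
        return 0.0, 0.0
    biomass_yield, product_yield = max(active, key=lambda bp: bp[0])
    return biomass_yield * UPTAKE, product_yield * UPTAKE


def best_design(max_knockouts):
    """Outer problem: try every knockout set up to the size limit and keep
    the one with the most product that still meets the growth floor."""
    biomass, product = cell_response(())
    best = ((), biomass, product)
    for k in range(1, max_knockouts + 1):
        for ko in combinations(GENES, k):
            biomass, product = cell_response(ko)
            if biomass >= MIN_BIOMASS and product > best[2]:
                best = (ko, biomass, product)
    return best


if __name__ == "__main__":
    knockouts, biomass, product = best_design(3)
    print(knockouts, biomass, product)
```

Enumeration only works here because the toy has five genes. Choosing 10 knockouts from thousands of genes yields far too many combinations to list, which is where encoding the binary choices as a mixed-integer linear program and handing it to a solver becomes useful.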

What It Covers

Penn State chemical engineering professor Costas Maranas discusses how computational methods — specifically optimization algorithms, biophysical force fields, and emerging transformer models — can engineer proteins, enzymes, and microbial strains to perform functions nature never evolved them to do, and why data quality remains the central bottleneck.

Notable Moment

Maranas describes attending an operations research conference in the late 1990s where genome assembly researchers presented to a room of mathematicians who were entirely disengaged. Recognizing that his cross-disciplinary background uniquely positioned him to bridge that gap became the moment he committed to redirecting his entire lab toward computational biology.
