Computational Protein Design with Costas Maranas
Episode: 49 min
Read time: 2 min
Topics: Design & UX
AI-Generated Summary
Key Takeaways
- ✓ Inverse Protein Design: The core unsolved challenge in computational protein engineering is the inverse folding problem: given a desired protein structure and function, determine which amino acid sequence produces it. Current biophysical force fields such as AMBER and CHARMM carry significant uncertainty, so even powerful search algorithms succeed only a fraction of the time, far below the 20–40% hit rate that would make wet lab screening practical.
- ✓ Negative Data Generation: Training machine learning models on protein function requires balanced datasets of both working and non-working variants, yet most published datasets contain only positive results. Maranas advocates a moonshot-style initiative: systematically engineer hundreds of enzymes spanning diverse EC classifications across bacterial, archaeal, and eukaryotic organisms, generating unbiased positive and negative variant data to properly train protein language models.
- ✓ Mathematical Reframing Over Raw Compute: When computational limits block biological design problems, reframing them using established mathematical structures, such as the mixed-integer linear programming used in airline scheduling and warehouse logistics, can unlock solutions. Maranas used this approach to design microbial strains requiring up to 10 simultaneous gene knockouts. Building such strains was once considered impossible; with CRISPR it now takes an afternoon.
- ✓ Computational-Experimental Collaboration Protocol: Productive wet lab partnerships require computational researchers to become genuine domain experts in the experimental partner's organism and methods, not vice versa. Maranas estimates it takes multiple back-and-forth cycles, in which computational suggestions are rejected, models are updated, and new suggestions are made, before a collaboration becomes reliably productive. Selecting collaborators for personal compatibility, not just scientific overlap, is equally critical.
- ✓ Top-Down Genome Streamlining: Rather than building minimal cells from scratch, a pragmatic near-term strategy is stripping 10–20% of dispensable DNA from proven production strains like E. coli or yeast. Removing non-functional genomic segments reduces replication burden and eliminates metabolic pathways that could accidentally activate and divert carbon flux away from the target product, improving both predictability and yield in bioreactor deployments.
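Two of the takeaways above rest on simple arithmetic, sketched here in Python. The first function estimates how many designed variants a wet lab would need to screen to see at least one working design, assuming each design succeeds independently at a fixed hit rate. The second counts how many knockout sets a brute-force search would have to consider, which suggests why a structured formulation such as mixed-integer linear programming is attractive. The 95% confidence level, the independence assumption, and the pool of 1,000 candidate genes are illustrative assumptions, not figures from the episode.

```python
import math


def variants_to_screen(hit_rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one hit in n trials) >= confidence.

    Assumes independent designs with the same success probability
    (a simplification chosen for illustration).
    """
    if not 0.0 < hit_rate < 1.0:
        raise ValueError("hit_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - hit_rate)**n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hit_rate))


def knockout_combinations(n_candidates: int, max_knockouts: int) -> int:
    """Number of distinct knockout sets containing 1..max_knockouts genes."""
    return sum(math.comb(n_candidates, k) for k in range(1, max_knockouts + 1))


if __name__ == "__main__":
    for rate in (0.01, 0.20, 0.40):
        print(f"hit rate {rate:.0%}: screen {variants_to_screen(rate)} variants")
    # Hypothetical pool of 1,000 candidate genes, up to 10 knockouts at once.
    print(f"knockout sets to enumerate: {knockout_combinations(1000, 10):.3e}")
```

Under these assumptions, a 20% hit rate needs 14 variants and a 40% hit rate needs 6, while a 1% hit rate needs roughly 300. Exhaustively enumerating up to 10 knockouts from 1,000 candidates yields more than 10^23 sets, far too many to evaluate one by one.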
What It Covers
Penn State chemical engineering professor Costas Maranas discusses how computational methods — specifically optimization algorithms, biophysical force fields, and emerging transformer models — can engineer proteins, enzymes, and microbial strains to perform functions nature never evolved them to do, and why data quality remains the central bottleneck.
Notable Moment
Maranas describes attending an operations research conference in the late 1990s where genome assembly researchers presented to a room of mathematicians who were entirely disengaged. Recognizing that his cross-disciplinary background uniquely positioned him to bridge that gap became the moment he committed to redirecting his entire lab toward computational biology.