Proteomics and AI with Peter Cimermančič
Episode: 57 min
Read time: 2 min
Topics: Artificial Intelligence
AI-Generated Summary
Key Takeaways
- The 80% Dark Matter Problem: Current mass spectrometry search algorithms leave up to 80% of measured spectra unidentified, meaning most proteomics-based drug discovery operates on only 20-30% of available data. Researchers should treat preprocessing quality, not downstream analytics, as the primary bottleneck limiting biological discovery and target identification accuracy.
- Cloud Migration as Dual Solution: Moving proteomics preprocessing from single workstations to cloud infrastructure solves two problems simultaneously: parallelizing compute across hundreds of nodes accelerates processing speed, while the added compute capacity enables replacement of legacy rule-based algorithms with transformer and vision-based AI models that capture previously ignored signal.
- Absence of Signal as Data: Current scoring algorithms match spectra using barcode-style peak matching, ignoring peak intensities, non-canonical fragmentation patterns, and, crucially, the absence of expected peaks. AI models trained without these human-imposed rules learn from missing signal too, producing 70% more peptide identifications on standard human proteomes and 200-300% gains on complex metaproteomics datasets.
- Expanding the Search Space: Standard proteomics searches only consider canonical proteins longer than 50 amino acids in unmodified form. Deliberately expanding searches to include small open reading frames, post-translational modifications, and sequence variants (enabled by a sufficiently accurate scoring model) reveals biologically relevant proteins that canonical pipelines structurally cannot detect.
- Proteomics Covers the Full Drug Discovery Pipeline: Mass spectrometry proteomics applies across every drug discovery stage: affinity purification identifies protein-protein interactions for target discovery; chemoproteomics maps covalent small-molecule binding sites; immunopeptidomics identifies peptides for cancer vaccines; and plasma proteomics predicts patient treatment response and disease outcomes, consistently outperforming other omics modalities in multiomics studies.
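To make the "barcode-style peak matching" point concrete, here is a minimal Python sketch. It is not the scoring function of any particular search engine or of Tesserai's models; the function names, the 0.02 m/z tolerance, and the toy spectra are all hypothetical and chosen only for illustration. It shows how a shared-peak count gives the same score to two spectra whose intensity patterns differ, and how a per-fragment feature vector keeps both the intensities and the absence of an expected peak available to a learned model.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); hypothetical toy representation


def shared_peak_count(observed: List[Peak], theoretical_mz: List[float],
                      tol: float = 0.02) -> int:
    """Barcode-style score: count expected fragments that have any observed
    peak within `tol`. Intensities and missing peaks play no role."""
    observed_mz = [mz for mz, _ in observed]
    return sum(
        any(abs(mz - t) <= tol for mz in observed_mz)
        for t in theoretical_mz
    )


def fragment_features(observed: List[Peak], theoretical_mz: List[float],
                      tol: float = 0.02) -> List[float]:
    """One feature per expected fragment: the matched peak's share of total
    intensity, or 0.0 when the expected peak is absent. A learned scorer
    could consume this vector instead of a single count."""
    total = sum(intensity for _, intensity in observed) or 1.0
    features = []
    for t in theoretical_mz:
        matched = [i for mz, i in observed if abs(mz - t) <= tol]
        features.append(max(matched) / total if matched else 0.0)
    return features


# Illustrative values only: two spectra with the same peak positions
# but swapped intensities, scored against three expected fragments.
theoretical = [175.119, 246.156, 361.183]
spectrum_a = [(175.119, 900.0), (246.156, 50.0), (400.0, 50.0)]
spectrum_b = [(175.119, 50.0), (246.156, 900.0), (400.0, 50.0)]

count_a = shared_peak_count(spectrum_a, theoretical)
count_b = shared_peak_count(spectrum_b, theoretical)
features_a = fragment_features(spectrum_a, theoretical)
features_b = fragment_features(spectrum_b, theoretical)
```

Both spectra match two of the three expected fragments, so the count cannot tell them apart, while the feature vectors differ and both record the third fragment as missing.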
What It Covers
Peter Cimermančič, cofounder of Tesserai and former seven-year Verily researcher, explains how AI-powered preprocessing of mass spectrometry proteomics data can recover identifications from the up to 80% of spectra that conventional search algorithms leave unidentified, unlocking drug targets and biological insights that those algorithms systematically miss.
Notable Moment
Cimermančič describes how mass spectrometry studies of HIV-human protein interactions were conducted while seeing only 20% of the actual interactions. That raises a pointed question: how many viable drug targets for infectious disease have been overlooked because of preprocessing limitations rather than biological absence?