Making deep learning perform real algorithms with Category Theory (Andrew Dudzik, Petar Veličković, Taco Cohen, Bruno Gavranović, Paul Lessard)
Episode: 43 min
Read time: 2 min
Topics: Fundraising & VC, Design & UX, Artificial Intelligence
AI-Generated Summary
Key Takeaways
- ✓ Algorithmic Failure in LLMs: A large language model performs hundreds of billions of multiplications to generate a single token, yet it cannot reliably multiply two small numbers. This reveals a mismatch between how these models are trained and the downstream reasoning tasks that require correctness guarantees.
- ✓ Beyond Geometric Deep Learning: Group theory handles spatial symmetries, which are invertible, but it does not cover non-invertible computations. Dijkstra's algorithm, for example, maps many different graphs to the same output. Modeling such information-destroying transformations in algorithmic reasoning requires the broader framework of category theory.
- ✓ Two-Category Weight Tying: Two-morphisms in a categorical framework formalize when weight sharing across network layers is mathematically valid. This extends parameter sharing beyond simple copying, with provable correctness, and yields systematic architecture design principles in place of ad-hoc engineering choices.
- ✓ Carry Mechanism Challenge: Implementing arithmetic carries in a continuous, gradient-based system requires modeling state changes rather than states themselves. This operation is fundamental to CPU design, but current graph neural networks struggle to represent it. Geometric structures such as Hopf fibrations are a potential solution.
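The non-invertibility point in the takeaways above can be made concrete with a small sketch. The two graphs below (`graph_a` and `graph_b`) are made-up examples, not taken from the episode, and the `dijkstra` function is a standard textbook implementation. Both graphs have different edge weights but produce the same shortest-path distances, so the output alone cannot tell you which graph went in.

```python
import heapq


def dijkstra(graph, source):
    """Return shortest-path distances from `source` in a weighted digraph.

    `graph` maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return dist


# Two hypothetical graphs on the same nodes with different edge weights.
graph_a = {"s": {"a": 1, "b": 5}, "a": {"b": 1}, "b": {}}
graph_b = {"s": {"a": 1, "b": 2}, "a": {"b": 7}, "b": {}}

dist_a = dijkstra(graph_a, "s")
dist_b = dijkstra(graph_b, "s")

# Different inputs, identical outputs: the map graph -> distances is not invertible.
print(graph_a != graph_b, dist_a == dist_b)
```

Because the map from graphs to distance tables collapses distinct inputs, it has no inverse, which is why it falls outside the group-theoretic setting of symmetries.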
What It Covers
Category theory provides a mathematical framework for designing neural networks that can reliably execute algorithms like addition and multiplication, addressing fundamental limitations in current large language models and deep learning architectures.
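As a reference point for what "reliably executing addition" means, here is a minimal sketch of schoolbook addition on digit lists. It is ordinary Python, not a neural architecture and not the method discussed in the episode; the helper names `add_digits`, `to_digits`, and `from_digits` are illustrative assumptions. The point it shows is that the carry is a small piece of state passed from one digit position to the next, and the result is exact for inputs of any length.

```python
def to_digits(n, base=10):
    """Convert a non-negative integer to a little-endian list of digits."""
    digits = []
    while True:
        n, r = divmod(n, base)
        digits.append(r)
        if n == 0:
            return digits


def from_digits(digits, base=10):
    """Convert a little-endian digit list back to an integer."""
    return sum(d * base**i for i, d in enumerate(digits))


def add_digits(a, b, base=10):
    """Add two little-endian digit lists, threading a carry between positions."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))  # pad the shorter operand with leading zeros
    b = b + [0] * (n - len(b))
    out, carry = [], 0
    for x, y in zip(a, b):
        # The carry is the only state passed from one position to the next.
        carry, digit = divmod(x + y + carry, base)
        out.append(digit)
    if carry:
        out.append(carry)
    return out


# 999 + 1 forces a carry to propagate through every position.
print(from_digits(add_digits(to_digits(999), to_digits(1))))
```

The same per-position update is applied at every step, and only the carry changes, which is the kind of state transition the summary says current architectures struggle to represent.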
Notable Moment
The discussion reveals that asking a neural network both to translate messy real-world scenarios into structured representations and to execute algorithms robustly on a fixed computational budget places an impossible burden on it. This suggests that future systems need an explicit separation between world-understanding and algorithmic-reasoning components.