Intelligence on the Edge: Liquid AI's Ramin Hasani on the Search for Device-Native Foundation Models
Episode: 107 min
Read time: 3 min
Topics: Startups, Fundraising & VC, Leadership
AI-Generated Summary
Key Takeaways
- ✓Architecture scaling gradient: The optimal neural network architecture shifts predictably with model size. Below roughly 100 billion parameters, adding structural biases — gating mechanisms, recurrence, convolutions — improves performance for specialized tasks. Above that threshold, unstructured operators like pure matrix multiplication outperform biased alternatives. Practitioners building small, domain-specific models should actively explore gated recurrent and convolutional hybrids rather than defaulting to transformer-only architectures, which only dominate at maximum scale.
- ✓Automated Foundation Model Design (AFMD): Liquid AI's internal architecture search system evaluates 50–100 candidate operators using evolutionary strategies with actual target hardware in the loop. Critically, it optimizes against real downstream task performance across 100-plus benchmarks — not proxy metrics like perplexity. Teams building specialized models should replicate this principle: benchmark on the exact hardware and exact task the model will serve, not on general leaderboard proxies that frequently mislead architectural decisions.
- ✓LFM-2 architecture outcome: Running AFMD on CPU-class hardware consistently surfaces a double-gated 1D convolution as the dominant operator, comprising 70–80% of the resulting network layers, with a reduced number of attention layers filling the remainder. This hybrid achieves competitive quality while dramatically reducing memory footprint and latency versus pure transformer equivalents. The key retained element is input-dependent gating — not the full complexity of state space models — suggesting gating alone captures most of the representational benefit.
- ✓Edge AI market sizing: The global smartphone market generates approximately 500 billion dollars annually, and the laptop market adds another 300–400 billion, for a total of 800–900 billion dollars, or roughly one trillion dollars, in annual device shipments. Very little AI inference runs on these devices today. A 600-megabyte audio-visual model now powers Mercedes-Benz in-car voice interaction, which shows that production-grade multimodal AI already fits within automotive-grade processor constraints.
- ✓Input-dependent dynamics as core principle: Liquid AI traces its architectural philosophy to a 2022 paper introducing input-dependent state space models (Liquid S4), predating Mamba by roughly 18 months. Input dependence means the network's transformation parameters shift based on the current input during the forward pass, while the backward pass learns the dynamics of that adaptation. This second axis — learned dynamics separate from parameter count — allows smaller models to compress more knowledge than parameter count alone would predict.
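The double-gated 1D convolution and the input-dependent gating described in the takeaways above can be illustrated with a minimal sketch. This is not Liquid AI's implementation: the single channel, the sigmoid gates, and the scalar gate weights `w_in` and `w_out` are all assumptions chosen to keep the example small. It only shows the general shape of the idea, where a short causal convolution is wrapped in two multiplicative gates whose values depend on the current input.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def double_gated_conv1d(xs, kernel, w_in, w_out):
    """Causal 1D convolution wrapped in two input-dependent gates.

    xs     : input sequence for one channel (list of floats)
    kernel : short convolution kernel; kernel[0] weights the current step
    w_in   : hypothetical scalar weight of the gate applied before the conv
    w_out  : hypothetical scalar weight of the gate applied after the conv
    """
    # Gate 1: scale each input by a value computed from that same input.
    gated = [sigmoid(w_in * x) * x for x in xs]

    ys = []
    for t, x in enumerate(xs):
        # Causal convolution: step t only sees steps t, t-1, ..., t-k+1.
        acc = 0.0
        for j, k in enumerate(kernel):
            if t - j >= 0:
                acc += k * gated[t - j]
        # Gate 2: scale the conv output by another input-dependent value.
        ys.append(sigmoid(w_out * x) * acc)
    return ys


if __name__ == "__main__":
    print(double_gated_conv1d([1.0, 2.0, 3.0], [0.5, 0.25], w_in=1.0, w_out=1.0))
```

Because the kernel is short and the state is just the last few inputs, memory per layer stays constant with sequence length, unlike an attention layer whose cache grows with context. A production layer would presumably use many channels and learned projections for both gates.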
What It Covers
Liquid AI CEO Ramin Hasani explains how his company, founded out of MIT, builds device-native foundation models using automated architecture search, biological inspiration from C. elegans neural dynamics, and hardware-in-the-loop optimization. The company holds the number five spot on Hugging Face US downloads with over one million weekly downloads, and it targets the roughly one-trillion-dollar annual smartphone and laptop market with models that run inference on the device rather than in the cloud.
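The automated, hardware-in-the-loop architecture search mentioned above can be sketched as a small evolutionary loop. Everything here is hypothetical: the operator pool, the `evaluate` function, and its quality and latency numbers are toy stand-ins. In a real pipeline the fitness would come from training each candidate, scoring it on downstream benchmarks, and timing it on the target device. The sketch only shows the loop structure: score candidates, keep the best, mutate, repeat.

```python
import random

OPERATORS = ["gated_conv", "attention", "recurrent"]  # hypothetical operator pool


def evaluate(arch):
    """Toy fitness: stand-in task score minus a stand-in latency cost.

    These constants are invented for illustration; real values would be
    measured on downstream tasks and on the actual target hardware.
    """
    quality = {"gated_conv": 1.0, "attention": 1.5, "recurrent": 0.8}
    latency = {"gated_conv": 0.2, "attention": 1.0, "recurrent": 0.4}
    score = sum(quality[op] for op in arch)
    if "attention" in arch:
        score += 1.0  # toy bonus: pretend at least one attention layer helps
    cost = sum(latency[op] for op in arch)
    return score - cost


def mutate(arch, rng):
    """Return a copy of arch with one randomly chosen layer replaced."""
    child = list(arch)
    child[rng.randrange(len(child))] = rng.choice(OPERATORS)
    return child


def evolve(num_layers=8, population=12, generations=20, seed=0):
    """Evolve layer stacks; returns (best architecture, best-fitness history)."""
    rng = random.Random(seed)
    pop = [[rng.choice(OPERATORS) for _ in range(num_layers)]
           for _ in range(population)]
    history = []
    for _ in range(generations):
        pop.sort(key=evaluate, reverse=True)
        history.append(evaluate(pop[0]))
        parents = pop[: population // 2]  # elitism: the top half survives
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population - len(parents))]
        pop = parents + children
    return max(pop, key=evaluate), history


if __name__ == "__main__":
    best, history = evolve()
    print(best, round(evaluate(best), 2))
```

Because the top half of each generation survives unchanged, the best fitness never decreases. The design point the summary stresses is what sits inside `evaluate`: measured task performance and measured on-device cost, rather than a proxy such as perplexity.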
Key Questions Answered
- •Closed-form solution unlocking scale: Liquid neural networks were originally governed by differential equations descended from Lapicque's 1907 membrane potential equation, for which no closed-form solution had been known. A closed-form approximation, published in Nature Machine Intelligence in November 2022, eliminated the need for numerical solvers and enabled scaling from thousands to billions of neurons. The practical implication: biologically grounded nonlinear architectures are no longer computationally intractable and can now be trained at foundation model scale on standard GPU clusters.
- •Fine-tuning platform incoming: Liquid AI plans to release a self-serve platform that lets enterprise customers fine-tune small, task-specific models on their own data without direct engineering engagement. For practitioners with narrow, well-defined use cases (document classification, catalog search, sensor prediction), this offers a route to frontier-competitive performance at a fraction of the inference cost. The approach combines a pre-optimized architecture with supervised fine-tuning on proprietary, domain-specific datasets rather than relying on general-purpose cloud models.
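To make the closed-form point above concrete, here is a minimal scalar sketch, assuming the time-gated form from the closed-form continuous-time (CfC) paper: x(t) = σ(−f·t)·g + (1 − σ(−f·t))·h, where f, g, and h are functions of the current state and input. The linear heads and the `params` layout below are hypothetical simplifications; the published model uses neural network heads over vector states. The point is that the state at elapsed time `t` is computed in one expression, with no ODE solver loop.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def head(w, x, inp):
    """Hypothetical linear head: w = (state weight, input weight, bias)."""
    return w[0] * x + w[1] * inp + w[2]


def cfc_step(x, inp, t, params):
    """Closed-form state update for a scalar neuron after elapsed time t.

    params = (wf, wg, wh), one weight triple per head (an assumed layout).
    f sets how quickly the blend moves from g to h as t grows.
    """
    wf, wg, wh = params
    f = head(wf, x, inp)
    g = math.tanh(head(wg, x, inp))
    h = math.tanh(head(wh, x, inp))
    gate = sigmoid(-f * t)  # input- and time-dependent interpolation weight
    return gate * g + (1.0 - gate) * h


if __name__ == "__main__":
    params = ((0.0, 0.0, 1.0), (0.0, 0.0, 0.5), (0.0, 0.0, -0.5))
    for t in (0.0, 1.0, 50.0):
        print(t, cfc_step(0.3, 0.7, t, params))
```

With a positive `f`, the output starts as an even blend of `g` and `h` at `t = 0` and approaches `h` as `t` grows. Each update costs a fixed number of arithmetic operations regardless of `t`, which is what removes the numerical solver from training and inference.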
Notable Moment
Hasani describes parking a car autonomously using a control module containing just 12 liquid neurons: not 12 million, not 12,000, but 12. Flying a drone required 30. The result came from modeling the neural dynamics of the worm C. elegans, the first animal whose complete nervous system (302 neurons) was fully mapped. It demonstrates that architectural expressiveness per neuron can substitute for raw parameter count in closed-loop control tasks.