#326 Zuzanna Stamirowska: Inside Pathway's AI Systems That Work with Live, Real-Time Data

March 11, 2026

67 min episode · 2 min read

Zuzanna Stamirowska

Topics

Artificial Intelligence, Science & Discovery

AI-Generated Summary

Published Mar 12, 2026

Key Takeaways

✓ Memory at inference, not just training: BDH updates its fast weights continuously during inference, so the model retains context across a session instead of resetting each time. This targets a core LLM limitation, that every session starts from scratch, much like an employee who never accumulates experience beyond their first day on the job.
✓ State lives on edges, not nodes: Unlike transformers, where knowledge is encoded in node weights, BDH stores state on synaptic edges (fast weights) while nodes act purely as units of computation. This duality, described as drawn from quantum physics principles, allows individual synapses to represent specific concepts: the paper demonstrates a single synapse that activates consistently for the concept of currency.
✓ Sparse local dynamics reduce compute: BDH uses a graph topology in which neurons connect only to relevant neighbors, not all-to-all as in transformers. Attention scales linearly with neuron count n rather than quadratically. At inference, only a small local subgraph activates per step, so a model with a massive state accesses only a fraction of it, potentially cutting reasoning compute by 10x per output token.
✓ Model merging via graph concatenation: Two separately trained BDH graphs can be joined along the shared neuron dimension n, then fine-tuned together to form cross-domain connections. In one experiment from the paper, merging two single-language models produced coherent mixed-language output. This composability enables domain fusion, for example combining a finance-trained model and a law-trained model into one integrated reasoning system.
✓ Target use cases: small data and long-horizon reasoning: BDH's first commercial applications focus on high-value, data-scarce domains such as nuclear engineering documentation and healthcare claims resolution. The architecture's interpretability advantage, visible synapse activation patterns, supports regulated industries that require explainability. AWS customers gain access through a Pathway-NVIDIA-AWS partnership announced at AWS re:Invent in December 2024.
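
The first three takeaways (state on edges, updates at inference time, and sparse local activation) can be illustrated with a small toy. This is a minimal sketch under stated assumptions, not Pathway's implementation: the class `EdgeMemory`, its `learning_rate` and `decay` parameters, and the Hebbian-style update rule are all hypothetical names and choices introduced here for illustration.

```python
from collections import defaultdict


class EdgeMemory:
    """Toy fast-weight store: state lives on (pre, post) edges, not on nodes.

    Hypothetical illustration only; the update rule is an assumption.
    """

    def __init__(self, neighbors, learning_rate=0.5, decay=0.9):
        self.neighbors = neighbors        # neuron -> list of connected neurons
        self.learning_rate = learning_rate
        self.decay = decay
        self.fast = defaultdict(float)    # edge -> fast weight (the "memory")
        self.edges_touched = 0            # rough proxy for compute spent

    def step(self, active_pre, active_post):
        """Strengthen only edges whose two endpoints are both active.

        Edges outside the active subgraph are neither read nor written,
        so the cost of a step depends on the active set, not the full graph.
        """
        active_post = set(active_post)
        for pre in active_pre:
            for post in self.neighbors.get(pre, ()):
                if post in active_post:
                    edge = (pre, post)
                    self.fast[edge] = self.decay * self.fast[edge] + self.learning_rate
                    self.edges_touched += 1

    def recall(self, pre):
        """Return the fast weights stored on edges leaving `pre`."""
        return {
            post: self.fast[(pre, post)]
            for post in self.neighbors.get(pre, ())
            if (pre, post) in self.fast
        }


# Usage: the same edge is reinforced across two steps of one "session".
memory = EdgeMemory({0: [1, 2], 1: [2], 2: [3], 3: [0]})
memory.step(active_pre=[0], active_post=[1])
memory.step(active_pre=[0], active_post=[1])
print(memory.recall(0), memory.edges_touched)
```

In this toy, nothing is reset between calls to `step`, which is the sense in which the state persists during inference, and only two edge updates occur even though the graph has five edges.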

What It Covers

Pathway co-founder Zuzanna Stamirowska presents the Dragon Hatchling (BDH) architecture, a post-transformer neural network modeled on brain-like graph dynamics. The system stores state on edges rather than nodes, keeps persistent memory at inference time, and trains comparably to GPT-2, while targeting enterprise reasoning tasks that involve small datasets and long-horizon coherence.

Key Questions Answered

• Scale-free network structure emerges organically: During training, BDH's connection degree distribution converges toward a scale-free topology without any top-down design. This structure, common in organically grown complex systems such as the internet and social networks, provides resilience, efficient communication, and self-similarity across network sizes. The team argues these properties will support generalization and reasoning beyond what fixed transformer architectures can achieve.
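
To make "scale-free" concrete, the sketch below grows a graph with a textbook preferential-attachment process and inspects its degree distribution. This is not how BDH arrives at its topology (the summary says the structure emerges from training); the generator is a swapped-in, standard model used only to show what a heavy-tailed degree distribution looks like. The function names `preferential_attachment` and `degree_counts` are hypothetical.

```python
import random
from collections import Counter


def preferential_attachment(num_nodes, links_per_node=2, seed=0):
    """Grow a graph where new nodes prefer already well-connected nodes."""
    rng = random.Random(seed)
    core = links_per_node + 1
    # Start from a small fully connected core.
    edges = [(i, j) for i in range(core) for j in range(i)]
    # Each node appears once per incident edge, so a uniform draw from
    # this list picks a node with probability proportional to its degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(core, num_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def degree_counts(edges):
    """Map each node to its number of incident edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


edges = preferential_attachment(2000)
degrees = sorted(degree_counts(edges).values())
print("median degree:", degrees[len(degrees) // 2])
print("max degree:", degrees[-1])
```

A few hub nodes end up with degrees far above the median, which is the signature of a scale-free network: most nodes have few connections while a small number have very many.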

Notable Moment

Stamirowska describes an experiment in which two BDH models, each trained on a different language, were merged by simply concatenating their graphs. Without additional joint training, the combined model began producing output that coherently mixed vocabulary from both languages, which she presents as evidence that composability is a structural property of the architecture rather than an engineering workaround.
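
Concatenation along the neuron dimension can be sketched as follows. The example assumes each model is an explicit dictionary of edge weights over neuron indices 0..n-1, which may differ from how BDH actually parameterizes its graph; `concatenate_graphs` and `cross_edges` are hypothetical helpers, not part of any Pathway API.

```python
def concatenate_graphs(edges_a, size_a, edges_b, size_b):
    """Join two edge-weight dicts along the neuron dimension.

    Neurons of model B are renumbered to follow those of model A,
    so the merged graph has size_a + size_b neurons.
    """
    merged = dict(edges_a)
    for (pre, post), weight in edges_b.items():
        merged[(pre + size_a, post + size_a)] = weight
    return merged, size_a + size_b


def cross_edges(merged, size_a):
    """List edges that connect a neuron from A to a neuron from B."""
    return [
        (pre, post)
        for (pre, post) in merged
        if (pre < size_a) != (post < size_a)
    ]


# Two tiny stand-ins for separately trained single-language models.
model_a = {(0, 1): 0.8, (1, 2): 0.3}   # 3 neurons
model_b = {(0, 1): 0.5, (1, 0): 0.9}   # 2 neurons

merged, size = concatenate_graphs(model_a, 3, model_b, 2)
print(size, merged)
print("cross edges:", cross_edges(merged, size_a=3))
```

In this simplified representation the merged graph is block-structured: every original weight is preserved, and there are no edges between the two halves immediately after concatenation. Any cross-domain connections would have to come from later training, which this sketch does not model.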
