Eye on AI

#326 Zuzanna Stamirowska: Inside Pathway's AI Systems That Work with Live, Real-Time Data

67 min episode · 2 min read

Topics

Artificial Intelligence, Science & Discovery

AI-Generated Summary

Key Takeaways

  • Memory at inference, not just training: BDH updates its fast weights continuously during inference, meaning the model retains context across a session rather than resetting each time. This directly addresses the core LLM limitation where every session starts from scratch — analogous to an employee who never accumulates experience beyond their first day on the job. A minimal code sketch of this update pattern follows this list.
  • State lives on edges, not nodes: Unlike transformers, where knowledge is encoded in node weights, BDH stores state on synaptic edges (fast weights) while nodes act purely as computational units. This duality, drawn from quantum physics principles, allows individual synapses to represent specific concepts — the paper demonstrates a single synapse activating consistently for the concept of currency.
  • Sparse local dynamics reduce compute: BDH uses a graph topology where neurons connect only to relevant neighbors, not all-to-all as in transformers. Attention scales linearly with neuron count n rather than quadratically. At inference, only a small local subgraph activates per step, meaning a model with a massive state accesses only a fraction of it — potentially cutting reasoning compute by 10x per output token.
  • Model merging via graph concatenation: Two separately trained BDH graphs can be joined along the shared neuron dimension n, then fine-tuned together to form cross-domain connections. A paper experiment merges two single-language models, producing coherent mixed-language output. This composability enables domain fusion — combining, for example, a finance-trained and a law-trained model into one integrated reasoning system.
  • Target use cases: small data and long-horizon reasoning: BDH's first commercial applications focus on high-value, data-scarce domains such as nuclear engineering documentation and healthcare claims resolution. The architecture's interpretability advantage — visible synapse activation patterns — supports regulated industries requiring explainability. AWS customers gain access through a Pathway-NVIDIA-AWS partnership announced at AWS re:Invent in December 2024.
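
The sketch below makes the first and third takeaways concrete. It is not Pathway's or the BDH paper's code; the neuron count, Hebbian rate, decay factor, and top-k cutoff are hypothetical placeholders. It only shows the general pattern: slow weights frozen after training, edge state ("fast weights") updated during inference, and a small set of active neurons per step.

    import numpy as np

    # Generic fast-weight sketch (illustrative, not BDH's actual update rule).
    rng = np.random.default_rng(0)
    n = 512                                        # number of neurons (hypothetical)
    slow_W = rng.normal(scale=0.05, size=(n, n))   # trained weights, frozen at inference
    fast_W = np.zeros((n, n))                      # per-session state stored on edges
    hebbian_rate, decay, k = 0.1, 0.99, 16         # placeholder hyperparameters

    def step(x):
        """Process one input vector x, updating edge state in place."""
        global fast_W
        # Activation combines the frozen weights with the accumulated edge state.
        post = np.maximum(slow_W @ x + fast_W @ x, 0.0)
        # Keep only the k most active neurons: a stand-in for the small local
        # subgraph that fires at each step, so each update touches few edges.
        mask = np.zeros(n)
        mask[np.argsort(post)[-k:]] = 1.0
        post = post * mask
        # Hebbian-style outer-product update: co-active (input, output) pairs
        # strengthen their edge; older state decays slowly.
        fast_W = decay * fast_W + hebbian_rate * np.outer(post, x)
        return post

    # Earlier inputs now shape how later inputs are processed within a session.
    h1 = step(rng.normal(size=n))
    h2 = step(rng.normal(size=n))

Because the session state lives in fast_W rather than in a context window that resets with every prompt, nothing here restarts from scratch between steps — the property the episode highlights.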

What It Covers

Pathway cofounder Zuzanna Stamirowska presents the Dragon Hatchling (BDH) architecture, a post-transformer neural network modeled on brain-like graph dynamics. The system stores state on edges rather than nodes, maintains persistent memory at inference time, and trains comparably to GPT-2, while targeting enterprise reasoning tasks that involve scarce data and demand long-horizon coherence.

Key Questions Answered

  • Scale-free network structure emerges organically: During training, BDH's connection degree distribution converges toward a scale-free topology without any top-down design. This structure, common in organic complex systems like the internet and social networks, provides resilience, efficient communicability, and self-similar properties across network sizes — properties the team argues will support generalization and reasoning beyond what fixed transformer architectures can achieve.
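
A scale-free topology is usually diagnosed from the degree distribution: a few hub neurons with many connections and a long tail of sparsely connected ones, giving a roughly straight line on a log-log plot. The sketch below is only an illustrative check of that property on an arbitrary connectivity matrix; it is not how the BDH team measured it.

    import numpy as np

    def degree_tail_slope(adjacency, threshold=0.0):
        """Rough heavy-tail check: slope of the degree survival function in log-log space."""
        degrees = (np.abs(adjacency) > threshold).sum(axis=1)
        degrees = np.sort(degrees[degrees > 0])
        # Empirical survival function P(degree >= d); a near-linear log-log
        # relationship is the signature of a scale-free (power-law) topology.
        survival = 1.0 - np.arange(len(degrees)) / len(degrees)
        slope, _ = np.polyfit(np.log(degrees), np.log(survival + 1e-12), 1)
        return slope

    # Toy input: a random sparse matrix standing in for learned connectivity.
    rng = np.random.default_rng(0)
    A = rng.random((1000, 1000)) < 0.01
    print(degree_tail_slope(A))   # a uniform random graph will not show a heavy tail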

Notable Moment

Stamirowska describes an experiment where two BDH models, each trained on a different language, were merged by simply concatenating their graphs. Without additional joint training, the combined model began producing output that mixed vocabulary from both languages in a coherent way — demonstrating that architectural composability is a structural property, not an engineering workaround.
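
One way to picture "concatenating their graphs" in weight-matrix terms, assuming each model's connectivity is an n x n matrix: the two models are placed on the diagonal of a larger matrix over the shared neuron dimension. The names, sizes, and zero-initialized cross blocks below are illustrative, not taken from the paper.

    import numpy as np

    # Illustrative merge of two separately trained weight "graphs" along the
    # shared neuron dimension n (hypothetical sizes and initialization).
    rng = np.random.default_rng(0)
    n = 256
    W_a = rng.normal(scale=0.05, size=(n, n))   # e.g. trained on language A
    W_b = rng.normal(scale=0.05, size=(n, n))   # e.g. trained on language B

    # Block-diagonal concatenation: the merged graph has 2n neurons.
    # Within-model edges are preserved; cross-model edges start empty and can
    # later be populated, e.g. by joint fine-tuning or inference-time updates.
    zero = np.zeros((n, n))
    W_merged = np.block([[W_a, zero],
                         [zero, W_b]])

    print(W_merged.shape)   # (512, 512)

Because the two blocks share nothing but the neuron dimension, the merge itself is a purely structural operation — which is what the episode means by composability being a property of the architecture rather than an engineering workaround.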
