
This Week's Recap

1 episode · Apr 13 – Apr 19

Recent Episode Summaries

16 AI-powered summaries available

54 min episode · 3 min read

→ WHAT IT COVERS Rashmi Shetty, Senior Director of Enterprise Generative AI Platform at Capital One, explains how the company built and deployed Chat Concierge, a multi-agent car-buying system, and outlines the platform strategy enabling developers to build governed agentic systems at scale across the enterprise. → KEY INSIGHTS - **Multi-agent trigger criteria:** Deploy multi-agent architecture only when a problem contains multiple distinct user intents that cannot be resolved by a single...

63 min episode · 3 min read

→ WHAT IT COVERS Stefano Ermon, Stanford professor and Inception CEO, explains how diffusion language models work as an alternative to autoregressive LLMs, covering the technical path from image diffusion to text generation, Mercury 2's benchmark performance against frontier speed-optimized models, and why inference-time economics now favor the diffusion approach.

76 min episode · 3 min read

→ WHAT IT COVERS Blitzy CTO Siddhant Pardeshi explains how his company achieves autonomous software development at enterprise scale using agent swarms, knowledge graphs, and database-driven orchestration. The system writes millions of lines of validated, compiled, tested code autonomously, completing roughly 80% of development work in a single run across large production codebases.

78 min episode · 3 min read

→ WHAT IT COVERS Sebastian Raschka, independent LLM researcher, joins Sam Charrington to assess the LLM landscape in early 2026. They cover reasoning model advances, inference-time scaling techniques, the rise of agentic tools like OpenClaw, practical workflow automation using LLMs, and what to expect from post-training research through the rest of 2026. → KEY INSIGHTS - **Post-training vs.

66 min episode · 3 min read

→ WHAT IT COVERS Yejin Choi, professor at Stanford HAI, explores democratizing AI through small language models that match their larger counterparts. She details synthetic data generation techniques, reinforcement learning during pretraining, and pluralistic alignment approaches. The conversation examines mode collapse in LLMs, the artificial hive mind phenomenon, and how academic research can make powerful AI accessible beyond resource-rich tech companies.

66 min episode · 3 min read

→ WHAT IT COVERS Nikita Rudin, CEO of Flexion Robotics, explains the gap between robotics demos and real-world deployment, covering simulation-to-reality challenges, reinforcement learning techniques, and why no humanoid robot generates actual economic value as of 2025. → KEY INSIGHTS - **Sim-to-Real Gap:** Closing the simulation-to-reality gap requires deep understanding of both worlds, mapping every software layer from high-level commands down to motor currents.

52 min episode · 3 min read

→ WHAT IT COVERS Aakanksha Chowdhery from Reflection explains why pretraining language models specifically for agentic capabilities requires rethinking attention mechanisms, loss objectives, and training data composition beyond current post-training approaches that optimize static benchmarks. → KEY INSIGHTS - **Pretraining for agents:** Current models train on static benchmarks like GLUE or GSM8K, but agentic tasks require interactive environment capabilities.

57 min episode · 3 min read

→ WHAT IT COVERS Munawar Hayat from Qualcomm AI Research discusses three NeurIPS papers addressing critical failures in vision language models: why they ignore visual input, physics-based generation limitations, and multi-person image generation challenges with proposed solutions. → KEY INSIGHTS - **Vision Token Attention Failure:** Vision language models attend poorly to visual tokens despite having images as input.

48 min episode · 3 min read

→ WHAT IT COVERS Zain Asgar explains how Gimlet Labs optimizes AI inference costs through heterogeneous compute orchestration, using workload disaggregation, MLIR compilation, and LLM-generated kernel optimization across NVIDIA, AMD, and Intel hardware platforms. → KEY INSIGHTS - **Workload Disaggregation Strategy:** Gimlet splits agent workflows into granular components, assigns performance-critical pieces to premium hardware like B200s, and offloads less critical tasks to lower-cost...
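The disaggregation idea above can be sketched as a toy placement function. This is purely illustrative and not Gimlet's actual API: the task names and the "MI300X" budget tier are hypothetical, with only the B200 premium tier taken from the summary.

```python
# Illustrative sketch of workload disaggregation (hypothetical names,
# not Gimlet's API): latency-critical pieces of an agent workflow go
# to premium hardware, everything else to a cheaper tier.

def assign_hardware(tasks, premium="B200", budget="MI300X"):
    """Map each task to a hardware tier based on latency criticality."""
    return {t["name"]: premium if t["latency_critical"] else budget
            for t in tasks}

workflow = [
    {"name": "prefill", "latency_critical": True},
    {"name": "decode", "latency_critical": True},
    {"name": "embedding_refresh", "latency_critical": False},
]

placement = assign_hardware(workflow)
# placement == {"prefill": "B200", "decode": "B200",
#               "embedding_refresh": "MI300X"}
```

A real orchestrator would weigh throughput, memory footprint, and hardware availability rather than a single boolean flag, but the split-then-place structure is the same.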

56 min episode · 3 min read

→ WHAT IT COVERS Devi Parikh, co-founder of Yutori, explains how AI browser agents will replace manual web interactions through proactive monitoring and automation, starting with Scouts, their product that monitors websites for user-specified information changes. → KEY INSIGHTS - **Visual-based browser navigation:** Training models on website screenshots rather than DOM information proves more reliable and generalizable across different sites, solving challenges like date pickers that plagued...

54 min episode · 3 min read

→ WHAT IT COVERS Robin Braun from HPE and Luke Norris from Kamiwaza discuss deploying AI orchestration for smart city operations in Vail, Colorado, focusing on back-office automation, website accessibility compliance, and deed restriction management using private infrastructure. → KEY INSIGHTS - **Back-office automation priority:** Fortune 500 companies and municipalities achieve fastest ROI by automating finance, HR, and procurement workflows first rather than customer-facing chatbots,...

55 min episode · 3 min read

→ WHAT IT COVERS Carina Hong, founder of Axiom, explains building AI mathematicians through formal verification in the Lean programming language, combining auto-formalization, theorem proving, and self-play systems to achieve mathematical reasoning with provable guarantees. → KEY INSIGHTS - **Data Scarcity Challenge:** Formal math has only 10 million Lean tokens versus one trillion Python tokens, creating a 100,000x data gap that requires auto-formalization and synthetic generation to bridge...
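As a quick sanity check on the 100,000x figure in that insight, using the token counts the summary itself gives:

```python
# Token counts as stated in the episode summary.
lean_tokens = 10_000_000              # ~10 million Lean tokens
python_tokens = 1_000_000_000_000     # ~1 trillion Python tokens

gap = python_tokens // lean_tokens
print(gap)  # 100000, i.e. the 100,000x data gap
```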

52 min episode · 3 min read

→ WHAT IT COVERS Hung Bui explains how VinAI Research achieved efficient on-device AI by training smaller models that match larger model performance, developing one-step diffusion for real-time image generation, and building Vietnam's top AI research lab. → KEY INSIGHTS - **Model Size Reduction:** A sub-4-billion parameter Vietnamese language model outperformed the original 7-billion parameter version by iterating over the same dataset multiple times during training and applying minor...

72 min episode · 3 min read

→ WHAT IT COVERS Alexandre Pesant, AI lead at Lovable, discusses vibe coding's evolution from GPT Engineer, scaling challenges reaching $100M ARR in eight months, the technical architecture behind AI-assisted development, and why nontechnical users can learn software building skills. → KEY INSIGHTS - **Vibe Coding Progression:** Users achieve better results by planning in chat mode before implementation, thinking through sequencing and architecture upfront, knowing when to stop failed attempts...

57 min episode · 3 min read

→ WHAT IT COVERS Kunle Olukotun explains how SambaNova's reconfigurable dataflow architecture achieves 5-10x better performance per watt for AI inference by eliminating instruction fetching, maximizing memory bandwidth utilization, and enabling microsecond model switching across trillion-parameter systems. → KEY INSIGHTS - **Dataflow vs Instructions:** Reconfigurable dataflow architectures configure hardware to match PyTorch computation graphs rather than fetching instructions each cycle, using...

57 min episode · 3 min read

→ WHAT IT COVERS Jacob Buckman explains power retention architecture for transformers, combining recurrence and attention to achieve linear scaling for long context processing while maintaining computational efficiency through balanced weight-state FLOP ratios and chunked algorithms. → KEY INSIGHTS - **State Size Balance:** Transformers have states 100,000x larger than LSTMs at long context, while RNNs have states too small.
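The recurrence-plus-attention combination described above is easiest to see in the generic chunked linear-attention pattern: a small recurrent state carries history between chunks, while attention-style matrix products handle tokens within each chunk. This is a minimal sketch of that general pattern, not the power retention code itself.

```python
import numpy as np

def chunked_linear_attention(q, k, v, chunk=4):
    """Causal linear attention computed chunk by chunk.

    A d x d recurrent state S accumulates sum_t k_t v_t^T across chunks
    (linear in sequence length), while a masked q @ k^T product handles
    the causal interactions inside each chunk.
    """
    T, d = q.shape
    S = np.zeros((d, d))          # recurrent state carried between chunks
    out = np.zeros_like(v)
    for s in range(0, T, chunk):
        qc, kc, vc = q[s:s+chunk], k[s:s+chunk], v[s:s+chunk]
        # inter-chunk part: contribution of all earlier chunks via the state
        out[s:s+chunk] = qc @ S
        # intra-chunk part: causal attention within this chunk
        mask = np.tril(np.ones((len(qc), len(qc))))
        out[s:s+chunk] += (mask * (qc @ kc.T)) @ vc
        # fold this chunk's keys/values into the state
        S += kc.T @ vc
    return out
```

Tuning the chunk size is one way to trade attention-style FLOPs against state updates, which is the kind of weight-state FLOP balancing the summary alludes to.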

Monday morning, inbox, done.

Pick your shows, and start the week knowing what happened in your world.

1

Pick the Podcasts You Care About

Choose from 200+ curated shows or add any public RSS feed.

2

AI Reads Every New Episode

Key arguments, surprising data points, and frameworks worth stealing — pulled automatically.

3

One Email, Every Monday

A curated brief for each episode, with links to listen if something grabs you.

Explore More

Get a free sample digest

See what your Monday email looks like — real AI summaries, no account needed.

One free sample — no spam, no commitment.