
Vishal Misra



What's Missing Between LLMs and AGI - Vishal Misra & Martin Casado

a16z Podcast
48 min
Professor and Vice Dean of Computing and AI at Columbia University

AI Summary

→ WHAT IT COVERS

Columbia University professor Vishal Misra presents mathematical proof that transformers perform precise Bayesian inference, matching the theoretically correct posteriors to within 10⁻³ bits. He argues that two unsolved problems separate current LLMs from genuine artificial general intelligence: continual-learning plasticity and the move from correlation to causation.

→ KEY INSIGHTS

- **Bayesian Wind Tunnel methodology:** To show that LLMs perform true Bayesian inference rather than superficial pattern matching, Misra's team ran controlled experiments using blank architectures trained on tasks that are mathematically impossible to memorize. Transformers matched the analytically calculated Bayesian posterior to within 10⁻³ bits. Mamba performed nearly as well, LSTMs only partially, and MLPs failed entirely. Architecture, not training data, determines this capability.
- **The Frozen Weights Problem:** LLMs perform Bayesian updating within a conversation but reset completely when a new session begins, because weights are frozen after training. Human brains maintain synaptic plasticity throughout life, continuously updating from experience. Before machine plasticity becomes viable, continual-learning research must solve catastrophic forgetting: updating weights on new information without erasing previously learned knowledge.
- **Shannon Entropy vs. Kolmogorov Complexity:** LLMs operate in the Shannon-entropy regime, learning correlations across all available data. Human reasoning operates closer to Kolmogorov complexity, finding the shortest causal program that explains the observations. Einstein's field equation (Gμν = 8πTμν) is a minimal representation that explains Mercury's orbit, gravitational lensing, and GPS corrections simultaneously. LLMs cannot generate equivalent new representations.
- **The Einstein AGI Test:** A concrete benchmark for AGI: train an LLM exclusively on pre-1911 physics data and determine whether it independently derives the theory of relativity. Current models would fail because they are bound to existing data manifolds and cannot construct new causal representations that reconcile anomalous observations, such as the Michelson–Morley result, with Newtonian mechanics.
- **Causation vs. Correlation as the Core Gap:** Deep learning performs association, the first tier of Judea Pearl's causal hierarchy. It does not perform intervention or counterfactual reasoning, both of which require internal simulation models. When a person dodges a thrown object, the brain runs a causal simulation, not a probability calculation. The necessary research direction is building architectures capable of causal modeling, not scaling existing ones.

→ NOTABLE MOMENT

Misra describes Donald Knuth's viral Hamiltonian-cycle result as validation of LLM limits rather than evidence of emerging generality: the models exhausted their search space and stalled, while Knuth himself constructed the novel mathematical proof, demonstrating that humans still supply the causal-reasoning layer.

💼 SPONSORS: None detected

🏷️ Large Language Models, Bayesian Inference, Artificial General Intelligence, Causal Reasoning, Continual Learning
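The "Bayesian Wind Tunnel" comparison can be sketched in miniature. The tasks and architectures in Misra's actual experiments are not described here, so the following uses a hypothetical Beta-Bernoulli setup: compute the exact Bayesian posterior predictive analytically, then measure how far a model's prediction deviates from it in bits, the same kind of bit-level gap the summary quotes.

```python
import math

def posterior_predictive(heads: int, tails: int, a: float = 1.0, b: float = 1.0) -> float:
    """Exact P(next = heads | data) under a Beta(a, b) prior (Laplace's rule of succession)."""
    return (heads + a) / (heads + tails + a + b)

def kl_bits(p: float, q: float) -> float:
    """KL divergence, in bits, between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

# Analytically correct posterior predictive after observing 7 heads, 3 tails.
exact = posterior_predictive(heads=7, tails=3)   # (7+1)/(10+2) = 2/3
# Hypothetical output from a trained model on the same context.
model = 0.665
gap = kl_bits(exact, model)
print(f"exact={exact:.4f} model={model:.4f} gap={gap:.2e} bits")
```

A gap below 10⁻³ bits on every context, as reported for transformers, would mean the model's predictive distribution is essentially indistinguishable from the exact Bayesian one.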
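The Shannon-vs-Kolmogorov distinction above can also be made concrete. This sketch uses compressed length as a rough stand-in for Kolmogorov complexity (an illustrative analogy, not anything from the episode): a highly patterned string has large total Shannon entropy under symbol frequencies alone, yet a very short "generating program".

```python
import math
import zlib
from collections import Counter

def shannon_bits(s: str) -> float:
    """Total Shannon entropy (bits) of s under its empirical symbol frequencies."""
    counts = Counter(s)
    n = len(s)
    return -sum(c * math.log2(c / n) for c in counts.values())

patterned = "abcd" * 250                      # 1000 chars, trivially describable
print(shannon_bits(patterned))                # 2 bits/symbol × 1000 symbols = 2000 bits
print(len(zlib.compress(patterned.encode())))  # far smaller: the pattern is a short "program"
```

The frequency-based entropy sees only correlations, while the compressor exploits the short causal description, which is the flavor of gap the summary attributes to LLMs versus human reasoning.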
