AI Summary
→ WHAT IT COVERS

Dr. Jeff Beck explores energy-based models, variational autoencoders, and the nature of agency in AI systems. The conversation covers geometric deep learning, Bayesian inference, self-supervised learning architectures such as JEPA, continual learning challenges, and the future of autonomous AI systems capable of scientific discovery and experimental design.

→ KEY INSIGHTS

- **Energy-Based Models vs. Neural Networks:** Energy-based models differ from traditional feedforward networks by applying cost functions to internal states, not just to inputs and outputs. This requires two minimizations: one finding the energetic minimum of the hidden nodes, and one reducing prediction error. Variational autoencoders exemplify this approach, with encoders, decoders, and cost functions operating on internal representations such as Gaussian distributions.
- **The Agency Identification Problem:** Determining whether a system exhibits true agency, rather than sophisticated policy execution, requires examining its internal computations, not just observing its behavior. From the outside, an agent performing Monte Carlo tree search and planning can look identical to a complex input-output function. The practical approach is to measure the sophistication of internal states, using metrics such as transfer entropy to assign degrees of agency.
- **Test-Time Training Limitations:** Current test-time training methods train a network in supervised mode, then activate additional weight adjustments during deployment. Beck argues this is unwise because the original network never learned with those latent variables active during training. Traditional energy-based models instead optimize latent variables throughout the entire training process, not just at deployment, which yields more robust learning.
- **Self-Supervised Learning Trade-offs:** Joint-embedding predictive architectures (JEPA) compress inputs and outputs into latent spaces and learn there, avoiding the need to predict at the pixel level.
The challenge is preventing mode collapse, where both embeddings become zero. Non-contrastive methods such as BYOL and Barlow Twins use various regularization techniques to keep representations rich while avoiding the expensive negative sampling that traditional contrastive approaches require.
- **Continual Learning Requirements:** True artificial intelligence requires systems that instantiate new objects or models when they encounter unexpected situations, rather than only learning from fixed training sets. This calls for Bayesian nonparametric approaches with Dirichlet process priors, which trigger learning when surprises occur. Object-centered physics discovery lets a system autonomously create brand-new object representations to explain novel situations, combining existing modules in new ways.

→ NOTABLE MOMENT

Beck challenges the assumption that physical embodiment defines agency, arguing that a high-fidelity computer simulation of himself would only become an agent if placed in his physical body. He maintains that agents must be physical entities, not just computational models, even when the simulated version performs identical calculations and exhibits behavior indistinguishable from outside observation.

💼 SPONSORS

- Mint Mobile (mintmobile.com/switch)
- Athletic Brewing Company (athleticbrewing.com)

🏷️ Energy-Based Models, Bayesian Inference, Continual Learning, Self-Supervised Learning, AI Agency
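The "two minimizations" in the energy-based models insight can be sketched with a toy linear decoder, where the latent inference step has a closed form. This is an illustrative sketch, not code from the episode; the names (`W`, `infer_z`), sizes, and learning rate are all assumptions.

```python
import numpy as np

# Toy energy-based model with a linear decoder: E(z, x) = ||x - W z||^2 + ||z||^2.
# For each datum we (1) infer the latent z at the energy minimum, then
# (2) nudge the decoder weights W to reduce prediction error.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))        # decoder: 2-d latent -> 4-d observation
X = rng.normal(size=(100, 4))      # toy dataset

def infer_z(W, x):
    # Minimization 1: argmin_z ||x - W z||^2 + ||z||^2 (closed-form ridge solution)
    return np.linalg.solve(W.T @ W + np.eye(W.shape[1]), W.T @ x)

def mean_energy(W, X):
    zs = [infer_z(W, x) for x in X]
    return float(np.mean([np.sum((x - W @ z) ** 2) + np.sum(z ** 2)
                          for x, z in zip(X, zs)]))

e_before = mean_energy(W, X)
for _ in range(200):
    for x in X:                    # Minimization 2: gradient steps on W
        z = infer_z(W, x)
        W += 0.01 * np.outer(x - W @ z, z)
e_after = mean_energy(W, X)
print(e_before, e_after)           # mean energy drops as the decoder improves
```

Note that the latent minimization runs throughout training here, not only at deployment, which is the contrast Beck draws with test-time training.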
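Transfer entropy, the metric mentioned for grading the sophistication of internal states, can be estimated for binary time series with a simple plug-in estimator. Both the estimator and the driving example below are illustrative assumptions, not material from the episode.

```python
import numpy as np
from collections import Counter

# Plug-in estimator of transfer entropy TE(source -> target):
# TE = sum p(t1, t0, s0) * log2[ p(t1 | t0, s0) / p(t1 | t0) ],
# i.e. how much the source's past improves prediction of the target's future
# beyond the target's own past.
def transfer_entropy(target, source):
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_ts = Counter(zip(target[:-1], source[:-1]))
    pairs_tt = Counter(zip(target[1:], target[:-1]))
    singles = Counter(target[:-1])
    n = len(target) - 1
    te = 0.0
    for (t1, t0, s0), count in triples.items():
        p_joint = count / n
        p_cond_full = count / pairs_ts[(t0, s0)]       # p(t1 | t0, s0)
        p_cond_self = pairs_tt[(t1, t0)] / singles[t0]  # p(t1 | t0)
        te += p_joint * np.log2(p_cond_full / p_cond_self)
    return te

rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=5000).tolist()
x = [0] + y[:-1]                 # x copies y with a one-step lag: y drives x
print(transfer_entropy(x, y))    # close to 1 bit: y strongly predicts x
print(transfer_entropy(y, x))    # close to 0: x adds nothing about y's future
```

The asymmetry of the two directions is what makes this usable as a graded, directional measure rather than a yes/no verdict on agency.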
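The non-contrastive regularization idea can be illustrated with the Barlow Twins redundancy-reduction loss: push the cross-correlation matrix of two views' embeddings toward the identity. The shapes and the lambda value below are illustrative assumptions.

```python
import numpy as np

# Barlow Twins-style loss: diagonal terms enforce invariance across views;
# off-diagonal terms keep embedding dimensions decorrelated, which blocks the
# all-zeros collapse without any negative samples.
def barlow_twins_loss(z_a, z_b, lam=5e-3):
    z_a = (z_a - z_a.mean(axis=0)) / z_a.std(axis=0)   # standardize per dimension
    z_b = (z_b - z_b.mean(axis=0)) / z_b.std(axis=0)
    n = z_a.shape[0]
    c = z_a.T @ z_b / n                                # d x d cross-correlation
    on_diag = np.sum((np.diag(c) - 1.0) ** 2)
    off_diag = np.sum((c - np.diag(np.diag(c))) ** 2)
    return on_diag + lam * off_diag

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
loss_aligned = barlow_twins_loss(z, z + 0.01 * rng.normal(size=z.shape))
loss_unrelated = barlow_twins_loss(z, rng.normal(size=z.shape))
print(loss_aligned, loss_unrelated)   # aligned views score far lower
```

BYOL avoids collapse differently (a momentum target network and a predictor head), but both sidestep the expensive negative sampling of contrastive methods.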
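The surprise-triggered model growth in the continual learning insight can be sketched in the spirit of a Dirichlet process mixture: each point joins its best existing component unless every component is too surprising, in which case a brand-new component is instantiated on the spot. A fixed distance threshold stands in for a proper marginal-likelihood test here; the function name and all numbers are illustrative assumptions.

```python
import numpy as np

# Online, growth-on-surprise clustering sketch (DP-mixture flavor):
# refine an existing component when a point fits, otherwise create a new one.
def grow_components(points, radius=1.0):
    centers, counts = [], []
    for p in points:
        if centers:
            dists = [np.linalg.norm(p - c) for c in centers]
            k = int(np.argmin(dists))
            if dists[k] < radius:                      # expected: refine old model
                counts[k] += 1
                centers[k] += (p - centers[k]) / counts[k]   # running mean update
                continue
        centers.append(np.array(p, dtype=float))       # surprise: new object model
        counts.append(1)
    return centers

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 0.1, size=(50, 2)),
                       rng.normal(5.0, 0.1, size=(50, 2))])
centers = grow_components(data)
print(len(centers))   # two components discovered, one per cluster
```

The point of the sketch is the control flow, not the clustering rule: learning is triggered by surprise, so the model's size is determined by the data rather than fixed in advance.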