AI Summary
→ WHAT IT COVERS
Aakanksha Chowdhery of Reflection explains why pretraining language models specifically for agentic capabilities requires rethinking attention mechanisms, loss objectives, and training data composition, beyond current post-training approaches that optimize for static benchmarks.

→ KEY INSIGHTS
- **Pretraining for agents:** Current models train against static benchmarks like GLUE or GSM8K, but agentic tasks require interacting with environments. Achieving multistep reasoning and tool use means fundamentally changing pretraining itself (attention mechanisms, loss objectives, and data composition) rather than relying on post-training fixes.
- **Long-context reasoning:** Models need attention mechanisms that can reason over millions of tokens while retaining retrieval and synthesis capabilities. Even with long context windows, current transformers struggle on multi-hop reasoning benchmarks like MRCR v2 and LOFT, so agent workflows require architectural modifications.
- **Training data augmentation:** Dominant pretraining sources such as internet articles must be augmented with reasoning traces at comparable token volumes. Masking specific portions during training, similar to fill-in-the-middle for code models, teaches models which tools to use and how to plan across multiple steps (see the first sketch after this summary).
- **Failure recovery:** Models must learn to recognize failed trajectory steps in their context and choose a different action space rather than repeating probabilistic mistakes. This requires both reinforcement learning objectives and pretraining formats that help models notice and correct previous errors during multistep problem solving (the second sketch below illustrates one such format).

→ NOTABLE MOMENT
Chowdhery recalls that halfway through training PaLM, the team tested it on a crowdsourced reasoning benchmark and discovered a sudden step change in performance, indicating emergent reasoning capabilities that would not have been detected without evaluation tasks beyond the standard metrics.

💼 SPONSORS
- Capital One
- Google (https://ai.studio/build)

🏷️ Pretraining Methods, Agentic AI, Long Context Reasoning, Model Benchmarking
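
To make the data-augmentation point concrete, here is a minimal sketch of fill-in-the-middle (FIM) style masking applied to an agent reasoning trace. The sentinel strings, the sample trace, and the `search_api` tool name are illustrative assumptions, not Reflection's actual format; the talk only states that FIM-like masking can teach tool choice and planning.

```python
# Minimal sketch: FIM-style masking of one step in a multistep agent trace.
# Sentinel tokens and the sample trace below are hypothetical.
import random

FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<|fim_pre|>", "<|fim_suf|>", "<|fim_mid|>"

def make_fim_example(trace_steps, rng=random):
    """Hide one step of a trace; the model must reconstruct it.

    trace_steps: list of strings, each one step (thought, tool call, result).
    Returns a single training string in prefix/suffix/middle order.
    """
    i = rng.randrange(len(trace_steps))        # choose a step to mask out
    prefix = "\n".join(trace_steps[:i])
    middle = trace_steps[i]                    # e.g. the tool call to predict
    suffix = "\n".join(trace_steps[i + 1:])
    # Standard FIM reordering: show prefix and suffix, train on the middle.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

trace = [
    "Thought: I need the current stock price before computing the ratio.",
    "Action: search_api(query='ACME stock price')",   # hypothetical tool
    "Observation: ACME trades at $42.",
    "Thought: Now divide earnings by price.",
]
print(make_fim_example(trace))
```

Masking the action step, as here, trains the model to infer which tool fits the surrounding plan; masking a thought step instead would target planning.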
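
And for the failure-recovery point, a second sketch showing one plausible pretraining-style trajectory format in which a failed step is surfaced explicitly and followed by a corrected action, so the model can learn to notice errors in context and switch action spaces instead of retrying. The role names, tools, and error text are assumptions for illustration.

```python
# Minimal sketch: a trajectory that exposes a failure and its correction.
# All field names, tools, and messages below are hypothetical.
failure_recovery_example = [
    {"role": "thought", "text": "Try fetching the page with a direct request."},
    {"role": "action", "text": "http_get('https://example.com/data')"},
    {"role": "observation", "text": "Error 403: request blocked."},
    # Recovery step: acknowledge the failure and pick a different action
    # space rather than repeating the same call.
    {"role": "thought", "text": "Direct fetch failed; switch to the search tool."},
    {"role": "action", "text": "search_api('example.com data mirror')"},
]

def render(trajectory):
    """Flatten the structured trajectory into plain text for pretraining."""
    return "\n".join(f"{s['role'].capitalize()}: {s['text']}" for s in trajectory)

print(render(failure_recovery_example))
```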