
AI Summary
→ WHAT IT COVERS
Eric Zelikman discusses his AI research on reasoning and reinforcement learning at Stanford and xAI, then explains the mission of his new company, Humans&: to build models that understand human goals and collaborate effectively with people rather than replace them.

→ KEY INSIGHTS
- **STaR Algorithm Scaling:** The Self-Taught Reasoner (STaR) trains a model iteratively: the model generates solutions, learns only from the ones that reach correct answers, and progressively solves harder problems. N-digit multiplication experiments showed no obvious plateau as training iterations increased, suggesting that reasoning capability scales with this kind of training.
- **Model Intelligence Gaps:** Current models excel at closed-form, verifiable problems such as physics or math when given proper context, but fail to understand the long-term implications of their responses. They treat each conversation turn as independent, never asking clarifying questions or expressing uncertainty about user goals.
- **Task-Centric Training Limitations:** Benchmarks measure single-task performance, partly because that makes credit assignment between teams easy, rather than measuring how models affect people's lives over time. This paradigm prevents models from learning memory, proactive behavior, or how individual requests fit into a user's broader context and objectives.
- **Human-AI Collaboration Advantage:** Models that understand individual goals and coordinate with large groups of people will likely solve fundamental problems faster than autonomous AI working alone for extended periods. Empowering people to pursue their passions grows economic potential rather than simply replacing existing segments of GDP with automation.

→ NOTABLE MOMENT
Zelikman reveals that Google researchers told him task-centric benchmarks persist partly because they enable resource allocation between teams based on percentage improvements, not because they measure what actually matters for helping users accomplish meaningful goals over time.
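
The STaR loop described under Key Insights can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `ToyModel`, its `generate` and `finetune` methods, and the rule that mastering n-digit problems unlocks (n+1)-digit ones are hypothetical stand-ins for a real language model and real fine-tuning, and the sketch omits parts of the published method. It only shows the shape of the idea: generate attempts, keep the verified-correct ones, train on them, repeat.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    # Hypothetical stand-in for a language model: it multiplies numbers
    # of up to `max_digits` digits correctly and fails on anything longer.
    max_digits: int = 1
    seen: list = field(default_factory=list)

    def generate(self, a, b):
        """Return (rationale, answer); the answer is wrong if the problem is too hard."""
        digits = max(len(str(a)), len(str(b)))
        if digits <= self.max_digits:
            return f"{a} * {b} worked out step by step", a * b
        return "guess", -1  # deliberately incorrect answer

    def finetune(self, examples):
        """Pretend fine-tuning (assumption): solved n-digit problems unlock n+1 digits."""
        self.seen.extend(examples)
        if examples:
            hardest = max(max(len(str(a)), len(str(b))) for a, b, _ in examples)
            self.max_digits = max(self.max_digits, hardest + 1)


def star(model, problems, iterations):
    """Run a STaR-style outer loop and return how many problems were solved per iteration."""
    solved_per_iteration = []
    for _ in range(iterations):
        kept = []
        for a, b in problems:
            rationale, answer = model.generate(a, b)
            if answer == a * b:  # learn only from attempts with a correct answer
                kept.append((a, b, rationale))
        model.finetune(kept)
        solved_per_iteration.append(len(kept))
    return solved_per_iteration


problems = [(3, 4), (12, 34), (123, 456), (1234, 5678)]
history = star(ToyModel(), problems, iterations=4)
print(history)
```

In this toy setup the number of solved problems rises with each iteration because every round of filtered training extends what the model can handle, which mirrors the "progressively solving harder problems" behavior in the summary without claiming anything about real training dynamics.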

💼 SPONSORS
None detected

🏷️ Reinforcement Learning, AI Reasoning, Human-AI Collaboration, Model Training