
AI Summary
→ WHAT IT COVERS

Nick Frosst, Cohere cofounder and former Google Brain researcher under Geoffrey Hinton, explains why Cohere focuses on enterprise AI rather than AGI. He discusses building capital-efficient models that require only two GPUs versus 16 or more for competitors, achieving 95% production deployment against the industry's 5%, and why transformer architectures remain dominant despite alternatives such as capsule networks and neuroevolution.

→ KEY INSIGHTS

- **Enterprise deployment efficiency:** Cohere's Command R reasoning model runs on two GPUs, compared with 16 for DeepSeek and more for other competitors, enabling private deployment in customer environments, including on-premise servers and virtual private clouds. This capital efficiency lets regulated customers such as Royal Bank of Canada use AI on proprietary data without sending information externally, solving the fundamental problem that the most valuable enterprise data cannot legally or strategically leave company infrastructure.

- **Production versus demo gap:** MIT research shows 95% of AI applications remain in the demo phase and never reach production, but Cohere reports the inverse ratio, with the vast majority of its deployments in production use. The reversal stems from focusing on cost-effective models with clear ROI rather than flashy consumer features: companies abandon pilots when inference costs exceed the value delivered, making efficiency the critical factor for enterprise adoption beyond the proof-of-concept stage.

- **AGI skepticism framework:** Frosst argues that transformers are artificial intelligence in the way planes are artificial flight: fundamentally different from biological intelligence rather than a replica of it. Planes cannot hover like hummingbirds or match an albatross's efficiency, yet they carry enormous weight at great speed.
Similarly, LLMs excel at document summarization and tool chaining but cannot understand cultural nuance or work autonomously like humans, which makes AGI through scaling transformers unlikely despite continued improvement on specific capabilities.

- **Evaluation methodology failure:** Standard benchmarks like ARC-AGI test pixel-matching reasoning games that no actual job requires, making them poor predictors of enterprise utility. Cohere recommends that companies create 10-20 examples of their specific use cases and test models directly on those tasks rather than relying on academic benchmarks. This targeted evaluation aligns model selection with actual deployment needs, whether summarizing weekly emails for executive reports or analyzing quarterly earnings across multiple data sources.

- **Model customization without consumer features:** Cohere trains foundational models from scratch on open web data, then refines them for enterprise reasoning, multimodal document analysis, and tool use, while deliberately excluding image generation. This focus saves parameters and reduces model size while improving performance on business-critical tasks such as parsing technical schematics, cross-referencing multiple data sources, and executing complex tool chains. The approach prioritizes ROI-generating capabilities over consumer engagement features that rarely justify their cost in business contexts.

- **Agentic workflow architecture:** Cohere defines agentic systems as models that receive a prompt, call one or more tools such as search or code execution, then iteratively call additional tools based on the results until they find an answer, rather than responding immediately. This loop enables complex tasks like analyzing emails, Slack messages, and Salesforce data to identify high-potential customers currently receiving minimal attention.
The framework proves particularly valuable for knowledge workers processing information across disparate systems, though Frosst rejects the notion of autonomous agent societies as conflating LLMs with AGI.

→ NOTABLE MOMENT

Frosst reveals that his technical disagreement with former mentor Geoffrey Hinton centers on whether neural networks are a sufficient component for AGI or merely a necessary but insufficient one. While respecting Hinton's focus on long-term governance as the field's inventor, Frosst maintains that Hinton's public warnings about existential AI threats confuse the public about timescales and feasibility rather than productively informing regulators and researchers about actual near-term challenges.

💼 SPONSORS

- Tastytrade (https://tastytrade.com)

🏷️ Enterprise AI, Transformer Architecture, Model Efficiency, AGI Debate, Agentic Systems, Private Deployment
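The agentic loop described under KEY INSIGHTS — prompt in, tool calls out, iterate on the results until an answer emerges — can be sketched minimally as below. Every name here (`call_model`, `ModelTurn`, the `TOOLS` registry) is a hypothetical illustration of the pattern, not Cohere's API.

```python
# Minimal sketch of an agentic tool loop: the model receives a prompt,
# may request a tool, and iterates on tool results until it can answer.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ModelTurn:
    text: str                      # model's reply, if it answered
    tool: Optional[str] = None     # tool the model wants to call, if any
    tool_input: str = ""           # argument for that tool

# Hypothetical tool registry: name -> function of one string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for {q!r}",
    "run_code": lambda src: f"output of {src!r}",
}

def call_model(prompt: str, history: list[str]) -> ModelTurn:
    """Stand-in for an LLM call; a real system would query the model here."""
    if not history:                          # first turn: ask for a search
        return ModelTurn(text="", tool="search", tool_input=prompt)
    return ModelTurn(text=f"Answer based on: {history[-1]}")

def agentic_loop(prompt: str, max_steps: int = 5) -> str:
    """Call the model, run any requested tool, feed results back, repeat."""
    history: list[str] = []
    for _ in range(max_steps):
        turn = call_model(prompt, history)
        if turn.tool is None:                # model answered directly
            return turn.text
        history.append(TOOLS[turn.tool](turn.tool_input))
    return "step budget exhausted"
```

The key design point is that the model, not the application, decides when to stop calling tools, which is what distinguishes this loop from a fixed prompt-response pipeline.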
