GLM 5.2: why I’m replacing Opus in Claude Code with this new model
Episode
27 min
Read time
2 min
Topics
Productivity, Fundraising & VC, Design & UX
AI-Generated Summary
Key Takeaways
- ✓Model Setup via OpenRouter: To run GLM 5.2 in Cursor, add your OpenRouter API key to the OpenAI key field, override the base URL with `openrouter.ai/api/v1/cursor` (the `/cursor` suffix is undocumented but required), then add `z-ai/glm-5.2` as a custom model. Claude Code requires editing `~/.zshrc` and `~/.claude/settings.json` to reroute all model calls.
- ✓Cost Efficiency at Scale: A 45-minute autonomous coding session consuming roughly 6 million tokens cost $3.36 on OpenRouter, with a 72% cache hit rate. Comparable tasks using Claude Opus 4.8 or GPT-4.5 would cost significantly more. For high-volume coding workflows, switching to GLM 5.2 via a third-party inference provider can reduce API spend substantially.
- ✓Benchmark Positioning: On SWE-Bench Pro, GLM 5.2 scores near GPT-4.5 and approaches Claude Opus 4.8, while outperforming Gemini 2.1 Pro. This places it firmly in frontier-model territory for coding tasks, making it a credible drop-in replacement for expensive proprietary models in agentic software engineering pipelines.
- ✓Agentic Task Performance: GLM 5.2 successfully ran a 45-minute autonomous session pulling 72 hours of Sentry errors and Vercel logs, generating a prioritized bug-fix plan with 14 fixes, 2 P0 issues, and suggested sequencing. It struggled with React/TypeScript mid-session but self-corrected, indicating it handles long-horizon tasks with occasional intervention needed.
- ✓Model Constraints to Know: GLM 5.2 is text-only — no image input or output — which limits multimodal workflows. It supports a 1-million-token context window, function calling, MCP tool use, structured output, streaming, and reasoning/thinking mode. For pure coding and text-based agentic tasks, these constraints rarely surface as blockers in practice.
What It Covers
GLM 5.2, an open-weight model from Beijing-based Z.ai, is tested as a replacement for Claude Opus 4.8 inside Claude Code and Cursor. The episode benchmarks its coding, design, and autonomous agent capabilities against frontier models, with total API costs tracked at $3.36 for 6 million tokens via OpenRouter.
Key Questions Answered
- •Model Setup via OpenRouter: To run GLM 5.2 in Cursor, add your OpenRouter API key to the OpenAI key field, override the base URL with `openrouter.ai/api/v1/cursor` (the `/cursor` suffix is undocumented but required), then add `z-ai/glm-5.2` as a custom model. Claude Code requires editing `~/.zshrc` and `~/.claude/settings.json` to reroute all model calls.
- •Cost Efficiency at Scale: A 45-minute autonomous coding session consuming roughly 6 million tokens cost $3.36 on OpenRouter, with a 72% cache hit rate. Comparable tasks using Claude Opus 4.8 or GPT-4.5 would cost significantly more. For high-volume coding workflows, switching to GLM 5.2 via a third-party inference provider can reduce API spend substantially.
- •Benchmark Positioning: On SWE-Bench Pro, GLM 5.2 scores near GPT-4.5 and approaches Claude Opus 4.8, while outperforming Gemini 2.1 Pro. This places it firmly in frontier-model territory for coding tasks, making it a credible drop-in replacement for expensive proprietary models in agentic software engineering pipelines.
- •Agentic Task Performance: GLM 5.2 successfully ran a 45-minute autonomous session pulling 72 hours of Sentry errors and Vercel logs, generating a prioritized bug-fix plan with 14 fixes, 2 P0 issues, and suggested sequencing. It struggled with React/TypeScript mid-session but self-corrected, indicating it handles long-horizon tasks with occasional intervention needed.
- •Model Constraints to Know: GLM 5.2 is text-only — no image input or output — which limits multimodal workflows. It supports a 1-million-token context window, function calling, MCP tool use, structured output, streaming, and reasoning/thinking mode. For pure coding and text-based agentic tasks, these constraints rarely surface as blockers in practice.
Notable Moment
During a live autonomous run, the model stalled repeatedly while writing React and TypeScript, prompting frustration. Then, without intervention beyond a verbal complaint to the recording, it recovered, compiled cleanly, and delivered a well-structured dark-mode bug prioritization dashboard — reversing the apparent failure entirely.
You just read a 3-minute summary of a 24-minute episode.
Get How I AI summarized like this every Monday — plus up to 2 more podcasts, free.
Pick Your Podcasts — FreeKeep Reading
More from How I AI
How Claude Mythos found a 15-year-old bug in Mozilla Firefox | Brian Grinstead
Jun 22 · 48 min
The AI Breakdown
Why AI Users Are Raving About GLM 5.2
Jun 22
More from How I AI
How to design AI agent loops: schedules, goals, and subagents in Claude Code and Codex
Jun 17 · 29 min
All-In with Chamath, Jason, Sacks & Friedberg
OpenAI Misses Targets, Codex vs Claude, Elon vs Sam Trial, Big Hyperscaler Beats, Peptide Craze
May 1
More from How I AI
We summarize every new episode. Want them in your inbox?
How Claude Mythos found a 15-year-old bug in Mozilla Firefox | Brian Grinstead
How to design AI agent loops: schedules, goals, and subagents in Claude Code and Codex
How Braintrust uses AI agents, evals, and CI to ship better software | Ankur Goyal
Claude Fable 5 review: what the new Mythos model gets right (and very wrong)
Shopping with Claude: How to find quality brands, automate returns, and buy things that last 100 years | Nicole Ruiz
Similar Episodes
Related episodes from other podcasts
The AI Breakdown
Jun 22
Why AI Users Are Raving About GLM 5.2
All-In with Chamath, Jason, Sacks & Friedberg
May 1
OpenAI Misses Targets, Codex vs Claude, Elon vs Sam Trial, Big Hyperscaler Beats, Peptide Craze
Moonshots with Peter Diamandis
Feb 9
Opus 4.6 Tops Benchmarks, ChatGPT Market Share Decline, and the Privacy Breakdown | EP 228
The Startup Ideas Podcast
Feb 6
Claude Opus 4.6 vs GPT-5.3 Codex: Live Build, Clear Winner
The AI Breakdown
Feb 6
Opus 4.6 and ChatGPT 5.3-Codex Are Here and the Labs Are at War
Explore Related Topics
This podcast is featured in Best AI Podcasts (2026) — ranked and reviewed with AI summaries.
You're clearly into How I AI.
Every Monday, we deliver AI summaries of the latest episodes from How I AI and 192+ other podcasts. Free for one show.
Start My Monday DigestNo credit card · Unsubscribe anytime