GLM-5 Leads Open-Source in Coding, Reasoning, Agents

GLM-5 scales to 744B parameters (40B active) trained on 28.5T tokens. It tops open-source benchmarks such as SWE-bench Verified (77.8%) and Vending Bench 2 ($4,432 final balance), enabling complex engineering and long-horizon agent workloads while cutting deployment costs via DeepSeek Sparse Attention (DSA).

Scale Pre-Training and RL for Superior Competence

GLM-5 advances from GLM-4.5's 355B parameters (32B active) and 23T tokens to 744B parameters (40B active) and 28.5T pre-training tokens, using DeepSeek Sparse Attention (DSA) to preserve long-context capacity at lower deployment cost. Post-training uses slime, an asynchronous RL infrastructure (github.com/THUDM/slime), whose higher throughput enables the fine-grained RL iterations needed to carry pre-trained competence through to task-level excellence without RL's typical inefficiency.
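The core idea behind sparse attention is that each query attends to only a small, selected subset of keys rather than the full context. The sketch below is illustrative only, assuming a simple top-k selection over scaled dot-product scores; it is not GLM-5's or DSA's actual kernel, which uses a cheap learned indexer to pick positions before the attention pass.

```python
import math

def sparse_attention(q, keys, values, k):
    """Toy top-k sparse attention for a single query vector.

    Scores all keys, keeps only the k highest-scoring positions,
    and softmax-weights the corresponding values. In DSA-style
    designs a lightweight indexer does the selection; here the
    scores themselves stand in for it.
    """
    dim = len(q)
    # Scaled dot-product score against every key.
    scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(dim)
              for key in keys]
    # Keep only the top-k key positions (the "sparse" part).
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only.
    m = max(scores[i] for i in top)
    exps = {i: math.exp(scores[i] - m) for i in top}
    z = sum(exps.values())
    weights = {i: e / z for i, e in exps.items()}
    # Weighted sum of the selected values.
    out = [sum(weights[i] * values[i][d] for i in top)
           for d in range(len(values[0]))]
    return out, sorted(top)

# With k much smaller than the context length, per-query cost after
# the scoring pass drops from O(n) value reads to O(k).
out, kept = sparse_attention(
    q=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]],
    values=[[1.0], [2.0], [3.0], [4.0]],
    k=2,
)
```

Here the two keys most aligned with the query are kept and the orthogonal or opposing keys are dropped entirely, which is what lets a sparse layer hold long contexts without paying full attention cost.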

This yields best-in-class open-source results on reasoning: Humanity's Last Exam (30.5; 50.4 with tools), AIME 2026 I (92.7%), HMMT Nov 2025 (96.9%), and GPQA-Diamond (86.0%). Coding is equally strong: SWE-bench Verified (77.8%), SWE-bench Multilingual (73.3%), Terminal-Bench 2.0 (56.2; 60.7 verified), and CyberGym (43.2%). On agent benchmarks it reaches BrowseComp 62.0 (75.9 with context management), τ²-Bench 89.7%, MCP-Atlas 67.8%, and Tool-Decathlon 39.2, approaching Claude Opus 4.5 and Gemini 3.0 Pro.

On CC-Bench-V2, an internal evaluation suite, GLM-5 outperforms GLM-4.7 on frontend, backend, and long-horizon tasks, closing the gap to Claude Opus 4.5. Vending Bench 2 simulates a year of vending-machine operations; GLM-5 finishes with a $4,432 balance, first among open-source models and near Claude's $4,967.

Agentic Engineering for Real Deliverables

GLM-5 shifts from 'vibe coding' to agentic systems engineering, handling complex multi-turn tasks such as generating production-ready .docx/.pdf/.xlsx deliverables (PRDs, spreadsheets, reports) from prompts. For example, a student-council football sponsorship proposal comes back with structured sections (introduction, event details, use of funds, sponsorship tiers, benefits), visual elements (image placeholders, tables), and a navy/gold color palette, print-ready without further edits.

Z.ai's Agent mode supports PDF/Word/Excel creation and multi-turn collaboration on deliverables. GLM-5 outperforms its predecessors on long-horizon planning and resource management, as demonstrated in Vending Bench 2's operational simulation.

Deploy Anywhere, Integrate Seamlessly

GLM-5 is open-source under the MIT license on Hugging Face (zai-org/GLM-5), ModelScope, and GitHub (zai-org/GLM-5). API access is available via api.z.ai, BigModel.cn, and Z.ai's chat and agent modes. For local inference, vLLM and SGLang are supported, and quantized builds run on non-NVIDIA chips such as Ascend and Moore Threads.
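A minimal sketch of calling the hosted model over HTTP, assuming api.z.ai exposes an OpenAI-style chat-completions endpoint; the path, model identifier, and payload shape below follow that common convention and should be checked against Z.ai's API reference:

```shell
# Assumed OpenAI-compatible endpoint; verify the exact path and
# model name in Z.ai's API documentation before use.
curl https://api.z.ai/api/paas/v4/chat/completions \
  -H "Authorization: Bearer $ZAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5",
    "messages": [
      {"role": "user", "content": "Summarize sparse attention in two sentences."}
    ]
  }'
```

Because the endpoint is OpenAI-compatible in this sketch, existing OpenAI client libraries can typically be pointed at it by overriding the base URL and API key.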

Coding agents: Claude Code and OpenClaw (update the model setting to 'GLM-5'; note higher quota usage), plus OpenCode, Kilo, Roo, Cline, and Droid. Z Code IDE supports multi-agent collaboration. Access is rolling out gradually to GLM Coding Plan subscribers, with a free trial on Z.ai.
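One common way to point an Anthropic-compatible agent such as Claude Code at a third-party model is via environment variables; the base URL below is an assumption modeled on Z.ai's Anthropic-compatible endpoint for earlier GLM models, so confirm both it and the model identifier in the GLM Coding Plan docs:

```shell
# Assumed proxy endpoint and token variable names; check Z.ai's
# coding-plan documentation for the authoritative values.
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-zai-api-key"
# Then select 'GLM-5' as the model inside the agent session.
```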

Summarized by x-ai/grok-4.1-fast via openrouter


© 2026 Edge