GLM-5 Leads Open-Source in Coding, Reasoning, Agents
GLM-5 scales to 744B parameters (40B active) and 28.5T training tokens, tops open-source benchmarks such as SWE-bench Verified (77.8%) and Vending Bench 2 ($4,432 final balance), and enables complex engineering and long-horizon agents while cutting deployment costs via DeepSeek Sparse Attention (DSA).
Scale Pre-Training and RL for Superior Competence
GLM-5 advances from GLM-4.5's 355B parameters (32B active) and 23T tokens to 744B parameters (40B active) and 28.5T pre-training tokens, using DeepSeek Sparse Attention (DSA) to maintain long-context capacity at lower deployment cost. Post-training leverages slime, an asynchronous RL infrastructure (github.com/THUDM/slime), to raise throughput for fine-grained iteration, bridging pre-trained competence to excellence without RL's typical inefficiency.
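The core idea behind sparse attention schemes like DSA is that each query attends only to a small top-k subset of tokens rather than the full context. A minimal pure-Python sketch of top-k token selection (illustrative only; the production DSA design differs in how candidates are scored and selected):

```python
import math

def sparse_attention(query, keys, values, k):
    """Attend only to the top-k highest-scoring keys, skipping the rest.

    query: list[float]; keys, values: list[list[float]].
    Cuts per-query attention cost from O(n) to roughly O(k).
    """
    d = len(query)
    # Scaled dot-product score against every key.
    scores = [sum(q * x for q, x in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # Keep only the k highest-scoring positions.
    topk = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only (max-subtracted for stability).
    m = max(scores[i] for i in topk)
    exps = {i: math.exp(scores[i] - m) for i in topk}
    z = sum(exps.values())
    # Weighted sum of the selected values.
    return [sum(exps[i] / z * values[i][j] for i in topk)
            for j in range(len(values[0]))]

# The third key scores lowest and is dropped entirely from the mix.
out = sparse_attention([1.0, 0.0],
                       [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]],
                       [[1.0], [2.0], [3.0]],
                       k=2)
```

Here the output blends only the first two values; the excluded token contributes nothing, which is what keeps long-context inference cheap.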
This training recipe yields best-in-class open-source results: Humanity's Last Exam (30.5; 50.4 with tools), AIME 2026 I (92.7%), HMMT Nov 2025 (96.9%), and GPQA-Diamond (86.0%). Coding excels on SWE-bench Verified (77.8%), SWE-bench Multilingual (73.3%), Terminal-Bench 2.0 (56.2; 60.7 verified), and CyberGym (43.2%). Agent benchmarks include BrowseComp (62.0; 75.9 with context management), τ²-Bench (89.7%), MCP-Atlas (67.8%), and Tool-Decathlon (39.2), approaching Claude Opus 4.5 and Gemini 3.0 Pro.
On the internal CC-Bench-V2 suite, GLM-5 outperforms GLM-4.7 on frontend, backend, and long-horizon tasks, closing the gap to Claude Opus 4.5. Vending Bench 2 simulates a year of vending-machine operations; GLM-5 ends with a $4,432 balance (#1 among open-source models, near Claude's $4,967).
Agentic Engineering for Real Deliverables
GLM-5 shifts from 'vibe coding' to agentic systems engineering, handling complex, multi-turn tasks such as generating production-ready .docx/.pdf/.xlsx files (PRDs, spreadsheets, reports) from prompts. For example, a student-council football sponsorship proposal includes structured sections (introduction, event details, use of funds, sponsorship tiers, benefits), visual elements (image placeholders, tables), and a navy/gold color palette, yielding print-ready output without manual edits.
Z.ai's Agent mode supports PDF/Word/Excel creation and multi-turn collaboration on deliverables. It outperforms prior models on long-horizon planning and resource management, as in Vending Bench 2's operational simulation.
Deploy Anywhere, Integrate Seamlessly
Open-source (MIT license) on HuggingFace (zai-org/GLM-5), ModelScope, and GitHub (zai-org/GLM-5). API access via api.z.ai, BigModel.cn, and Z.ai's chat and agent modes. Local inference runs on vLLM or SGLang; non-NVIDIA chips (Ascend, Moore Threads) are supported via quantization.
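A minimal local-serving launch fragment with vLLM, assuming vLLM is installed and the weights fit your hardware; the parallelism flag is illustrative and depends on your GPU setup:

```shell
# Serve GLM-5 behind an OpenAI-compatible endpoint (port 8000 by default).
vllm serve zai-org/GLM-5 --tensor-parallel-size 8

# From another shell: query the endpoint.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "zai-org/GLM-5", "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the endpoint is OpenAI-compatible, existing client libraries and coding agents can point at it by swapping the base URL.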
Coding agents: Claude Code and OpenClaw (update the model setting to 'GLM-5'; expect higher quota consumption), plus OpenCode, Kilo, Roo, Cline, and Droid. Z Code IDE enables multi-agent collaboration. Rollout to GLM Coding Plan subscribers is gradual; a free trial is available on Z.ai.