Qwen3-Coder-Next: Efficient Agentic Coding Model

Hybrid Architecture Delivers Top Agentic Performance

Qwen3-Coder-Next builds on Qwen3-Next-80B-A3B-Base, which combines hybrid attention with a mixture-of-experts (MoE) design. Its coding and agentic abilities come from large-scale training on executable tasks, environment interaction, and reinforcement learning (RL). Available models include Qwen3-Coder-Next (instruct and base), Qwen3-Coder-480B-A35B-Instruct, and Qwen3-Coder-30B-A3B-Instruct, all with a native 256K-token context that can be extended to 1M tokens with YaRN. FP8 and GGUF variants reduce inference cost. On agentic coding, browser-use, and foundational tasks, the models are positioned as rivaling Claude Sonnet. They work with platforms such as Qwen Code, Cline, and Claude Code, and support custom function calling through the tool parsers in SGLang and vLLM. The updated tokenizer keeps special tokens consistent with Qwen3, and model output no longer contains <think></think> blocks.
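As a rough illustration of the function-calling path, the sketch below builds a chat request that declares one tool in the OpenAI-style "tools" schema, which is the request shape accepted by the OpenAI-compatible servers that vLLM and SGLang expose. It only constructs and prints the JSON payload; it does not contact a server. The tool name read_file, its parameters, and the model string are hypothetical placeholders, not names taken from the Qwen documentation.

```python
import json
from typing import Any, Dict, List


def make_tool(name: str, description: str,
              properties: Dict[str, Any], required: List[str]) -> Dict[str, Any]:
    """Wrap a function description in the OpenAI-style tool schema."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def build_chat_request(model: str, user_message: str,
                       tools: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble the JSON body for a chat completion request with tools."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
    }


# Hypothetical tool: the name and parameters are illustrative only.
read_file_tool = make_tool(
    name="read_file",
    description="Return the contents of a text file in the workspace.",
    properties={"path": {"type": "string", "description": "Relative file path"}},
    required=["path"],
)

# The model string is a placeholder; use whatever name your server registers.
request = build_chat_request(
    model="Qwen3-Coder-Next",
    user_message="Open README.md and summarize the install steps.",
    tools=[read_file_tool],
)

print(json.dumps(request, indent=2))
```

When the server is started with a matching tool parser, the model's reply would be expected to carry structured tool calls rather than free text; the exact server options depend on the serving framework and version, so they are left out here.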

Chat and Code Generation Quickstarts

Load the tokenizer and model with from_pretrained from transformers, render the conversation with apply_chat_template using add_generation_prompt=True so that the ChatML prompt ends with the assistant header (<|im_start|>assistant\n), call generate with max_new_tokens, and decode the output with batch_decode. Instruct models handle chat directly. For fill-in-the-middle (FIM) code insertion, put the code before the gap after <FIM_PREFIX>, the code after the gap after <FIM_SUFFIX>, and end the prompt with <FIM_MIDDLE>, following arXiv:2207.14255; the model then generates the missing middle. FIM is supported across all Qwen3-Coder versions.
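The two flows above can be sketched as follows. This is a minimal sketch under several assumptions: the checkpoint name in MODEL_NAME is a guess at the Hugging Face identifier, the literal FIM token spellings (<|fim_prefix|>, <|fim_suffix|>, <|fim_middle|>) are the lowercase forms used by Qwen coder tokenizers as far as I know and should be checked against the tokenizer you load, and device_map="auto" assumes the accelerate package is installed. The transformers import sits inside chat() so that the FIM helper can be used without loading a model.

```python
from typing import Dict, List

# Assumed checkpoint name; replace with the model you actually use.
MODEL_NAME = "Qwen/Qwen3-Coder-Next"

# Assumed token spellings; verify them in the tokenizer's special tokens.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix_code: str, suffix_code: str) -> str:
    """Lay out a fill-in-the-middle prompt: prefix, then suffix, then the
    middle marker after which the model writes the missing code."""
    return f"{FIM_PREFIX}{prefix_code}{FIM_SUFFIX}{suffix_code}{FIM_MIDDLE}"


def chat(messages: List[Dict[str, str]], max_new_tokens: int = 512) -> str:
    """Run one chat turn with an instruct model and return the reply text."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, torch_dtype="auto", device_map="auto"
    )

    # Render the conversation as ChatML text ending in the assistant header.
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)

    generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens produced after the prompt.
    new_tokens = generated[:, inputs["input_ids"].shape[1]:]
    return tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]


# FIM example: ask the model to fill in the body between the two fragments.
fim_prompt = build_fim_prompt(
    prefix_code="def add(a, b):\n    ",
    suffix_code="\n\nprint(add(1, 2))\n",
)
```

A chat call would look like chat([{"role": "user", "content": "Write a quicksort in Python."}]). For FIM, the prompt string is tokenized and passed to generate directly, without apply_chat_template, since it is a raw completion rather than a chat turn.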

Demos Showcase Autonomous Agents

The demos show Qwen3-Coder-Next acting as an autonomous agent. It builds complete websites (for example, a Qwen history page deployed on Alibaba Cloud with Nginx) and tidies a desktop through environment interaction. It implements Zombies vs. Plants, a reverse tower-defense game with a 5x9 grid, a 120 s timer, zombie types costing 50-150 brains, plants with HP and damage stats, and collision and particle effects. It creates a sound-enabled ASCII art drawer with mouse and touch input, a pattern switcher, and harmonious notes. It vibe-checks websites by clicking through them and reporting bugs. It renders particle systems of 800-1200 particles with cursor forces, an FPS counter, and requestAnimationFrame. The accompanying videos show end-to-end execution from natural-language prompts.