Nemotron 3 Super Delivers Frontier Agentic Capabilities at Lower Inference Cost

Nemotron 3 Super is an open mixture-of-experts (MoE) hybrid Mamba-Transformer model with 120 billion total parameters but only 12 billion active per token, optimized for agentic reasoning, coding, tool use, terminal workflows, and long-context tasks. It matches or beats other open frontier models while achieving up to 2.2x higher throughput than GPT-OSS 120B and 7.5x higher than Qwen 3 52B in Nvidia's tests, making it viable for production without prohibitive costs. Weights, training recipes, and post-training data are fully open, enabling customization. For coding agents, force non-MPY content in requests to avoid tool-calling issues with empty assistant messages; enable reasoning by default for planning/debugging but disable for faster simple edits.

Plug into OpenAI-Compatible Tools for Repo and Workflow Tasks

Access Nemotron 3 Super via Nvidia Build's free-to-try API at build.nvidia.com/nvidia/nemotron-3-super-120b-a12b (base URL: integrate.nvidia.com/v1). Its OpenAI-compatible endpoint integrates directly into Kilo CLI, OpenCode, Roo, Cline, or custom scripts—run /connect in Kilo/OpenCode, select Nvidia provider, add API key, and pick the model. Use it for repo planning (inspect codebase, generate implementation plans), code review/debugging (trace logs, triage bugs), workflow automation (CI bots, internal tools), and terminal-heavy tasks like shell output reasoning. OpenCode offers fully free trials without keys, pairing well with the model's snappiness for repo exploration and command loops. Avoid for lightweight autocomplete; it's built for complex agentic work.

Nvidia's GTC 2026 Stack Enables Open Long-Running Agents

Nemotron 3 Super anchors Nvidia's open agentic AI push, announced at GTC 2026: Vera Rubin platform integrates Vera CPU, Rubin GPU, NVLink 6, and Grok LPU for end-to-end AI factories (pre-training to agentic inference); Dynamo 1.0 open-source inference OS boosts Blackwell performance up to 7x via routing, scheduling, and economics; NemoClaw installs Nemotron models and OpenShell runtime in one command for secure, always-on agents from cloud to RTX PCs/DGX. Expanded families target agentic (Nemotron), physical (Cosmos), robotics (Isaac Groot), driving (Alpamo), and science (BioNemo) AI. Nemotron Coalition with Cursor, LangChain, Mistral, Perplexity, and others builds Nemotron 4, providing open alternatives to closed ecosystems for flexible, cost-effective coding agents.