Autonomous Engineering for Production Systems

NVIDIA's coding agents team defaults to Codex powered by GPT-5.5 for complex tasks because it sustains long, autonomous sessions (retaining context via compactions), selects tools tactically, and surfaces bugs and gaps that prior models missed. As a result, MVPs evolve into scalable, reliable production systems, something earlier models could not achieve dependably. For instance, the team built an internal podcast-recording app (similar to Riverside) in hours, under privacy constraints that would have made procurement a weeks-long process; during development, the Codex desktop app autonomously tested the app's video and audio functionality, lowering the bar for which prototypes are worth pursuing.

Full ML Research Loops from Laptop

Codex automates research end to end: given a corpus of papers (e.g., on reinforcement learning), GPT-5.5 identifies hypotheses, traces chains of evidence, and generates knowledge graphs that visualize how concepts link, proving more creative here than competing models. It then writes training scripts, deploys them over SSH to remote NVIDIA infrastructure with no manual login or setup, and executes the experiments. By eliminating manual scripting and orchestration, this makes research workflows roughly 10x faster and lets researchers launch large ML jobs directly from their laptops.
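As a sketch of that deploy step: the remote launch reduces to composing an `ssh` invocation against the cluster. The host name, script path, and `--gpus` flag below are hypothetical placeholders, not NVIDIA's actual setup.

```rust
use std::process::Command;

/// Build the `ssh` invocation that launches a training script on a
/// remote machine. `host`, `script`, and the `--gpus` flag are
/// illustrative stand-ins for whatever the real cluster expects.
fn build_remote_launch(host: &str, script: &str, gpus: u32) -> Command {
    let mut cmd = Command::new("ssh");
    cmd.arg(host)
        // The remote side runs the freshly written training script.
        .arg(format!("python {script} --gpus {gpus}"));
    cmd
}

fn main() {
    let cmd = build_remote_launch("cluster.internal", "train_rl.py", 8);
    // An agent would call cmd.status() here; we only print the plan.
    println!("{:?}", cmd);
}
```

In practice the agent would follow this with log streaming and result collection, but the point is that no interactive login is involved: the whole round trip is scriptable.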

Code Optimization and Scale

For legacy code, Codex can be pointed at Python repositories, and GPT-5.5 translates them to Rust for roughly 20x performance gains. With 40,000 NVIDIA employees able to access it on GB200/GB300 hardware, it compresses the path from idea to tested execution into a unified flow, changing build thresholds across both engineering and research.
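To make the translation concrete, here is the kind of transformation involved, with a hypothetical Python hot loop (not from any NVIDIA repo) rendered as idiomatic Rust; replacing interpreted loops with compiled iterator chains is one source of the speedup.

```rust
/// A Python hot path such as
///     def dot(a, b):
///         return sum(x * y for x, y in zip(a, b))
/// rendered as idiomatic Rust: the iterator chain compiles down to a
/// tight, vectorizable loop with no interpreter overhead.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    println!("{}", dot(&a, &b)); // 1*4 + 2*5 + 3*6 = 32
}
```

The actual gain depends on the workload; numeric inner loops like this one are where a 20x-class speedup over pure Python is plausible.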