OpenAI's Safe Open-Weight OSS Models for Agents
gpt-oss-120b and 20b are Apache 2.0 open-weight models excelling in agentic workflows with tool use, CoT reasoning, and adjustable effort; safety evals show no high-risk capabilities even after adversarial fine-tuning.
Agentic Capabilities Tailored for Production Workflows
Use gpt-oss-120b (120B params) or gpt-oss-20b (20B params) for building agents that handle instruction following, tool integration like web search and Python execution, full chain-of-thought reasoning, and structured outputs. These text-only models match OpenAI's Responses API compatibility, allowing seamless drop-in for agentic systems. Customize them freely under Apache 2.0 plus gpt-oss usage policy, with community feedback shaping their design. Adjust reasoning effort dynamically—dial it down for simple tasks to save compute—making them efficient for real-world pipelines where overkill reasoning wastes resources.
In practice, integrate into workflows needing reliable tool calling and CoT without proprietary lock-in; they're built to default to OpenAI safety policies but expect you to layer system-level guards for production.
Safety Evals Prove Low Risk Profile for Open Release
Open models carry unique risks: attackers can fine-tune to evade refusals or optimize for harm, unlike API-served models where OpenAI controls mitigations. This model card details evals using OpenAI's Preparedness Framework across Biological/Chemical, Cyber, and AI Self-Improvement categories—default gpt-oss-120b stays below 'High' capability thresholds in all.
To stress-test, OpenAI's Safety Advisory Group adversarially fine-tuned gpt-oss-120b with their top training stack targeting Bio/Chem and Cyber risks: it still didn't hit 'High' levels. Frontier check: fine-tuned performance doesn't exceed existing open models on most bio evals, so no advancement of open bio capabilities. Developers must add safeguards to match API-level protections, as stakeholders control downstream systems.
Releases reaffirm OpenAI's push for ecosystem safety standards; read full details at arXiv for eval methodologies.