Scaling Verified AI Access for Cyber Defenders
OpenAI is expanding Trusted Access for Cyber to thousands of verified defenders and introducing GPT-5.4-Cyber, a permissive model for defensive tasks such as binary reverse engineering. The rollout is guided by three commitments: democratized access, iterative deployment, and ecosystem investments.
Principles for Balancing AI Cyber Capabilities and Risks
OpenAI structures its approach to cyber defense around three principles. Democratized access means objective KYC and identity verification rather than arbitrary gatekeeping. Iterative deployment means testing models in the real world, hardening safeguards against jailbreaks, and calibrating refusals. Ecosystem resilience means grants, open-source contributions, and tools like Codex Security. The underlying conviction is to act now on risks that already exist: cyber vulnerabilities predated AI, but attackers can now apply test-time compute to gain stronger capabilities. Access is therefore tied to user trust signals rather than to the model alone. Defenses scale with advances in agentic coding: GPT-5.2 added cyber safety training, GPT-5.3-Codex expanded safeguards, and GPT-5.4 reached 'high' cyber capability under the Preparedness Framework. This allows broad access to general models alongside granular controls for high-risk uses, with automated verification for legitimate defenders protecting critical infrastructure.
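The idea of gating a permissive model on user trust signals rather than on the model alone can be sketched as a simple tiering function. Everything below — the signal names, the tiers, the mapping — is an illustrative assumption, not OpenAI's actual verification schema.

```python
from dataclasses import dataclass

# Hypothetical trust signals a gating layer might check; the field names
# and tiers here are illustrative, not OpenAI's actual schema.
@dataclass
class TrustSignals:
    identity_verified: bool    # passed an objective KYC / identity check
    org_vetted: bool           # member of a vetted vendor or research team

def access_tier(s: TrustSignals) -> str:
    """Map trust signals to a model-access tier."""
    if s.identity_verified and s.org_vetted:
        return "permissive-cyber"   # e.g., a GPT-5.4-Cyber-class model
    if s.identity_verified:
        return "general"            # broad general-model access
    return "restricted"             # default safeguards apply

print(access_tier(TrustSignals(True, True)))    # permissive-cyber
print(access_tier(TrustSignals(True, False)))   # general
```

The point of the sketch is that capability and access are decided separately: the same underlying model can sit behind different tiers depending on who is asking.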
Proven Tools Accelerating Defensive Workflows
Codex Security, launched in private beta six months ago and in research preview earlier this year, monitors codebases, validates issues, and proposes fixes—contributing to over 3,000 critical- and high-severity vulnerability fixes ecosystem-wide, along with many lower-severity ones. It integrates into development workflows for continuous feedback, shifting security from periodic audits to real-time risk reduction. Supporting efforts include a $10M Cybersecurity Grant Program, Codex for Open Source reaching 1,000+ projects with free scanning, and contributions such as $12.5M to Linux Foundation open-source security. Since 2023, programs like the Cybersecurity Grant and model evaluations have prevented misuse while empowering defenders to find and fix issues faster than attackers can exploit them, countering dual-use risks in vulnerability discovery and code reasoning.
Accessing Permissive Models Like GPT-5.4-Cyber
Trusted Access for Cyber (TAC), launched in February, now scales to thousands of individuals and hundreds of teams via automated verification—at chatgpt.com/cyber for individuals, or by enterprise request for teams. Higher tiers unlock GPT-5.4-Cyber, a fine-tuned variant that lowers refusal boundaries for legitimate cyber work and adds binary reverse engineering: analyzing compiled software for malware and vulnerabilities without access to source code. The initial rollout is limited to vetted vendors and researchers, with potential Zero-Data-Retention constraints for low-visibility uses. Existing TAC users can express interest in upgraded access. Access tiers advance in lockstep with upcoming models, ensuring safeguards suffice for broad deployment while permissive cyber variants remain under stricter controls.
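"Analyzing compiled software without source code" starts with reasoning directly about bytes. As a minimal illustration — unrelated to GPT-5.4-Cyber's internals — here is the first step any ELF triage takes: reading the file's identification bytes to learn its format, word size, and endianness before any deeper disassembly.

```python
# Minimal binary-triage sketch: classify a blob from its ELF e_ident bytes.
# Field offsets follow the ELF specification (magic at 0-3, EI_CLASS at 4,
# EI_DATA at 5); everything else about real reverse engineering is omitted.
def describe_elf(header: bytes) -> str:
    if header[:4] != b"\x7fELF":
        return "not an ELF binary"
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    endian = {1: "little-endian", 2: "big-endian"}.get(header[5], "unknown endianness")
    return f"ELF, {bits}, {endian}"

# Fabricated 16-byte e_ident for a 64-bit little-endian ELF:
e_ident = b"\x7fELF\x02\x01\x01" + b"\x00" * 9
print(describe_elf(e_ident))        # ELF, 64-bit, little-endian
print(describe_elf(b"MZ\x90\x00"))  # not an ELF binary
```

Malware triage and vulnerability hunting build on exactly this kind of structural parsing, layer by layer, from headers to sections to disassembled code.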