OpenAI's Codex Security Cuts False Positives 50%+ in Vuln Scans

Codex Security's Vulnerability Detection Workflow

Connect a code repository to Codex Security and it analyzes the codebase to build a project-specific threat model. It then identifies potential vulnerabilities and tests them in isolated environments to confirm exploitability without risking production systems. This end-to-end process moves security checks earlier in the development cycle, so teams can catch issues in the repository rather than after deployment.
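The identify-then-validate step described above can be pictured as a filter: candidate findings go in, and only those reproduced in an isolated environment come out. The following is a minimal sketch under stated assumptions, not Codex Security's actual implementation or API. The names `Finding`, `validate_in_sandbox`, and `run_sandboxed_check` are hypothetical, and the sandbox is stubbed with a plain function.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one candidate vulnerability."""
    identifier: str
    description: str
    validated: bool = False


def validate_in_sandbox(
    candidates: Iterable[Finding],
    run_sandboxed_check: Callable[[Finding], bool],
) -> List[Finding]:
    """Return only the candidates whose sandboxed check reproduces the issue."""
    confirmed = []
    for finding in candidates:
        if run_sandboxed_check(finding):
            # Copy the record with validated=True; the input is left unchanged.
            confirmed.append(replace(finding, validated=True))
    return confirmed


# Stub standing in for an isolated environment (assumption for illustration).
def fake_sandbox(finding: Finding) -> bool:
    return finding.identifier in {"CAND-1"}


candidates = [
    Finding("CAND-1", "unsanitized path passed to file open"),
    Finding("CAND-2", "suspicious pattern that turns out to be unreachable"),
]
confirmed = validate_in_sandbox(candidates, fake_sandbox)
print([f.identifier for f in confirmed])
```

Dropping candidates that cannot be reproduced is one plausible way a validation step reduces false positives: an alert reaches the developer only after an exploit attempt has succeeded somewhere safe.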

Formerly known as Aardvark, Codex Security is now in research preview for ChatGPT Enterprise, Business, and Edu users, and it is free for the first month. To get started, see the documentation at developers.openai.com/codex/security.

Beta Performance: Fewer Alerts, More Actionable Fixes

In beta testing, Codex Security reduced false positives by over 50% compared to prior tools, letting developers focus on real threats. In one case, redundant alerts dropped by 84%, which reduces alert fatigue. Over 30 days, it scanned 1.2 million commits across projects and flagged 792 critical vulnerabilities, which suggests it can operate at real-world scale.
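To make the percentages concrete, the arithmetic behind them can be sketched as follows. The alert counts of 400 and 180 are invented purely for illustration and do not come from the beta data. Only the 1.2 million commits and 792 critical findings are taken from the figures above.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100.0 / before


def rate_per_10k(count: int, total: int) -> float:
    """How many items were flagged per 10,000 scanned."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count * 10_000 / total


# Hypothetical counts: 400 false positives before, 180 after.
illustrative_drop = percent_reduction(before=400, after=180)

# Figures quoted in the article: 792 critical findings in 1.2 million commits.
critical_rate = rate_per_10k(count=792, total=1_200_000)

print(f"false-positive reduction: {illustrative_drop:.1f}%")
print(f"critical findings per 10,000 commits: {critical_rate:.1f}")
```

With these inputs the reduction comes out to 55%, which clears the "over 50%" bar, and the quoted scan figures work out to roughly 6.6 critical findings per 10,000 commits.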

Trade-off: as a research preview, the tool will likely keep iterating on edge cases. Even so, the beta metrics suggest it is more precise than traditional scanners.

Real-World Impact on Open Source and CVEs

Codex Security has already reported vulnerabilities in major projects: OpenSSH (commit c991273c18afc490313a9f282383eaf59d9c13b9), GnuTLS (gnutls-help mailing list), GOGS (GHSA-p6x6-9mx6-26wj), Thorium (CVE-2025-35430), and Chromium. These reports have led to 14 CVEs so far.

OpenAI is expanding a program for open-source maintainers (openai.com/form/codex-for-oss) that makes proactive scanning free for OSS projects. For indie builders and small teams, this means production-grade vulnerability detection without hiring security experts. Pair it with your CI/CD pipeline to automate scans on pull requests.
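One way to picture the CI/CD pairing is a small gate script that fails a build when a scan report contains validated, high-severity findings. This is a sketch under loud assumptions: the JSON report shape (`findings`, `severity`, `validated`) is hypothetical and is not Codex Security's documented output format, so adapt the field names to whatever your scanner actually emits.

```python
import json
from typing import Any, Dict, List

# Severities that should block a merge (an assumption; tune per project).
BLOCKING_SEVERITIES = {"critical", "high"}


def blocking_findings(report: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Select findings that are both validated and severe enough to block."""
    return [
        finding
        for finding in report.get("findings", [])
        if finding.get("validated")
        and str(finding.get("severity", "")).lower() in BLOCKING_SEVERITIES
    ]


def gate_exit_code(report_text: str) -> int:
    """Return 1 if the report should fail the build, otherwise 0."""
    report = json.loads(report_text)
    return 1 if blocking_findings(report) else 0


# Hypothetical report, written inline so the example is self-contained.
sample_report = json.dumps(
    {
        "findings": [
            {"id": "F-1", "severity": "critical", "validated": True},
            {"id": "F-2", "severity": "low", "validated": True},
            {"id": "F-3", "severity": "high", "validated": False},
        ]
    }
)

exit_code = gate_exit_code(sample_report)
print(f"gate exit code: {exit_code}")
```

In a real pipeline the report would be read from a file produced by the scan step, and the result passed to `sys.exit` so the CI job fails on a nonzero code. Gating only on validated findings keeps unconfirmed alerts from blocking merges.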