Glasswing: AI Finds Zero-Days to Secure Critical Software

Mythos Preview's Superior Vulnerability Detection

Claude Mythos Preview, an unreleased frontier model, autonomously identifies thousands of zero-day vulnerabilities—many critical—in every major operating system and web browser, plus tools like FFmpeg and the Linux kernel. Specific examples include a 27-year-old OpenBSD flaw allowing remote crashes on firewalls (patched), a 16-year-old FFmpeg bug missed by 5 million automated tests, and a chained Linux kernel exploit escalating user access to full control. It outperforms Claude Opus 4.6 on CyberGym (83.1% vs 66.6% vulnerability reproduction) and agentic coding benchmarks like SWE-bench Verified (93.9% vs 80.8%), Terminal-Bench 2.0 (82.0% vs 65.4%), and GPQA Diamond (94.6% vs 91.3%). These capabilities stem from advanced agentic coding, reasoning, and search, enabling it to spot flaws surviving decades of human and automated scrutiny, while developing sophisticated exploits.

To counter proliferation risks—where AI lowers expertise barriers for attackers, potentially amplifying $500B annual global cybercrime costs—defenders gain an edge by using the same tools proactively. Partners report it uncovers complex issues prior models missed, accelerating fixes at scale.

Project Glasswing Enables Industry-Wide Defense

Launched with partners including AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks—plus 40+ critical infrastructure orgs—Project Glasswing provides Mythos Preview access for scanning first-party and open-source systems. Focus areas: local vulnerability detection, black-box binary testing, endpoint securing, and penetration testing. Anthropic commits $100M in usage credits (post-preview: $25/$125 per million tokens via Claude API, Bedrock, Vertex AI, Microsoft Foundry) and $4M donations ($2.5M to Alpha-Omega/OpenSSF, $1.5M to Apache). Open-source maintainers apply via Claude for Open Source program.

Partners like Cisco emphasize AI's pace/scale shift demands new hardening; AWS integrates it into 400T daily network flows; Microsoft notes CTI-REALM gains; CrowdStrike warns of collapsing exploit timelines (months to minutes); Linux Foundation sees it as a 'sidekick' for maintainers lacking teams. Google highlights ecosystem tools like Big Sleep/CodeMender. This collaboration shares learnings to harden shared cyber surfaces before adversarial use.

Balancing AI Cyber Risks with Safeguarded Deployment

AI cyber skills rival top humans (echoing DARPA's 2016 Cyber Grand Challenge), risking frequent/destructive attacks on banking, healthcare, energy, transport, and government without safeguards. Yet optimism prevails: Mythos aids bug-free software creation. Anthropic won't release it publicly but plans safeguards in upcoming Claude Opus for safe, scaled deployment in cybersecurity and beyond. Cryptographic hashes disclosed for unpatched vulns; full details post-fix via Frontier Red Team blog.