KodeKloud
Claude Mythos Hits 77.8% SWE-Bench But Stays Gated
Anthropic's Claude Mythos scores 77.8% on SWE-Bench Pro (vs Opus 4.6's 53.4%), finds software vulns like a 27-year-old OpenBSD flaw faster than humans, prompting limited Project Glasswing access to aid patching over public release.
