Tag: ai-safety

Summaries

Claude Mythos Hits 77.8% SWE-Bench But Stays Gated

KodeKloud

Apr 20, 2026

Claude Mythos Hits 77.8% SWE-Bench But Stays Gated

Anthropic's Claude Mythos scores 77.8% on SWE-Bench Pro (vs Opus 4.6's 53.4%), finds software vulns like a 27-year-old OpenBSD flaw faster than humans, prompting limited Project Glasswing access to aid patching over public release.

__oneoff__

DeepMind's Frontier Safety Framework v3 for AI Risks

DeepMind defines Critical Capability Levels (CCLs) for frontier AI models in misuse (CBRN/cyber/manipulation), ML R&D, and misalignment risks, with protocols for detection, tiered mitigations, and risk acceptance criteria to enable safe deployment.