Claude Mythos Tops Benchmarks But Stays Locked for Security

Video description

Anthropic has revealed Claude Mythos Preview, a new frontier model it's calling too powerful for public release. Instead, it's being made available exclusively to a select group of partners including Apple, Google, Microsoft, and NVIDIA under an initiative called Project Glasswing. We also cover Meta's internal "Claudeonomics" leaderboard turning token usage into office status, new data on GitHub commits exploding 14x year-on-year, Perplexity's ARR surging past $450M, and Google's Product Director making the case that Go-to-Market is becoming the essential skill in the AI age.

➡️ Subscribe for weekly product briefings and more analysis: https://departmentofproduct.substack.com
Follow on Substack Notes: https://substack.com/@richholmes

🔗 LINKS
Project Glasswing announcement: https://www.anthropic.com/glasswing
Claude Mythos Preview system card: https://www-cdn.anthropic.com/8b8380204f74670be75e81c820ca8dda846ab289.pdf
Felix Rieseberg on Mythos being a "step function change": https://x.com/felixrieseberg/status/2041586309966524919
Simon Willison on why the pause "sounds necessary": https://simonwillison.net/2026/Apr/7/project-glasswing/
Ethan Mollick on security risks: https://x.com/emollick/status/2041578945531830695
Meta's internal AI token leaderboard: https://www.theinformation.com/articles/meta-employees-vie-ai-token-legend-status?rc=77sebk
Jensen Huang on token spending: https://embed.businessinsider.com/jensen-huang-500k-engineers-250k-ai-tokens-nvidia-compute-2026-3
Zapier's AI fluency framework: https://x.com/wadefoster/status/2038979630590509553
Linear's COO on token-maxxing: https://x.com/cjc/status/2041299419845599489
Google's Product Director on GTM as the essential skill: https://x.com/jacalulu/status/2041160452672004189
The SaaS chat bar trend: https://x.com/rabi_guha/status/2040082295563169852
Simon Willison on GitHub commits: https://simonwillison.net/2026/Apr/4/kyle-daigle/
Ramp: monthly AI spend grew 4x: https://ramp.com/3-steps-to-manage-ai-spend
Perplexity ARR tops $450M: https://ca.finance.yahoo.com/news/perplexity-arr-tops-450m-pricing-132500539.html
AI and software engineering jobs: https://www.businessinsider.com/ai-isnt-killing-software-coding-jobs-booming-trueup-2026-4
Substack article on new product development processes: https://departmentofproduct.substack.com/p/the-new-product-development-operating

Mythos Preview's Coding Prowess Sparks Security Lockdown

Claude Mythos Preview scores 93.9% on SWE-bench Verified (vs. 80.8% for Claude Opus 4.6 and 80.6% for Gemini 3.1 Pro) and 77.8% on the tougher SWE-bench Pro, a 24-point lead over GPT 5.4/Opus 4.5. That coding ability carries over to vulnerability discovery: the model found thousands of zero-days across operating systems and browsers, including a 27-year-old remote crash flaw in OpenBSD, a 16-year-old FFmpeg bug that 5M tests had missed, and a Linux privilege escalation. Anthropic's Project Glasswing, backed by $100M in tokens, restricts access to partners including Apple, Google, Microsoft, and NVIDIA for defensive patching, putting safety ahead of a public release. Simon Willison calls the pause necessary, and Ethan Mollick predicts more restrictions like it. For product teams, this is a cue to audit codebases aggressively now; expect AI adoption to accelerate once access widens, pushing security audits higher on CTOs' agendas.
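To make the benchmark gaps concrete, here is a minimal Python sketch that works out the margins from the figures quoted above. The helper name `lead_over` is illustrative, and the SWE-bench Pro runner-up score is only implied by the quoted "24-point lead" (which may itself be rounded), not reported directly.

```python
# Scores quoted in the summary (percent of tasks resolved).
swe_bench_verified = {
    "Claude Mythos Preview": 93.9,
    "Claude Opus 4.6": 80.8,
    "Gemini 3.1 Pro": 80.6,
}


def lead_over(scores, leader):
    """Return the leader's margin over each other model, in percentage points."""
    top = scores[leader]
    return {
        name: round(top - score, 1)
        for name, score in scores.items()
        if name != leader
    }


verified_leads = lead_over(swe_bench_verified, "Claude Mythos Preview")

# SWE-bench Pro: only the Mythos score and the size of the lead are quoted,
# so the runner-up score below is implied rather than reported.
mythos_pro = 77.8
quoted_lead_pro = 24
implied_runner_up_pro = round(mythos_pro - quoted_lead_pro, 1)

print(verified_leads)
print(implied_runner_up_pro)
```

On these numbers the SWE-bench Verified margin is about 13 points, roughly half the gap quoted for SWE-bench Pro.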

Token Maxing Rewards High AI Spend for Efficiency Gains

Meta's Claudeonomics leaderboard ranks 85K employees by token use and awards 'token legend' and 'session immortal' badges to the heaviest users, turning consumption into prestige. NVIDIA's Jensen Huang says it would be a red flag if an engineer paid $500K did not burn $250K in tokens a year, on the view that upfront AI investment cuts long-term costs. Zapier evaluates hires on token use and AI fluency; Linear's COO counters that this is like ranking marketers by how much they spend. Token-maxing can help justify AI budgets if you track ROI through saved developer time, but pair it with output metrics to avoid waste, especially since Mythos could push usage higher still.
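One way to track ROI through saved developer time is a simple ratio of time value recovered to token spend. The sketch below is an assumption about how a team might do this, not a method described in the video: the function names, the hourly cost, and the example engineer's figures are all hypothetical. Only the $250K and $500K figures come from the Huang quote.

```python
def spend_ratio(token_spend, compensation):
    """Token spend as a fraction of an engineer's compensation."""
    return token_spend / compensation


def token_roi(token_spend, hours_saved, hourly_cost):
    """Net return per dollar of tokens: (value of time saved - spend) / spend."""
    if token_spend <= 0:
        raise ValueError("token_spend must be positive")
    return (hours_saved * hourly_cost - token_spend) / token_spend


def break_even_hours(token_spend, hourly_cost):
    """Hours of developer time that must be saved to cover the token spend."""
    return token_spend / hourly_cost


# Huang's benchmark from the summary: $250K of tokens for a $500K engineer.
huang_ratio = spend_ratio(250_000, 500_000)

# Hypothetical engineer: $20K of tokens, 300 hours saved, $120/hour loaded cost.
example_roi = token_roi(token_spend=20_000, hours_saved=300, hourly_cost=120)
example_break_even = break_even_hours(token_spend=20_000, hourly_cost=120)

print(f"Huang ratio: {huang_ratio:.0%}")
print(f"Example ROI: {example_roi:.0%}")
print(f"Break-even hours: {example_break_even:.0f}")
```

A ratio like this only measures time saved, so it should sit alongside output metrics such as shipped features or resolved tickets, which is the point of the Linear COO's critique.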

GTM and Generative UI Define AI Product Winners

Google's Product Director argues that AI makes building easier, which shifts the focus to 'should you build?' and to vertical-specific GTM: using generative AI to tailor landing pages, onboarding, defaults, and suggestions into personalized experiences. A related SaaS trend is the chat bar (Linear, PostHog, Tier) replacing the static homepage, an admission that one-size-fits-all UIs fail diverse users; the next step is agents composing interfaces. Builders should prioritize GTM roadmaps with AI personalization to cut acquisition costs 2-3x relative to generic funnels.

AI Fuels 14x GitHub Activity, $450M Perplexity Surge

GitHub commits have hit 275M per week, 14x year-on-year and on pace for 14B this year versus 1B in 2025. AI PRs grew 4x to 17M in six months, and Claude commits grew 25x to 2.5M per week. Ramp data shows AI spend up 4x YoY, now 15% of software budgets. Perplexity's ARR jumped to over $450M (from $305M), driven by its 'computer' feature, which orchestrates multiple models on projects. Despite 52K AI-linked layoffs in Q1, 67K software jobs are open, up 30% YoY and the highest level in more than three years. The takeaway: ship faster by integrating agents into repos; Perplexity's growth suggests multi-model coordination can drive PMF at scale.
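The headline figures above are easy to sanity-check with a few lines of arithmetic. This is a rough sketch that assumes a flat 52-week run rate for commits and treats $450M as a floor for Perplexity's ARR; the variable names are illustrative.

```python
WEEKS_PER_YEAR = 52

# GitHub: 275M commits per week, annualized at a flat run rate (an assumption).
weekly_commits = 275_000_000
commits_2025 = 1_000_000_000
annualized_commits = weekly_commits * WEEKS_PER_YEAR
multiple_vs_2025 = annualized_commits / commits_2025

# Perplexity: ARR moved from $305M to at least $450M (figures in $M).
arr_before = 305
arr_after = 450
arr_growth = (arr_after - arr_before) / arr_before

print(f"Annualized commits: {annualized_commits / 1e9:.1f}B")
print(f"Multiple of 2025 total: {multiple_vs_2025:.1f}x")
print(f"Perplexity ARR growth: at least {arr_growth:.1%}")
```

The annualized commit count comes out at 14.3B, consistent with the "14B" and "14x" figures, and the ARR jump works out to growth of at least about 47.5%.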
