Anthropic's Claude Code Limits: GPU Crunch Exposed

Explosive growth against a fixed GPU supply forced Anthropic to tighten Claude Code's peak-hour limits, prioritizing enterprise revenue over subsidized subscriptions amid internal contention between research, product, and users for compute.

Peak-Hour Squeeze: From Generous Subs to Throttled Sessions

Claude Code subscriptions were historically overprovisioned: users on the $200/month Max plan could burn through $5,000 in compute, a 25x subsidy that competitors envied. This generosity masked underlying strain, especially on weekdays from 5-11 a.m. PT, matching prior peak trouble spots like the March discount window. To manage demand, Anthropic accelerated the depletion of 5-hour session limits for free, Pro, and Max tiers during those hours; weekly totals are unchanged, but roughly 7% of Pro users now hit caps they wouldn't have faced before.
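The mechanics described above can be sketched as a session budget that drains faster during the peak window. This is a hypothetical illustration, not Anthropic's actual implementation: the function name, the 2x multiplier, and the notion of a per-token "cost" are all assumptions for the sake of the example.

```python
# Hypothetical sketch of peak-hour accelerated depletion.
# PEAK_MULTIPLIER and the per-token cost model are assumptions,
# not Anthropic's documented behavior.

PEAK_START, PEAK_END = 5, 11  # weekday peak window, hours PT
PEAK_MULTIPLIER = 2.0         # assumed faster depletion rate at peak

def depletion_cost(tokens_used: int, hour_pt: int, weekday: bool) -> float:
    """Charge more of the session budget per token during weekday peaks."""
    if weekday and PEAK_START <= hour_pt < PEAK_END:
        return tokens_used * PEAK_MULTIPLIER
    return float(tokens_used)

# The same 1,000 tokens burn twice the budget at 8 a.m. as at 4 p.m.
print(depletion_cost(1000, 8, weekday=True))   # peak
print(depletion_cost(1000, 16, weekday=True))  # off-peak
```

Under this model the weekly total is untouched; only how quickly a 5-hour session exhausts itself changes, which matches the "faster depletion, unchanged weekly totals" framing.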

Users erupted in backlash, from polite pleas to "fucking bullshit" replies, as sessions ended prematurely and without warning. Theo defends the move: it's not malice but economic reality. Anthropic had initially doubled off-peak allowances to 2x usage, but peaks still overwhelmed capacity. The tradeoff: protect high-value enterprise slots without blanket cuts, at the cost of subscriber churn risk.

"According to some of the competitors doing analysis you're able to consume up to $5,000 of compute per month when paying for the $200 a month plan that is a 25x subsidization that is crazy." (Theo on historical subsidy, highlighting unsustainable pricing.)
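The subsidy ratio in the quote is simple arithmetic on the two figures Theo cites:

```python
# The "25x subsidization" Theo cites: compute consumed vs. plan price.
monthly_price = 200      # $/month, Max plan
compute_value = 5_000    # $/month of compute reportedly consumable

subsidy = compute_value / monthly_price
print(subsidy)  # 25.0
```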

Compute Crisis: Fixed GPUs vs. Triple Demand

Core problem: GPUs are finite, and they simultaneously power inference for users, product builds (Claude.ai, Claude Code, the CLI, agents), and research training. Anthropic's revenue hit $100M in 2024 and $5B in 2025, with $14B projected for 2026. Enterprise exploded: customers with $100K+ ARR are up 7x year over year, and the dozen $1M+ spenders of two years ago now exceed 500, including 8 of the Fortune 10.

With fixed clusters (imagine a hypothetical 100 GPUs), allocation pits research (long-term model bets), product (token-heavy features like code-review agents), and users against each other. Elastic internal shifts work short-term; the Sonnet 4 launch, for example, cut T3's rate limits elsewhere. But growth has eroded the buffers. Research is allegedly training a "god model" (per the Mythos leaks), demanding maximum compute; products keep evolving toward higher-token operations; users scale via subscriptions and the API.

Anthropic lagged on GPU purchases, unlike OpenAI's "buy everything" frenzy, and lead times run 18-36 months. Partnerships fill the gaps, via the Amazon collaboration and Google-financed data centers, but that capacity is shared, not owned. A recent H100 ramp slightly overtook OpenAI, while xAI's $44B farm looms far larger. The result: contention spikes at peak and hardware idles off-peak (e.g., 20 users running in parallel from 7-11 a.m. need 20 GPUs that sit largely free by 4 p.m.).
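The peak-versus-idle dynamic above is just capacity math: a cluster sized anywhere below peak demand falls short in the morning and idles in the afternoon. The demand numbers here are illustrative assumptions built on the article's hypothetical 100-GPU cluster.

```python
# Illustrative contention math on the hypothetical 100-GPU cluster.
# peak_demand and offpeak_demand are assumed figures, not real data.
cluster_gpus = 100
peak_demand = 120     # assumed concurrent sessions, 7-11 a.m.
offpeak_demand = 40   # assumed concurrent sessions, 4 p.m.

peak_shortfall = max(0, peak_demand - cluster_gpus)   # sessions throttled
offpeak_idle = max(0, cluster_gpus - offpeak_demand)  # GPUs sitting free

print(peak_shortfall, offpeak_idle)
```

This is why off-peak pricing incentives (discussed later) are attractive: every session shifted from the morning spike to the afternoon trough converts a throttled user into one served by otherwise-idle hardware.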

Efficiency hunts compound the issues: Google's turboquant cuts memory use, while Anthropic's late-2024 tweaks caused model degradation. Claimed uptime slipped to 98% (third-party monitors say 95%), with outages like the 6-hour reporting blackout on Feb 26 and the 4-hour login failure on Feb 27.

"Anthropic did not choose to invest heavily in buying more compute we've been trying to get good data here... it does actually seem like as of recently Anthropic has significantly increased their amount of compute... that said this is not owned by Anthropic this is owned by Amazon." (Theo on procurement failures and partner reliance, explaining capacity shortfalls.)

Prioritization Logic: Enterprise Trumps Subs

The allocation is value-based: enterprise and API usage yields direct revenue (spend scales with usage); subscriptions guarantee a flat $200/month whether a user burns $0 or $5K (a retention play); research is a zero-short-term, billions-long-term bet. During peaks, subscriptions get throttled to reserve capacity for enterprises at risk of churning, mirroring the cuts internal teams already endure.
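That priority ordering can be sketched as a greedy allocator that serves workloads in revenue order until capacity runs out. The priorities, names, and GPU counts are illustrative assumptions, not Anthropic's actual policy.

```python
# Hypothetical sketch of value-based allocation at peak: enterprise/API
# first, flat-rate subscriptions next, research absorbs the shortfall.
# All numbers and priorities are illustrative assumptions.

workloads = [
    {"name": "enterprise_api", "priority": 0, "gpus_wanted": 60},
    {"name": "subscriptions",  "priority": 1, "gpus_wanted": 50},
    {"name": "research",       "priority": 2, "gpus_wanted": 40},
]

def allocate(capacity: int) -> dict:
    """Grant GPUs greedily in priority order until capacity is exhausted."""
    grants = {}
    for w in sorted(workloads, key=lambda w: w["priority"]):
        grant = min(w["gpus_wanted"], capacity)
        grants[w["name"]] = grant
        capacity -= grant
    return grants

# With 100 GPUs: enterprise fully served, subs partially throttled,
# research gets nothing at peak.
print(allocate(100))
```

The same structure explains why subscribers feel the squeeze first: they sit between fully-paid usage above and deferrable research below.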

The research-led DNA hurts: founders and execs prioritize models over products (the original Claude.ai was outsourced). Users now "take the hit" the way internal teams do. OpenAI sunset its Sora/video API for similar reallocations; video generation's compute cost outweighed its revenue.

Off-peak pricing incentives are emerging: lower contention means cheaper effective cost. Experiments are shifting loads, but bugs (limits hitting before the announcement, unconfirmed caching issues) fueled conspiracy theories separate from the source leaks.

"If you give enterprise and API users more access they will spend more money and you'll make more money... subscriptions are a financial gain but you're just trying to keep them from churning and researchers are a long-term financial investment." (Theo reframing GPU splits by revenue impact, revealing sub deprioritization.)

Communication Breakdown Amplifies Backlash

Execution was botched: limits tightened before any announcement (users hit caps the day before). Thoric tweeted at 12:45 p.m. PT, after the peak window had ended, from a non-official account, on Twitter only. No dashboard, CLI, or Claude.ai notice. Contrast the spring-break 2x off-peak boost, which was both officially linked and tweeted.

The optics fail: heavy users without social media miss the news entirely. No rate reset followed the bugs, only a vague "faster depletion" with no quantification. Internally it's siloed, with Thoric handling the fallout solo.

"This post not only was it two-ish hours late from when the window ended it's also only on Twitter and it's not even from an official Anthropic account it's from Thoric who's an individual employee." (Theo on announcement sins, underscoring product-user disconnect.)

Key Takeaways

  • Monitor peak 5-11 a.m. PT weekdays; shift workflows off-peak for 2x Claude Code usage.
  • Subs like Claude Code subsidize heavily (25x compute value)—expect more throttling as growth strains free tiers.
  • Prioritize enterprise in AI providers: High-usage API pays; flat subs get squeezed first.
  • GPU lead times demand foresight—stockpile now or partner (Anthropic's path risks dependency).
  • Demand in-product comms: Ignore Twitter-only changes; check dashboards/CLIs for subs.
  • Efficiency > scale: Watch quantization/compression (e.g., Google's turboquant) to stretch limits.
  • Research-led firms pivot poorly to product—watch for researcher favoritism in limits.
  • Uptime metrics matter: Anthropic's 95-98% lags; GitHub Copilot at 90%—benchmark providers.
  • Revenue signals health: Anthropic's $14B 2026 projection justifies pain, but test competitors.

"Everyone's fighting over GPUs there aren't enough GPUs they're trying to allocate things to the best of their ability." (Theo summarizing industry-wide crisis, predicting more restrictions ahead.)

Video description
Anthropic just made the limits on the Claude Max/Pro plans a lot worse... Thank you Depot for sponsoring! Check them out at: https://soydev.link/depot SOURCES https://x.com/trq212/status/2032916661452595648 https://x.com/trq212/status/2037254607001559305 https://x.com/Pranit/status/2037353721047491047 Want to sponsor a video? Learn more here: https://soydev.link/sponsor-me Check out my Twitch, Twitter, Discord more at https://t3.gg S/O @Ph4seon3 for the awesome edit 🙏

Summarized by x-ai/grok-4.1-fast via openrouter


© 2026 Edge