3 Bottlenecks to AI Compute: Logic, Memory, Power
Hyperscalers' $600B CapEx funds multi-year compute ramps toward 20GW/year; labs like OpenAI and Anthropic need 5GW+ to keep pace with inference growth. The key limits are ASML/TSMC logic capacity and an HBM memory crunch; US power, by contrast, scales easily.
Hyperscalers' CapEx Funds Multi-Year Compute Ramps
Dylan Patel breaks down the $600 billion combined CapEx from Amazon, Meta, Google, and Microsoft, which equates to roughly 50 gigawatts of rental value at current prices. Not all of it deploys in 2025; much covers prior-year spending and future builds. For instance, Google's $180 billion includes turbine deposits for 2028-2029, data center construction for 2027, and down payments on power purchase agreements. Across the supply chain, total spend hits a trillion dollars, enabling 20 gigawatts of incremental US capacity this year, split among hyperscalers and AI labs like OpenAI and Anthropic as the top customers.
Anthropic and OpenAI each currently run 2-2.5 gigawatts. To match exploding revenue, they need roughly 4 more gigawatts for inference alone, pushing totals above 5 gigawatts by year-end: Patel cites Anthropic adding $4-6 billion monthly and projecting $60 billion over 10 months at 65% gross margins, which he pegs at $40 billion of compute ($10 billion per gigawatt). Training fleets stay flat in these projections, but the revenue inflection demands aggressive scaling.
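To make the arithmetic explicit, here is a minimal sketch using the figures as quoted; the $40 billion compute figure and $10 billion/gigawatt rate are Patel's rough numbers, not exact values:

```python
# Back-of-envelope inference-capacity math, using Patel's figures as quoted.
projected_revenue_b = 60      # $60B projected over ~10 months
gross_margin = 0.65           # 65% gross margin
compute_spend_b = 40          # Patel's quoted compute requirement, in $B
dollars_per_gw_b = 10         # ~$10B of compute spend per gigawatt

# Naive COGS check: revenue * (1 - margin) = $21B, so the $40B figure
# presumably bakes in growth and headroom beyond steady-state serving cost.
naive_cogs_b = projected_revenue_b * (1 - gross_margin)   # -> ~21.0

# Gigawatts of inference capacity implied by the compute budget
inference_gw = compute_spend_b / dollars_per_gw_b         # -> 4.0 GW

current_gw = 2.5              # each lab runs roughly 2-2.5 GW today
print(f"year-end total: ~{current_gw + inference_gw:.1f} GW")  # -> ~6.5 GW
```

At 4 incremental gigawatts, a lab starting from 2-2.5 gigawatts clears the 5-gigawatt threshold Patel cites.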
"Anthropic needs to get to well above five gigawatts by the end of this year. It’s going to be really tough for them to get there, but it’s possible," says Patel.
OpenAI's Aggressive Deals Outpace Anthropic's Caution
OpenAI locked in compute via broad, risky deals with Microsoft, Google, Amazon, CoreWeave, Oracle, SoftBank Energy, and NScale, even when funding seemed uncertain—causing partner stock dips last year. Anthropic stayed conservative, prioritizing top-tier providers like Google and Amazon to avoid bankruptcy risk, as Dario Amodei noted. Now, with revenue surging, Anthropic pivots to neoclouds, shorter-term contracts, and revenue shares via Bedrock, Vertex, or Azure Foundry.
Buying compute at the last minute means roughly 50% markups: spot H100s run $2-2.40/hour, versus an amortized build cost of about $1.40/hour over five years, which yields 35%+ margins at $1.90-2.00/hour. Neoclouds hold more H100s from aggressive short-term buying, and rolling contracts go to the highest bidders. OpenAI ends the year with the edge in capacity; both labs reach roughly 5-6 gigawatts across direct and partner infrastructure.
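A minimal sketch of that margin and markup math, with the hourly figures from the article (note the computed margins land slightly below the quoted "35%+", which would imply a cost base nearer $1.30/hour):

```python
# Gross margin and spot markup on H100 rentals, using the figures quoted above.

def gross_margin(rate: float, cost: float) -> float:
    """Fraction of the hourly rental rate left after amortized cost."""
    return (rate - cost) / rate

build_cost = 1.40   # ~$1.40/hr amortized build cost over 5 years, as quoted

for rate in (1.90, 2.00, 2.40):
    print(f"${rate:.2f}/hr -> {gross_margin(rate, build_cost):.0%} margin")
# $1.90/hr -> 26% margin
# $2.00/hr -> 30% margin
# $2.40/hr -> 42% margin

spot_mid = 2.10     # midpoint of the $2-2.40 spot range
print(f"spot markup over build cost: {spot_mid / build_cost - 1:.0%}")  # -> 50%
```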
"OpenAI has got way more access to compute than Anthropic by the end of the year," Patel explains, highlighting how early aggression secures better pricing and reliability over spot markets or revenue shares.
H100 Value Rises Despite Newer GPUs
Michael Burry's 2-3 year GPU depreciation thesis assumes infinite supply and steady performance leaps (Nvidia roughly tripling flops every two years at 1.5-2x the price). On those assumptions, TCO models project H100 spot rates falling from $2/hour (2024, 35% margins) to $1 (2026, Blackwell) to $0.70 (2027, Rubin). But supply constraints flip the picture: H100 utility grows as newer models like GPT-5.4 adopt cheaper, sparser mixture-of-experts (MoE) architectures that run well on H100s, serving more and higher-quality tokens amid adoption lags and competitive pressure.
GPT-4's TAM was in the billions; GPT-5.4's exceeds $100 billion. Labs can't deploy the newest chips in unlimited quantity, so H100s are priced on the value derivable from them today, not on future alternatives. The result: H100s are worth more in 2025 than in 2023.
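A sketch contrasting the two views; the projected spot rates are the article's TCO figures, while the MoE and pricing numbers are hypothetical, chosen only to illustrate how sparser models raise what an H100-hour can earn:

```python
# Infinite-supply view: TCO models project H100 spot rates falling as newer
# generations ship (rates as quoted in the article).
projected_rate = {2024: 2.00, 2026: 1.00, 2027: 0.70}   # $/hr

# Supply-constrained view: the chip is priced on what it earns today.
# Sparser MoE models activate fewer parameters per token, so the same H100
# serves more tokens per hour. All numbers below are hypothetical.
dense_active_b = 500     # dense model: 500B params touched per token
moe_active_b = 50        # sparse MoE: ~50B active params per token
throughput_gain = dense_active_b / moe_active_b          # -> 10x tokens/hr

base_tokens_m_per_hr = 2.0   # hypothetical dense throughput, M tokens/hr
price_per_m_tokens = 0.20    # hypothetical $ per 1M tokens

revenue_per_hr = base_tokens_m_per_hr * throughput_gain * price_per_m_tokens
print(f"MoE-era H100 revenue: ~${revenue_per_hr:.2f}/hr")    # -> ~$4.00/hr
# Well above the projected $0.70-1.00 spot rates, as long as demand holds.
```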
"An H100 is worth more today than it was three years ago," Patel states, countering rapid obsolescence narratives. If AGI arrives, even older nodes like 7nm could revive for flop-equivalent human-brain compute (H100 at 1e15 FLOPS, though memory-limited vs. brain's capacity).
Logic Scaling Hits ASML/TSMC Walls by 2030
Nvidia secured early TSMC allocation, squeezing Google; by 2030, ASML's EUV tools become the top constraint as AI demand for logic capacity explodes. Older TSMC fabs (e.g., 7nm+) can't fully substitute, since they lack the density for the latest GPUs. China lags the West in scaling out due to equipment limits, though it is advancing.
TSMC may prioritize AI customers over Apple on the N2 node, and automating fabs with robots could mitigate Taiwan invasion risks.
Incoming Memory Crunch Dwarfs Other Limits
High-bandwidth memory (HBM) faces massive shortages as clusters demand terabytes per rack. Patel forecasts this as the "enormous incoming memory crunch," outpacing both logic and power constraints.
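For scale, a quick sketch of the per-rack HBM math; the rack configuration below is an assumption modeled on current NVL72-class systems, and exact figures vary by generation:

```python
# Rough per-rack HBM demand, assuming an NVL72-class rack (an assumption;
# exact GPU counts and HBM capacities vary by generation).
gpus_per_rack = 72
hbm_per_gpu_gb = 288      # ~288 GB HBM3e per GPU on recent parts
rack_hbm_tb = gpus_per_rack * hbm_per_gpu_gb / 1000
print(f"HBM per rack: ~{rack_hbm_tb:.1f} TB")   # -> ~20.7 TB
```

Multiply terabytes per rack across thousands of racks per site and the strain on HBM supply is evident.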
US Power Scales Without Crisis
Contrary to the hype, US power ramps readily: 20 gigawatts per year via gas peakers, nuclear restarts, and grid upgrades. Space-based GPUs remain science fiction this decade.
"Scaling power in the US will not be a problem," Patel asserts.
Amid these dynamics, Patel argues, hedge funds undervalue AGI bets.
Key Takeaways
- Model CapEx timelines over 3-5 years: 2025's $600B funds 2027-2029 builds like turbines and PPAs, not instant 50GW.
- Secure compute early via aggressive multi-provider deals; spot markets add 50%+ premiums.
- Bet on supply-constrained utility over infinite-supply depreciation—H100s gain value with better software.
- Prioritize ASML/TSMC allocation and HBM stockpiles; logic/memory bottleneck AI by 2030.
- US power isn't the limiter; focus on grid deals and peakers for 20GW/year ramps.
- Revenue inflection demands 2-3x inference compute yearly; flat training assumes efficiency gains.
- Diversify beyond hyperscalers: neoclouds like CoreWeave hold excess H100s for quick scaling.
- Watch TSMC priorities: AI demand trumps consumer customers like Apple on advanced nodes.
Notable quotes:
- "If you sign a deal at $2/hour for those five years, your gross margin is roughly 35%... Now you can crowd out all of these other suppliers." — Dylan Patel on H100 pricing power.
- "Dario... was very conservative... ‘I don’t want to go bankrupt.’ But in reality, he’s screwed the pooch compared to OpenAI." — Dylan Patel contrasting lab strategies.
- "These labs are in a competitive environment, so their margins can’t go to infinity. You sort of have this dynamic that is quite interesting." — Dylan Patel on GPU value dynamics.
- "ASML will be the #1 constraint for AI compute scaling by 2030." — From timestamps, underscoring lithography limits.