gpus · compute-pricing · finops · inference-economics

Hyperscalers Commit $690B in AI Capex for 2026

Amazon, Google, Microsoft, Meta, and Oracle have committed $660-690B in 2026 capex, roughly 75% targeting AI infrastructure. AWS broke two decades of declining cloud prices with a 15% GPU hike. Supply remains constrained through year-end as NVIDIA Rubin production ramps slowly and power grid bottlenecks add 24-72 months to new data center timelines.

Digiteria Labs · 10 min read

Key Signals

  • The five largest US cloud and AI infrastructure providers — Amazon, Alphabet, Microsoft, Meta, and Oracle — have committed $660-690 billion in combined capex for 2026, nearly doubling 2025's $381 billion.
  • Amazon leads at $200 billion, followed by Alphabet at $175-185B, Microsoft at ~$145B (annualized run rate), Meta at $115-135B, and Oracle at ~$50B. Roughly 75% of the total — on the order of $500 billion — targets AI infrastructure directly: GPUs, servers, and data centers.
  • AWS raised H200 GPU instance prices 15% in January 2026, breaking a two-decade trend of declining cloud compute costs. The p5e.48xlarge jumped from $34.61 to $39.80/hr.
  • NVIDIA confirmed over $500 billion in combined Blackwell and Rubin GPU orders stretching through late 2026, with production-constrained Rubin output likely capped at 200,000-300,000 units this year.
  • Power grid constraints are extending new data center construction timelines by 24 to 72 months, with PJM market capacity pricing spiking from $34/MW-day in 2023 to $329/MW-day in 2026.
  • Amazon's free cash flow is projected to go negative $17-28 billion in 2026. Meta's free cash flow drops nearly 90%. Oracle's free cash flow stays negative until 2030.

What Happened

I've been watching capex announcements for years, and nothing prepared me for these six weeks. Between late Q4 2025 and early Q1 2026, every major hyperscaler dropped capex plans that, taken together, represent the largest infrastructure buildout in the history of technology. Not the history of cloud computing. The history of technology, full stop.

Amazon set the pace at $200 billion. That number rivals the annual GDP of Portugal. Just let that sit for a second. Google, Microsoft, and Meta each followed with twelve-figure commitments that would have been laughed out of a boardroom two years ago. And Oracle — a company I still instinctively think of as a database vendor — is raising $50 billion in debt and equity specifically to build AI data center capacity for OpenAI, Meta, and xAI.

Here's why those headline numbers matter less than what's underneath them. These commitments are not speculative bets. They are responses to contracted demand. Oracle's $50 billion raise is backed by $523 billion in remaining performance obligations. Every hyperscaler reports that their GPU capacity is supply-constrained, not demand-constrained. The customers in line — enterprises training proprietary models, AI-native startups scaling inference, the hyperscalers themselves consuming GPUs for their own products — have already signed contracts. The money is chasing capacity that does not yet exist.

The thing most people missed happened on a Saturday in January. AWS quietly raised H200 GPU instance prices by 15%. That single move broke a 20-year pattern in cloud computing where prices only went down. I think this is a bigger deal than the capex numbers themselves, because it tells you the supply-demand imbalance is structural, not cyclical. OVHcloud is forecasting 5-10% price increases across all providers by mid-2026, driven by hardware cost inflation of 15-25% and energy costs that have ballooned as data center power demand overwhelms grid capacity.
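The hike arithmetic, plus OVHcloud's forecast layered on top, is easy to sanity-check. (Treating the mid-year increase as compounding on the post-hike rate is my assumption, not something the forecast specifies.)

```python
# January 2026 AWS H200 hike, with OVHcloud's forecast 5-10% layered on top.
pre_hike = 34.61                    # p5e.48xlarge $/hr before the January hike
post_hike = pre_hike * 1.15         # 15% increase
print(f"Post-hike: ${post_hike:.2f}/hr")                      # $39.80/hr

# OVHcloud forecasts a further 5-10% across providers by mid-2026.
mid_low, mid_high = post_hike * 1.05, post_hike * 1.10
print(f"Mid-2026 range: ${mid_low:.2f}-${mid_high:.2f}/hr")   # $41.79-$43.78/hr
```

Note that the constrained scenario below ($42.00/hr) falls inside that compounded range.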

Note: The real constraint is not silicon — it is electricity. I keep coming back to this because it changes the timeline for everything else. Power availability is extending data center construction timelines by 24 to 72 months in the US. The PJM interconnection (the grid operator covering 13 eastern states) saw capacity pricing spike from $34/MW-day in 2023 to $329/MW-day in 2026. Data centers that were supposed to come online in 2026 are now tracking 2028-2029. Every dollar of committed capex that cannot be deployed on schedule is capital earning zero return. That is a staggering amount of dead money.
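To make the PJM numbers concrete, here is the annualized capacity-cost delta for a hypothetical 500 MW campus. The campus size is my illustration, not a figure from any source.

```python
# Annualized capacity-cost increase from the PJM price spike ($/MW-day),
# for a hypothetical 500 MW data center campus (campus size is an assumption).
price_2023 = 34      # $/MW-day, PJM capacity price, 2023
price_2026 = 329     # $/MW-day, 2026
campus_mw = 500

annual_delta = (price_2026 - price_2023) * campus_mw * 365
print(f"Added capacity cost: ${annual_delta:,.0f}/year")   # $53,837,500/year
```

Roughly $54M a year in added capacity charges per large campus, before a single GPU draws power.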

The financial strain is already showing, and I think it's worth being blunt about this. Amazon is looking at negative free cash flow of $17-28 billion this year, depending on which analyst you believe. Meta's free cash flow drops nearly 90%. Oracle's stays negative until 2030. These are companies with combined annual revenue exceeding $1.5 trillion, and they are spending faster than they earn. The bet is that AI infrastructure demand is durable enough to justify the debt. If that bet is wrong — or if demand plateaus before the capacity comes online — the write-downs will be historic. (I don't think the bet is wrong, to be clear. But the magnitude of the downside if it is? That keeps me up at night.)

Builder Breakdown

Let Me Walk Through the Supply Chain

GPU Allocation Bottleneck. Here's something that doesn't get enough attention: NVIDIA allocates 60-70% of new GPU production to hyperscalers during the first year of any architecture generation. For Blackwell, that meant the B200 and GB200 went almost entirely to Amazon, Google, Microsoft, Meta, and Oracle through H1 2026. If you're not one of those five companies, you're competing for the remaining 30-40%, and lead times exceed 30 weeks. Rubin, NVIDIA's next-generation architecture with 336 billion transistors and 288GB of HBM4 per GPU, enters volume production in H2 2026 — but I'm not confident the output numbers are going to be encouraging. Production is capped at 200,000-300,000 units by TSMC N3 wafer capacity and HBM4 yield constraints from SK Hynix and Samsung.
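Run the allocation numbers and the squeeze on everyone else becomes obvious:

```python
# What's left for non-hyperscaler buyers if 60-70% of first-year Rubin
# output goes to the five hyperscalers, per the figures above.
units_low, units_high = 200_000, 300_000   # 2026 Rubin production range
share_low, share_high = 0.60, 0.70         # hyperscaler allocation range

rest_worst = round(units_low * (1 - share_high))   # tight supply, high allocation
rest_best = round(units_high * (1 - share_low))    # loose supply, low allocation
print(f"Non-hyperscaler Rubin supply: {rest_worst:,}-{rest_best:,} units")  # 60,000-120,000
```

Sixty to one hundred twenty thousand units for every enterprise, startup, and tier-2 cloud on the planet combined. That is the whole bottleneck in two lines.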

What This Actually Means for Your Serving Stack. If you're running inference on cloud GPUs today, I'd plan for three scenarios. The baseline is not great. The constrained scenario is where I think we're actually headed:

# Scenario planning for GPU cost models — 2026
scenarios:
  baseline:
    assumption: "15% price increase holds, no further hikes"
    h200_hourly: 39.80      # AWS p5e.48xlarge post-hike
    b200_hourly: 48.00      # Estimated GA pricing Q3 2026
    annual_impact: "+15% on current GPU spend"

  constrained:
    assumption: "Additional 5-10% mid-year increase per OVH forecast"
    h200_hourly: 42.00
    b200_hourly: 52.00
    annual_impact: "+20-25% on current GPU spend"

  relief:
    assumption: "Rubin volume + tier-2 competition stabilizes pricing"
    h200_hourly: 36.00      # Possible H2 2026 if supply catches up
    b200_hourly: 44.00
    annual_impact: "+5-10% net after migration savings"
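To turn a scenario file like this into budget numbers, a minimal sketch — the fleet size and 24/7 on-demand usage are assumptions of mine, not figures from the article:

```python
# Annual cost under each scenario above, for a hypothetical fleet of
# 10 on-demand 8x-H200 instances running 24/7 (fleet and usage assumed).
HOURS_PER_YEAR = 8760
FLEET = 10

scenarios = {               # h200_hourly values mirrored from the YAML above
    "baseline":    39.80,
    "constrained": 42.00,
    "relief":      36.00,
}

for name, hourly in scenarios.items():
    annual = hourly * HOURS_PER_YEAR * FLEET
    print(f"{name:<12} ${annual:,.0f}/year")
```

For this hypothetical fleet, the spread between constrained and relief is over $500K a year — which is why scenario planning belongs in the budget, not the appendix.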

Tier-2 Provider Arbitrage. The hyperscaler price increases are creating a real opening for GPU-focused cloud providers, and I think a lot of teams are sleeping on this. Look at the current pricing for H100 80GB:

  • AWS p5.48xlarge: ~$32.77/hr (8x H100)
  • CoreWeave: ~$6.16/hr per H100 ($49.28 for 8x equivalent)
  • Lambda: $2.99/hr per H100 ($23.92 for 8x equivalent)
  • RunPod: $1.99/hr per H100 ($15.92 for 8x equivalent)

Now, the catch — and there's always a catch. Tier-2 providers offer narrower availability zones, less mature networking, and thinner SLAs. For inference workloads that can tolerate occasional availability gaps, the roughly 25-50% savings at Lambda and RunPod are absolutely real; note that CoreWeave's 8-GPU list rate actually lands above the AWS figure quoted here, so benchmark rather than assume a discount. For training runs requiring multi-node InfiniBand fabrics with five-nines uptime, the hyperscaler premium still buys you something tangible. My take: if you haven't benchmarked your actual workload on at least one tier-2 provider in the last 90 days, you're leaving money on the table.
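Run the normalization yourself — this is pure arithmetic on the list prices above:

```python
# Normalize the per-GPU rates above to an 8-GPU instance equivalent and
# compute savings vs. the AWS p5.48xlarge figure quoted in this list.
def discount_vs_baseline(per_gpu_hourly: float, baseline_8x_hourly: float) -> float:
    """Fractional savings of an 8x-GPU equivalent vs. a baseline 8-GPU rate."""
    return 1 - (per_gpu_hourly * 8) / baseline_8x_hourly

AWS_8X = 32.77                                               # from the list above
tier2 = {"CoreWeave": 6.16, "Lambda": 2.99, "RunPod": 1.99}  # $/hr per H100

for name, rate in tier2.items():
    print(f"{name:<10} {discount_vs_baseline(rate, AWS_8X):+.0%}")
# CoreWeave comes out negative: its 8-GPU equivalent exceeds the AWS figure.
```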

Custom Silicon as an Escape Valve. This is where it gets interesting. Google's Ironwood (7th-gen TPU) delivers 4x the performance of its predecessor and is approaching public availability. Google's fully loaded inference cost runs 40-50% lower than NVIDIA GPUs on a third-party cloud. Amazon's Trainium2 is live in EC2. AMD's MI400 series arrives mid-2026 with 432GB HBM4 and 19.6 TB/s memory bandwidth. Broadcom has signed custom ASIC deals with Meta, OpenAI, and Anthropic. The NVIDIA monoculture is fracturing — not because NVIDIA is losing ground technically, but because the supply constraints are forcing buyers to diversify. I've been tracking this diversification trend for months, and it's accelerating faster than I expected.
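If that 40-50% inference cost reduction holds, the migration question reduces to payback time. A sketch — the monthly bill and one-time migration cost below are hypothetical inputs, not figures from the article:

```python
# Payback period for migrating inference to custom silicon, assuming the
# 40-50% cost reduction cited above. Bill and migration cost are hypothetical.
def payback_months(monthly_gpu_bill: float, migration_cost: float,
                   cost_reduction: float) -> float:
    monthly_savings = monthly_gpu_bill * cost_reduction
    return migration_cost / monthly_savings

# e.g. a $200K/month inference bill and $500K of one-time engineering effort:
for reduction in (0.40, 0.50):
    months = payback_months(200_000, 500_000, reduction)
    print(f"{reduction:.0%} reduction -> payback in {months:.1f} months")
```

At those inputs the migration pays for itself inside two quarters, which is why the monoculture is fracturing on economics alone.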

Economic Analysis

Winners and Losers in the $690B Buildout

I want to be direct about who benefits and who gets hurt here. The press coverage focuses on the headline numbers, but the second-order effects are where the real story is.

Winners:

  • NVIDIA dominates with $500B+ in combined Blackwell/Rubin backlog. Every dollar of hyperscaler capex flows through their order book first, and supply constraints give them extraordinary pricing power. It's good to be the bottleneck.
  • Tier-2 GPU cloud providers (CoreWeave, Lambda, RunPod) benefit directly from hyperscaler price increases. Every 15% AWS hike pushes cost-conscious teams to evaluate alternatives. Lambda at $23.92 for an 8x H100 equivalent vs. AWS at $32.77 — that's a conversation happening in every infrastructure review right now, and it wasn't six months ago.
  • Power and cooling companies. Data center construction costs hit $11.3M per MW in 2026, up from $7.7M in 2020. Companies that can deliver power infrastructure — from grid interconnection to on-site generation — are the actual bottleneck. And bottlenecks capture margin. Always.
  • Custom silicon vendors (Broadcom, Marvell, Google TPU, AWS Trainium). The supply crunch is accelerating diversification faster than any technical argument could. I think by late 2026, the majority of frontier model training will occur on custom ASICs rather than merchant GPUs. That's a sea change.

Losers:

  • Enterprises locked into hyperscaler committed-use contracts. Here's the part that makes me wince. Enterprise Discount Programs typically guarantee discounts off public pricing — so when public pricing rises 15%, your "discounted" rate increases in absolute dollars even if the percentage holds. And your renegotiation leverage is minimal when every provider is supply-constrained.
  • Mid-market AI companies without the scale to negotiate priority GPU allocations or the engineering capacity to migrate to custom silicon. You pay retail pricing in a seller's market. That's a rough place to be.
  • Hyperscaler shareholders in the near term. Negative free cash flow at Amazon, 90% FCF decline at Meta, Oracle borrowing through 2030 — the market is pricing in eventual returns, but the payback period keeps extending. I'm not sure the market is pricing this risk correctly.
  • The US power grid. 70% of US grid infrastructure is approaching end-of-life. AI data center demand is adding load faster than generation and transmission capacity can expand. Ratepayers in PJM states are already seeing the $9.3B in added capacity costs flow through to electricity bills. This is the externality nobody in tech wants to talk about.
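The EDP mechanics in the first bullet above are worth making concrete. A sketch with a hypothetical 30% discount (the discount level and the hourly rates are illustrative assumptions):

```python
# How a public-price increase flows through a committed-use discount.
# The 30% discount and the hourly rates are illustrative assumptions.
discount = 0.30
public_before = 34.61                          # pre-hike public rate, $/hr
public_after = round(public_before * 1.15, 2)  # 15% public hike -> 39.80

effective_before = round(public_before * (1 - discount), 2)   # 24.23
effective_after = round(public_after * (1 - discount), 2)     # 27.86
print(f"Effective rate: ${effective_before} -> ${effective_after}/hr "
      f"(+{effective_after / effective_before - 1:.0%}, discount % unchanged)")
```

Your percentage discount survived intact; your dollar cost still rose 15%. That is the wince.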

"When AWS raises GPU prices for the first time in two decades, it is not a pricing decision — it is a supply signal. The era of assuming cloud compute gets cheaper every year is over for AI workloads."

Note: I want to be honest about the biggest risk here. The $690B capex wave assumes AI demand is durable and growing. If enterprise AI adoption plateaus — or if a DeepSeek-style efficiency breakthrough reduces the compute required per unit of intelligence — hyperscalers will be sitting on hundreds of billions in infrastructure earning sub-target returns. The parallel to the 2000 telecom fiber overbuild is uncomfortable, and I keep thinking about it: supply was eventually absorbed, but not before $2 trillion in shareholder value evaporated. The difference is that today's hyperscalers have diversified revenue bases. The similarity is that capital deployed ahead of demand always carries the risk that demand never catches up on schedule. The data on whether current AI deployments are generating enough ROI to sustain this spending rate? It's thin. I'm watching enterprise renewal rates closely, and I'd encourage you to as well.

Recommendation

What I'd Do

If you're a CTO: Budget for 10-15% cloud GPU cost increases in 2026. Don't assume the AWS hike is an anomaly — it's the new baseline. If your GPU spend exceeds $100K/month, run a formal evaluation of tier-2 providers (Lambda, CoreWeave, RunPod) for inference workloads within the next 60 days. For training, evaluate Google TPU Ironwood and AWS Trainium2 as secondary compute targets. Lock in 6-12 month reserved capacity commitments now; spot availability will tighten through H2 as Rubin production ramps slowly. This is the kind of thing that's easy to push to next quarter. Don't.

If you're a founder: Don't overbuild GPU infrastructure in a rising-cost environment. If your inference bill is under $50K/month, stay on managed API providers (Together, Fireworks, Groq) where the provider absorbs the GPU cost risk. If you're raising capital, model your unit economics with a 20% compute cost increase scenario — investors who've seen the AWS price hike will ask about it. The companies that survive rising infrastructure costs are the ones whose revenue-per-GPU-hour exceeds the cost inflation curve. Everything else is a bet on prices coming back down, and I wouldn't make that bet right now.
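The 20% stress test is short enough to actually write down. The revenue and cost per served GPU-hour below are illustrative placeholders, not benchmarks:

```python
# Gross margin today vs. under a 20% compute cost increase.
# Revenue and cost per served GPU-hour are illustrative placeholders.
def gross_margin(revenue_per_gpu_hr: float, cost_per_gpu_hr: float) -> float:
    return 1 - cost_per_gpu_hr / revenue_per_gpu_hr

rev, cost = 12.00, 6.00
print(f"Today:        {gross_margin(rev, cost):.0%}")         # 50%
print(f"+20% compute: {gross_margin(rev, cost * 1.20):.0%}")  # 40%
```

Ten points of gross margin, gone, with no change in your product. If that breaks your model, the model was a bet on prices falling.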

If you're an infra lead: Build a multi-provider GPU strategy this quarter. I mean actually build it, not just talk about it in planning meetings. Run benchmarks on your top-3 workloads across at least two providers outside your primary hyperscaler. Set up cost monitoring that tracks $/token and $/GPU-hour weekly, not monthly — pricing is volatile enough now to warrant higher-frequency attention. Evaluate reserved instance commitments carefully: in a supply-constrained market, reservations lock in today's price, but they also lock you into a single provider when the landscape is shifting toward custom silicon. My rule of thumb: no more than 60% of your GPU spend on committed contracts. Keep 40% flexible. You'll thank yourself in Q4.
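A minimal sketch of that monitoring loop and the 60% committed-spend ceiling — every figure here is a placeholder you'd replace with your own billing data:

```python
# Weekly unit-cost tracking plus the 60% committed-spend ceiling suggested above.
# All figures are illustrative placeholders.
def weekly_unit_costs(spend_usd: float, gpu_hours: float, tokens: float) -> dict:
    """Return the two unit costs worth watching weekly."""
    return {"usd_per_gpu_hour": spend_usd / gpu_hours,
            "usd_per_million_tokens": spend_usd / (tokens / 1_000_000)}

def committed_share_ok(committed_usd: float, total_usd: float,
                       ceiling: float = 0.60) -> bool:
    """True if committed contracts stay under the ceiling share of spend."""
    return committed_usd / total_usd <= ceiling

week = weekly_unit_costs(spend_usd=84_000, gpu_hours=21_000, tokens=4.2e9)
print(week)                                # {'usd_per_gpu_hour': 4.0, 'usd_per_million_tokens': 20.0}
print(committed_share_ok(55_000, 84_000))  # False -> rebalance toward flexible capacity
```

Wire something like this to your billing export and alert on week-over-week deltas, not absolute levels; in this market, the slope is the signal.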

Sources

  1. "Tech AI spending approaches $700 billion in 2026, cash taking big hit," CNBC, cnbc.com/2026/02/06/google-microsoft-meta-amazon-ai-cash.html (accessed Feb 2026)
  2. "AI Capex 2026: The $690B Infrastructure Sprint," Futurum Group, futurumgroup.com/insights/ai-capex-2026-the-690b-infrastructure-sprint/ (accessed Feb 2026)
  3. "AWS Raises GPU Prices 15%: A Signal of Structural Supply Constraints," Introl, introl.com/blog/aws-gpu-price-increase-h200-january-2026 (accessed Feb 2026)
  4. "NVIDIA Rubin Enters Full Production: The 336 Billion Transistor GPU Reshaping AI Infrastructure," Introl, introl.com/blog/nvidia-rubin-full-production-ces-2026-ai-infrastructure (accessed Feb 2026)
  5. "Google, Meta, Microsoft, Amazon spending continues to rise," Axios, axios.com/2026/02/11/hyperscaler-spending-meta-microsoft-amazon-google (accessed Feb 2026)
  6. "Oracle Eyes $50B for AI Infrastructure in 2026," Data Center Knowledge, datacenterknowledge.com/infrastructure/oracle-eyes-50-billion-for-ai-infrastructure-in-2026 (accessed Feb 2026)
  7. "2026 Predictions: AI Sparks Data Center Power Revolution," Data Center Knowledge, datacenterknowledge.com/operations-and-management/2026-predictions-ai-sparks-data-center-power-revolution (accessed Feb 2026)

Need help implementing AI infrastructure for your organization? We help enterprises build, deploy, and optimize production AI systems. Learn about our AI consulting services.
