AI & Technology · June 9, 2026 · 11 min read · Last updated: June 9, 2026

AI Hardware Wars 2026: Nvidia H200 vs AMD MI300 vs Google TPU v5

The head-to-head data on pricing, memory, FLOPs, and market share — and an honest call on which AI chip platform is actually winning the $300B accelerator market in 2026.

Trace Cohen
Co-Founder & GP at Six Point Ventures · 3x founder (BrandYourself, Launch.it, SPOT) · 65+ investments · Based in Boca Raton, FL

Quick Answer

92% of AI training workloads run on Nvidia GPUs in 2026. The H200 sells at $30K-$40K per unit, B200 at $40K-$50K, while AMD MI300X is the only real alternative at $10K-$15K with 192GB HBM3. Google TPU v5p ($2-$4.20/chip-hour) only matters inside Google Cloud. Nvidia is winning — but the price gap is closing fast and inference workloads are where AMD and custom silicon catch up.

92% of AI training workloads ran on Nvidia GPUs in Q1 2026, but AMD MI300X is shipping at roughly one-third the price of an H200 with 2.4x the memory of an H100 (192GB vs 80GB), and Google TPU v5p delivers 459 TFLOPs per chip. The chip war is no longer just about FLOPs.

That's the short answer. The longer answer is that the gap between Nvidia and everyone else is closing on memory, on price-per-token of inference, and on hyperscaler willingness to deploy non-CUDA silicon. The H200 still wins for training a frontier foundation model from scratch. For everything else — inference, fine-tuning, vertical AI — the math has changed.

AI Hardware Wars: Nvidia H200 vs AMD MI300 vs Google TPU v5 in 2026

Nvidia H200 is still the default training accelerator in 2026, with 141GB HBM3e, 4.8TB/s memory bandwidth, and an effective monopoly on the CUDA software stack that the largest labs are built on. AMD MI300X is the only credible alternative on price-per-memory, shipping at $10K-$15K with 192GB HBM3. Google TPU v5p is competitive on training cost inside Google Cloud only — it has no on-prem footprint and no third-party sale. Across the full $300B-plus 2025-2026 AI capex cycle, Nvidia captured roughly 75-80% of the spend.

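| Spec | Nvidia H200 | Nvidia B200 | AMD MI300X | Google TPU v5p |
|---|---|---|---|---|
| Memory | 141GB HBM3e | 192GB HBM3e | 192GB HBM3 | 95GB HBM2e |
| Memory bandwidth | 4.8 TB/s | 8.0 TB/s | 5.3 TB/s | 2.8 TB/s |
| FP16/BF16 TFLOPs (vendor peak) | 1,979 | 4,500 | 1,307 | 459 |
| Street price (per unit) | $30K-$40K | $40K-$50K | $10K-$15K | N/A (cloud only) |
| Cloud rental ($/hr) | $3.00-$10.00 | $6.00-$12.00 | $1.99-$4.50 | $2.00-$4.20 |
| Software stack | CUDA (mature) | CUDA (mature) | ROCm 6.x | JAX/XLA |
| Lead time | 8-16 weeks | 12-24 weeks | 4-8 weeks | Cloud only |
| Power per chip | 700W | 1,000W | 750W | 350W |

Note on the TFLOPs row: the Nvidia figures are vendor peak numbers quoted with structured sparsity, while the MI300X and TPU v5p figures are dense peaks, so the row is not strictly like-for-like.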
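The "price-per-memory" argument can be checked with a few lines of arithmetic. The sketch below divides the midpoint of each street-price range by HBM capacity, using only the figures quoted in the table above. The midpoint is an assumption standing in for a typical deal, and the names `CHIPS` and `usd_per_gb` are illustrative.

```python
# Street-price ranges (USD) and HBM capacity (GB) as quoted in the spec table.
# TPU v5p is omitted because it has no unit price (cloud only).
CHIPS = {
    "H200": {"price_usd": (30_000, 40_000), "memory_gb": 141},
    "B200": {"price_usd": (40_000, 50_000), "memory_gb": 192},
    "MI300X": {"price_usd": (10_000, 15_000), "memory_gb": 192},
}


def usd_per_gb(chip: str) -> float:
    """Midpoint street price divided by HBM capacity."""
    low, high = CHIPS[chip]["price_usd"]
    midpoint = (low + high) / 2  # assumption: midpoint approximates a typical deal
    return midpoint / CHIPS[chip]["memory_gb"]


if __name__ == "__main__":
    for name in CHIPS:
        print(f"{name}: ${usd_per_gb(name):,.0f} per GB of HBM")
```

At these midpoints the MI300X comes out at roughly $65 per GB against roughly $248 per GB for the H200, which is the spread behind the price-per-memory claim. Real quotes vary with volume and timing.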

The AI Chip Market Share Picture in 2026

Nvidia's data center revenue hit $115B in fiscal 2025 — up from $47B in fiscal 2024 — and is tracking toward $180B-$200B in fiscal 2026. That single line item is roughly 4x the entire prior AMD data center business and 12x the AMD Instinct line. But the share trajectory tells a different story than the absolute dollars.

| Vendor | Share | Context |
|---|---|---|
| Nvidia | 92% | Down from 95%+ in 2023 |
| AMD | 4-5% | Meta, MSFT, Oracle wins |
| Google TPU | 2% | First-party only |
| AWS / Microsoft | 1-2% | Trainium2, Maia |

AMD's MI300X has secured material commitments: Microsoft is running Bing and Copilot inference on tens of thousands of MI300X units, Meta is running 40%+ of Llama 4 inference on AMD, and Oracle Cloud added 30,000+ MI300X chips in 2025. Google's TPU v5p quietly powers most internal Gemini training while Anthropic — Google's biggest external customer — runs Claude inference on a mixed TPU and Nvidia fleet under a multi-year cloud commit worth over $4B.

Total Cost of Ownership: H200 vs MI300X vs TPU v5p

Sticker price hides the real spread. A 1,024-GPU training cluster running for 12 months has dramatically different economics depending on which platform you pick — once you add networking, software engineering, and power. Here is what a frontier-lab-scale build actually costs in 2026.

| Cost line | H200 cluster | MI300X cluster | TPU v5p pod |
|---|---|---|---|
| 1,024 chips (capex) | $36M ($35K avg) | $13M ($12.5K avg) | Cloud only |
| Networking (InfiniBand/RoCE) | $4M | $4M | Included |
| Power per year (1,024 × 700-1000W) | $2.1M | $2.3M | Included |
| Datacenter buildout (per MW) | $10M-$12M | $10M-$12M | Included |
| Software engineering (1 year) | $3M-$5M | $8M-$15M | $5M-$8M |
| Reserved cloud equivalent (12mo) | $53M-$72M | $24M-$36M | $30M-$50M |

MI300X looks 60% cheaper at the chip line but loses 30-40% of that gap to ROCm porting and slower kernel optimization. For a startup with under $10M of compute spend, the breakeven is rarely there. For Meta — which spent $39B on Llama 4 training and inference — saving even 25% of capex is worth absorbing $200M of software engineering cost. That is the actual reason MI300X is winning at hyperscale and losing at startup scale.
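To make the "loses 30-40% of that gap" claim concrete, here is a rough sketch that sums the cost lines from the table above and measures how much of the chip-price gap the extra software engineering consumes. It assumes the midpoint of each software range and leaves out datacenter buildout, which the table quotes per MW at the same rate for both platforms. The dictionary layout and function names are illustrative, not the author's model.

```python
# Cost lines in $M, copied from the TCO table in this section.
# Software engineering is a (low, high) range; the other lines are point estimates.
H200 = {"chips": 36.0, "networking": 4.0, "power": 2.1, "software": (3.0, 5.0)}
MI300X = {"chips": 13.0, "networking": 4.0, "power": 2.3, "software": (8.0, 15.0)}


def software_midpoint(cluster: dict) -> float:
    """Midpoint of the one-year software engineering range ($M)."""
    low, high = cluster["software"]
    return (low + high) / 2


def first_year_cost(cluster: dict) -> float:
    """Chips + networking + one year of power + one year of software ($M)."""
    return (cluster["chips"] + cluster["networking"] + cluster["power"]
            + software_midpoint(cluster))


def share_of_chip_gap_lost_to_software(cheap: dict, incumbent: dict) -> float:
    """Extra software cost on the cheaper platform as a fraction of the chip-price gap."""
    chip_gap = incumbent["chips"] - cheap["chips"]
    extra_software = software_midpoint(cheap) - software_midpoint(incumbent)
    return extra_software / chip_gap


if __name__ == "__main__":
    print(f"H200 first year:   ${first_year_cost(H200):.1f}M")
    print(f"MI300X first year: ${first_year_cost(MI300X):.1f}M")
    lost = share_of_chip_gap_lost_to_software(MI300X, H200)
    print(f"Chip gap lost to software: {lost:.0%}")
```

At the midpoints, the extra software spend is $7.5M against a $23M chip gap, or about 33%, which sits inside the 30-40% range quoted above. Using the ends of the software ranges instead moves that share considerably, so treat the result as a sensitivity check rather than a forecast.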

Training vs Inference: Why the Hardware Question Splits in 2026

The 2024 conversation was almost entirely about training FLOPs. In 2026 the inference market is roughly 60% of total AI compute spend — OpenAI alone is reportedly burning $7B-$9B per year on inference for ChatGPT, and Anthropic Claude is spending $3B-$4B. That shift completely changes the chip math.

Where Nvidia still dominates

  • Frontier model pretraining (1T+ params)
  • Multi-node training with NVLink/InfiniBand
  • Reinforcement learning fine-tuning
  • Any workload that needs CUDA + TensorRT
  • Research labs at OpenAI, Anthropic, xAI

Where AMD and TPU are gaining

  • High-volume inference (MI300X 192GB shines)
  • Single-node fine-tuning of open models
  • Cost-sensitive batch inference
  • First-party clouds with controlled stack
  • Anything Meta, Microsoft, or Google runs

AI Hardware Pricing in 2026: What Each GPU Actually Costs Today

Cloud rental is now where the chip war is actually fought. Here are 2026 pricing benchmarks: on-demand rates on AWS, Azure, and GCP, and reserved rates on the neoclouds (CoreWeave, Lambda, Crusoe, Voltage Park), pulled from public listing prices and enterprise quotes.

| Provider | H100 ($/hr) | H200 ($/hr) | B200 ($/hr) | MI300X ($/hr) |
|---|---|---|---|---|
| AWS (on-demand) | $5.20 | $7.95 | $11.50 | $4.10 |
| Azure (on-demand) | $4.90 | $7.40 | $10.80 | N/A |
| GCP (on-demand) | $5.50 | $8.20 | $12.00 | N/A |
| CoreWeave (reserved) | $2.49 | $3.50 | $5.95 | $2.10 |
| Lambda (reserved) | $2.49 | $3.29 | $5.49 | $1.99 |
| Crusoe (reserved) | $2.45 | $3.40 | $5.85 | $1.95 |

The neoclouds are pricing reserved H100 capacity at $2.49/hr against AWS on-demand at $5.20/hr, a 52% discount. The gap is bigger on H200. CoreWeave, Lambda, and Crusoe collectively burned through $30B+ of GPU capex in 2024-2025 and need utilization, which is why every spot price war runs through them first. For startups, the play is almost always: lock 70% reserved on a neocloud, top up with hyperscaler on-demand for spikes.
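The discount figures and the 70/30 split can be reproduced from the pricing table. This is a minimal sketch that assumes the 70% refers to the share of GPU-hours (not dollars) bought on reserved terms. The helper names `discount` and `blended_rate` are illustrative.

```python
def discount(reference_rate: float, offer_rate: float) -> float:
    """Fractional discount of offer_rate relative to reference_rate."""
    return (reference_rate - offer_rate) / reference_rate


def blended_rate(reserved_rate: float, on_demand_rate: float,
                 reserved_share: float = 0.70) -> float:
    """Average $/GPU-hour when reserved_share of hours are reserved
    and the remainder is bought on demand (assumption: share is by hours)."""
    return reserved_share * reserved_rate + (1 - reserved_share) * on_demand_rate


if __name__ == "__main__":
    # Rates from the table: CoreWeave reserved vs AWS on-demand.
    print(f"H100 discount: {discount(5.20, 2.49):.0%}")
    print(f"H200 discount: {discount(7.95, 3.50):.0%}")
    print(f"H100 blended 70/30: ${blended_rate(2.49, 5.20):.2f}/hr")
```

With the table's numbers, the H100 discount is about 52% and the H200 discount about 56%, and a 70/30 mix of neocloud reserved and AWS on-demand H100 hours averages roughly $3.30 per GPU-hour.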

Who's Winning the AI Hardware War: The Honest Call

Nvidia is winning revenue and margin. Gross margin sat at 75%+ through fiscal 2025, and the H200/B200/GB200 product cadence is keeping the next 24 months booked. But three trend lines bend against Nvidia between now and 2028:

1

Hyperscaler internal silicon

Google TPU v5p, AWS Trainium2, Microsoft Maia 100, and Meta MTIA collectively absorbed roughly 8% of internal AI workloads in 2025 — projected to hit 18-22% by 2027. That is share Nvidia structurally cannot recover, because the buyer also builds.

2

AMD ROCm catching up

ROCm 6.x reached performance parity with CUDA for transformer inference in late 2025, and the broader software gap, once about two years, is closing to roughly 9-12 months. Once ROCm reaches parity across training and tooling as well, whether buyers capture the $20K-$25K per-chip price gap comes down purely to switching cost.

3

Inference moving to commodity

Inference-only chips from Groq, Cerebras, SambaNova, and d-Matrix are pricing tokens 5-10x cheaper than Nvidia for specific model shapes. None will replace Nvidia for training, but each chips away at the 60% of compute that is inference.

Nvidia wins the 2026 AI hardware war on revenue, margin, and training workloads.

But the 92% share is the peak. AMD, hyperscaler silicon, and inference specialists are about to pull 15-20 points away by 2028 — and that is where the actual investable trade lives.

Track AI capex and chip vendor revenue on the AI Spending Dashboard and AI Chip Wars at Value Add VC. Originally published in the Trace Cohen newsletter.

Frequently Asked Questions

What is the difference between Nvidia H200 and B200?

The H200 ships with 141GB HBM3e and 4.8TB/s memory bandwidth on the Hopper architecture, while the B200 (Blackwell) carries 192GB HBM3e at 8TB/s with roughly 2.5x the training throughput. Street price for H200 sits at $30,000-$40,000 versus $40,000-$50,000 for B200. Hyperscalers buying GB200 NVL72 racks (72 GPUs + 36 Grace CPUs) pay $3M+ per rack with 12-24 month lead times.

Is AMD MI300X cheaper than Nvidia H100 in 2026?

Yes. AMD MI300X street price is roughly $10,000-$15,000 per GPU compared to $25,000-$30,000 for the H100 on the secondary market. MI300X also ships with 192GB HBM3 (vs 80GB on H100), which means fewer GPUs per training run for large models. The catch is software: CUDA's lead over ROCm, historically about two years, has narrowed but not closed, so the engineering cost of porting often eats most of the per-unit savings.
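The "fewer GPUs" point follows from simple capacity arithmetic. The sketch below gives a lower bound on how many chips are needed just to hold a model's weights. It ignores activations, optimizer state, and KV cache, all of which push the real number higher, and the 405B-parameter BF16 model is a hypothetical example, not a figure from this article.

```python
import math


def min_chips_for_weights(params_billions: float, bytes_per_param: int,
                          hbm_gb: float) -> int:
    """Lower bound on chips needed to hold model weights in HBM.

    One billion parameters at N bytes each occupy N GB (decimal),
    so weights_gb = params_billions * bytes_per_param.
    """
    weights_gb = params_billions * bytes_per_param
    return math.ceil(weights_gb / hbm_gb)


if __name__ == "__main__":
    # Hypothetical 405B-parameter model stored in BF16 (2 bytes per parameter).
    for name, hbm in [("H100", 80), ("H200", 141), ("MI300X", 192)]:
        print(f"{name}: at least {min_chips_for_weights(405, 2, hbm)} chips")
```

For that hypothetical model the weights alone need at least 11 H100s, 6 H200s, or 5 MI300X chips.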

How does Google TPU v5p compare to Nvidia H200?

Google TPU v5p delivers 459 TFLOPs of bf16 compute and 95GB of HBM per chip, scaling to pods of 8,960 chips with 4,800 Gbps interconnect. Per-chip-hour pricing on Google Cloud is roughly $2.00-$4.20 depending on commitment, versus $2.49-$8 for an H100 and $3-$10 for an H200 across neoclouds and hyperscalers. TPU v5p only matters if you are training inside Google Cloud, because it has zero on-prem footprint.
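For a sense of scale, the per-chip figures above can be multiplied out to a full pod. This is a back-of-the-envelope sketch of theoretical peak only: it assumes perfect scaling across all 8,960 chips, which real workloads do not achieve, and the function names are illustrative.

```python
def pod_peak_exaflops(chips: int, tflops_per_chip: float) -> float:
    """Aggregate peak compute in exaFLOPs (1 exaFLOP = 1,000,000 TFLOPs)."""
    return chips * tflops_per_chip / 1_000_000


def pod_hbm_terabytes(chips: int, hbm_gb_per_chip: float) -> float:
    """Aggregate HBM in terabytes (decimal, 1 TB = 1,000 GB)."""
    return chips * hbm_gb_per_chip / 1_000


if __name__ == "__main__":
    print(f"Peak bf16 compute: {pod_peak_exaflops(8_960, 459):.2f} exaFLOPs")
    print(f"Total HBM: {pod_hbm_terabytes(8_960, 95):.0f} TB")
```

A full pod works out to roughly 4.1 exaFLOPs of peak bf16 compute and about 851 TB of HBM.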

What is the AI chip market share in 2026?

Nvidia held roughly 92% of the AI training accelerator market by revenue as of Q1 2026, down from a peak above 95% in 2023. AMD has captured 4-5% behind Meta, Microsoft, and Oracle deployments of MI300X and MI325X. Google TPU, AWS Trainium, and Microsoft Maia together account for the remaining 3-4%, primarily for internal first-party workloads rather than external sale.

Which AI chip should startups buy in 2026?

Most startups should not buy chips at all. They should rent H100 or H200 capacity from CoreWeave, Lambda, or Crusoe at roughly $2-$4 per GPU-hour reserved, rather than paying $5-$8 per GPU-hour on demand at a hyperscaler. If you have committed multi-year inference spend above $5M per year, MI300X economics start to make sense. Sub-$10M ARR companies that buy hardware almost always regret it within 18 months as the next chip generation drops list prices 30-50%.
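The rent-versus-buy call can be sanity-checked by asking how many months of rental it takes to equal the purchase price. The sketch below uses the article's $35K average H200 price and a $3.50/hr reserved rate. It deliberately ignores power, networking, hosting, and resale value, so it understates the true cost of owning, and the 60% utilization case is an assumed example.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12


def breakeven_months(purchase_price: float, rental_rate_per_hour: float,
                     utilization: float = 1.0) -> float:
    """Months of rental, at the given utilization, that cost as much as buying the chip."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    rented_hours = purchase_price / rental_rate_per_hour
    return rented_hours / (HOURS_PER_MONTH * utilization)


if __name__ == "__main__":
    full = breakeven_months(35_000, 3.50)
    partial = breakeven_months(35_000, 3.50, utilization=0.60)
    print(f"H200 at 100% utilization: {full:.1f} months")
    print(f"H200 at 60% utilization:  {partial:.1f} months")
```

Even at full utilization the chip price alone takes more than a year of rental to match, and at 60% utilization the breakeven stretches past 18 months, before counting any of the ownership costs left out here.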
