What is the price of an Nvidia H200 GPU in 2026?

The Nvidia H200 GPU sells for $30,000–$40,000 per unit on the street market in 2026. Cloud rental ranges from $3.00–$10.00/hr on major platforms, with neoclouds like CoreWeave and Lambda pricing reserved H200 capacity at $3.29–$3.50/hr — roughly 55% cheaper than AWS on-demand at $7.95/hr. The newer B200 sells for $40,000–$50,000 per unit and rents for $5.49–$12.00/hr depending on provider.

Nvidia vs AMD vs Google TPU: which AI chip is best?

Nvidia dominates frontier model training with 92% market share in 2026, driven by the mature CUDA software stack and H200/B200's 1,979–4,500 TFLOPs of FP16 compute. AMD MI300X is the best alternative for inference and cost-sensitive workloads — 192GB HBM3 at $10,000–$15,000/unit (60% cheaper than H200) with ROCm 6.x now near CUDA parity for transformer inference. Google TPU v5p (459 TFLOPs, $2.00–$4.20/chip-hour) only makes sense inside Google Cloud. The rule of thumb: Nvidia for training, AMD for high-volume inference at scale, TPU only if you're already on GCP.

What is Nvidia's AI chip market share in 2026?

Nvidia holds approximately 92% of AI training compute market share in 2026, down from 95%+ in 2023. AMD MI300X has captured 4–5% (primarily at Meta, Microsoft, and Oracle), Google TPU accounts for ~2% (first-party only), and AWS Trainium2/Microsoft Maia cover the remaining 1–2%. Nvidia's data center revenue hit $115B in fiscal 2025 and is tracking toward $180–200B in fiscal 2026 — roughly 4x AMD's entire data center business. The 92% share is likely the peak as hyperscaler internal silicon and AMD ROCm improvements erode the moat by 2028.

How much does it cost to run a 1,024 GPU AI training cluster?

A 1,024-GPU H200 cluster costs roughly $36M in chip capex ($35K avg per chip), plus $4M for InfiniBand networking, $2.1M/year in power, $10–12M for datacenter buildout, and $3–5M in software engineering — totaling $55–60M all-in for the first year, or $53–72M equivalent in reserved cloud costs. The same cluster with AMD MI300X chips costs ~$13M in capex (64% cheaper) but adds $8–15M in ROCm porting and software engineering costs, closing much of the gap. The breakeven for switching from Nvidia to AMD typically only makes sense above $10M of annual compute spend.

Will AMD overtake Nvidia in AI chips?

AMD will not overtake Nvidia in AI training chips by 2028, but will likely capture 15–20% of total AI compute spend (up from 4–5% today) as inference workloads grow. Three structural shifts drive AMD's gain: ROCm 6.x hitting CUDA parity for transformer inference in late 2025, hyperscaler internal silicon (Google TPU, AWS Trainium, Microsoft Maia) absorbing 18–22% of internal workloads by 2027, and inference-only chips from Groq and Cerebras pricing tokens 5–10x cheaper than Nvidia for specific model shapes. Nvidia remains the only viable option for frontier pretraining at the 1T+ parameter scale.

AI Chip Wars: $300B Market Showdown 2026

92% of AI training workloads ran on Nvidia GPUs in Q1 2026 — but AMD AMD Instinct is shipping at one-third the price with 2.4x the memory, and Google TPU v5p just hit 459 TFLOPs per chip. The chip war is no longer just about FLOPs.

That's the short answer. The longer answer is that the gap between Nvidia and everyone else is closing on memory, on price-per-token of inference, and on hyperscaler willingness to deploy non-CUDA silicon. The Nvidia's latest GPUs still wins for training a frontier foundation model from scratch. For everything else — inference, fine-tuning, vertical AI — the math has changed.

AI Hardware Wars: Nvidia Nvidia's latest GPUs vs AMD AMD Instinct vs Google TPU v5 in 2026

Nvidia Nvidia's latest GPUs is still the default training accelerator in 2026, with 141GB HBM3e, 4.8TB/s memory bandwidth, and an effective monopoly on the CUDA software stack that the largest labs are built on. AMD AMD Instinct is the only credible alternative on price-per-memory, shipping at $10K-$15K with 192GB HBM3. Google TPU v5p is competitive on training cost inside Google Cloud only — it has no on-prem footprint and no third-party sale. Across the full $300B-plus 2025-2026 AI capex cycle, Nvidia captured roughly 75-80% of the spend.

Spec	Nvidia Nvidia's latest GPUs	Nvidia B200	AMD AMD Instinct	Google TPU v5p
Memory	141GB HBM3e	192GB HBM3e	192GB HBM3	95GB HBM2e
Memory bandwidth	4.8 TB/s	8.0 TB/s	5.3 TB/s	2.8 TB/s
FP16/BF16 TFLOPs	1,979	4,500	1,307	459
Street price (per unit)	$30K-$40K	$40K-$50K	$10K-$15K	N/A (cloud only)
Cloud rental ($/hr)	$3.00-$10.00	$6.00-$12.00	$1.99-$4.50	$2.00-$4.20
Software stack	CUDA (mature)	CUDA (mature)	ROCm 6.x	JAX/XLA
Lead time	8-16 weeks	12-24 weeks	4-8 weeks	Cloud only
Power per chip	700W	1,000W	750W	350W

The AI Chip Market Share Picture in 2026

Nvidia's data center revenue hit $115B in fiscal 2025 — up from $47B in fiscal 2024 — and is tracking toward $180B-$200B in fiscal 2026. That single line item is roughly 4x the entire prior AMD data center business and 12x the AMD Instinct line. But the share trajectory tells a different story than the absolute dollars. Track how AI company valuations are being priced against this revenue growth on our AI valuations tracker.

92%

Nvidia

Down from 95%+ in 2023

4-5%

AMD

Meta, MSFT, Oracle wins

Google TPU

First-party only

1-2%

AWS / Microsoft

Trainium2, Maia

AMD's AMD Instinct has secured material commitments: Microsoft is running Bing and Copilot inference on tens of thousands of AMD Instinct units, Meta is running 40%+ of Llama inference on AMD, and Oracle Cloud added 30,000+ AMD Instinct chips in 2025. Google's TPU v5p quietly powers most internal Gemini training while Anthropic — Google's biggest external customer — runs Claude inference on a mixed TPU and Nvidia fleet under a multi-year cloud commit worth over $4B.

Total Cost of Ownership: Nvidia's latest GPUs vs AMD Instinct vs TPU v5p

Sticker price hides the real spread. A 1,024-GPU training cluster running for 12 months has dramatically different economics depending on which platform you pick — once you add networking, software engineering, and power. Here is what a frontier-lab-scale build actually costs in 2026.

Cost line	Nvidia's latest GPUs cluster	AMD Instinct cluster	TPU v5p pod
1,024 chips (capex)	$36M ($35K avg)	$13M ($12.5K avg)	Cloud only
Networking (InfiniBand/RoCE)	$4M	$4M	Included
Power per year (1,024 × 700-1000W)	$2.1M	$2.3M	Included
Datacenter buildout (per MW)	$10M-$12M	$10M-$12M	Included
Software engineering (1 year)	$3M-$5M	$8M-$15M	$5M-$8M
Reserved cloud equivalent (12mo)	$53M-$72M	$24M-$36M	$30M-$50M

AMD Instinct looks 60% cheaper at the chip line but loses 30-40% of that gap to ROCm porting and slower kernel optimization. For a startup with under $10M of compute spend, the breakeven is rarely there. For Meta — which spent $39B on Llama training and inference — saving even 25% of capex is worth absorbing $200M of software engineering cost. That is the actual reason AMD Instinct is winning at hyperscale and losing at startup scale.

Training vs Inference: Why the Hardware Question Splits in 2026

The 2024 conversation was almost entirely about training FLOPs. In 2026 the inference market is roughly 60% of total AI compute spend — OpenAI alone is reportedly burning $7B-$9B per year on inference for ChatGPT, and Anthropic Claude is spending $3B-$4B. That shift completely changes the chip math.

Where Nvidia still dominates

• Frontier model pretraining (1T+ params)
• Multi-node training with NVLink/InfiniBand
• Reinforcement learning fine-tuning
• Any workload that needs CUDA + TensorRT
• Research labs at OpenAI, Anthropic, xAI

Where AMD and TPU are gaining

• High-volume inference (AMD Instinct 192GB shines)
• Single-node fine-tuning of open models
• Cost-sensitive batch inference
• First-party clouds with controlled stack
• Anything Meta, Microsoft, or Google runs

AI Hardware Pricing in 2026: What Each GPU Actually Costs Today

Cloud rental is now where the chip war is actually fought. Here are 2026 reserved-pricing benchmarks on AWS, Azure, GCP, and the neoclouds (CoreWeave, Lambda, Crusoe, Voltage Park), pulled from public listing prices and enterprise quotes.

Provider	H100 ($/hr)	Nvidia's latest GPUs ($/hr)	B200 ($/hr)	AMD Instinct ($/hr)
AWS (on-demand)	$5.20	$7.95	$11.50	$4.10
Azure (on-demand)	$4.90	$7.40	$10.80	N/A
GCP (on-demand)	$5.50	$8.20	$12.00	N/A
CoreWeave (reserved)	$2.49	$3.50	$5.95	$2.10
Lambda (reserved)	$2.49	$3.29	$5.49	$1.99
Crusoe (reserved)	$2.45	$3.40	$5.85	$1.95

The neoclouds are pricing H100 at $2.49/hr against AWS at $5.20/hr — a 52% discount. The gap is bigger on Nvidia's latest GPUs. CoreWeave, Lambda, and Crusoe collectively burned through $30B+ of GPU capex in 2024-2025 and need utilization, which is why every spot price war runs through them first. For startups, the play is almost always: lock 70% reserved on a neocloud, top up with hyperscaler on-demand for spikes.

Who's Winning the AI Hardware War: The Honest Call

Nvidia is winning revenue and margin. Gross margin sat at 75%+ through fiscal 2025, and the Nvidia's latest GPUs/B200/GB200 product cadence is keeping the next 24 months booked. But three trend lines bend against Nvidia between now and 2028:

Hyperscaler internal silicon

Google TPU v5p, AWS Trainium2, Microsoft Maia 100, and Meta MTIA collectively absorbed roughly 8% of internal AI workloads in 2025 — projected to hit 18-22% by 2027. That is share Nvidia structurally cannot recover, because the buyer also builds.

AMD ROCm catching up

ROCm 6.x finally hit performance parity with CUDA for transformer inference in late 2025. The 2-year software gap is closing to roughly 9-12 months. Once it hits parity, the $20K-$25K per-chip price gap becomes purely a function of buyer switching cost.

Inference moving to commodity

Inference-only chips from Groq, Cerebras, SambaNova, and d-Matrix are pricing tokens 5-10x cheaper than Nvidia for specific model shapes. None will replace Nvidia for training, but each chips away at the 60% of compute that is inference.

Nvidia wins the 2026 AI hardware war on revenue, margin, and training workloads.

But the 92% share is the peak. AMD, hyperscaler silicon, and inference specialists are about to pull 15-20 points away by 2028 — and that is where the actual investable trade lives.

Track AI capex and chip vendor revenue on the AI Spending Dashboard and AI Chip Wars at Value Add VC. Originally published in the Trace Cohen newsletter.

Get VC data most people never see

— 100% free

Weekly benchmarks, valuations, and fund data. Join 5,000+ investors. No spam.

AI Hardware Wars: Nvidia Nvidia's latest GPUs vs AMD AMD Instinct vs Google TPU v5 in 2026

Spec	Nvidia Nvidia's latest GPUs	Nvidia B200	AMD AMD Instinct	Google TPU v5p
Memory	141GB HBM3e	192GB HBM3e	192GB HBM3	95GB HBM2e
Memory bandwidth	4.8 TB/s	8.0 TB/s	5.3 TB/s	2.8 TB/s
FP16/BF16 TFLOPs	1,979	4,500	1,307	459
Street price (per unit)	$30K-$40K	$40K-$50K	$10K-$15K	N/A (cloud only)
Cloud rental ($/hr)	$3.00-$10.00	$6.00-$12.00	$1.99-$4.50	$2.00-$4.20
Software stack	CUDA (mature)	CUDA (mature)	ROCm 6.x	JAX/XLA
Lead time	8-16 weeks	12-24 weeks	4-8 weeks	Cloud only
Power per chip	700W	1,000W	750W	350W

The AI Chip Market Share Picture in 2026

92%

Nvidia

Down from 95%+ in 2023

4-5%

AMD

Meta, MSFT, Oracle wins

Google TPU

First-party only

1-2%

AWS / Microsoft

Trainium2, Maia

Total Cost of Ownership: Nvidia's latest GPUs vs AMD Instinct vs TPU v5p

Cost line	Nvidia's latest GPUs cluster	AMD Instinct cluster	TPU v5p pod
1,024 chips (capex)	$36M ($35K avg)	$13M ($12.5K avg)	Cloud only
Networking (InfiniBand/RoCE)	$4M	$4M	Included
Power per year (1,024 × 700-1000W)	$2.1M	$2.3M	Included
Datacenter buildout (per MW)	$10M-$12M	$10M-$12M	Included
Software engineering (1 year)	$3M-$5M	$8M-$15M	$5M-$8M
Reserved cloud equivalent (12mo)	$53M-$72M	$24M-$36M	$30M-$50M

Training vs Inference: Why the Hardware Question Splits in 2026

Where Nvidia still dominates

• Frontier model pretraining (1T+ params)
• Multi-node training with NVLink/InfiniBand
• Reinforcement learning fine-tuning
• Any workload that needs CUDA + TensorRT
• Research labs at OpenAI, Anthropic, xAI

Where AMD and TPU are gaining

• High-volume inference (AMD Instinct 192GB shines)
• Single-node fine-tuning of open models
• Cost-sensitive batch inference
• First-party clouds with controlled stack
• Anything Meta, Microsoft, or Google runs

AI Hardware Pricing in 2026: What Each GPU Actually Costs Today

Provider	H100 ($/hr)	Nvidia's latest GPUs ($/hr)	B200 ($/hr)	AMD Instinct ($/hr)
AWS (on-demand)	$5.20	$7.95	$11.50	$4.10
Azure (on-demand)	$4.90	$7.40	$10.80	N/A
GCP (on-demand)	$5.50	$8.20	$12.00	N/A
CoreWeave (reserved)	$2.49	$3.50	$5.95	$2.10
Lambda (reserved)	$2.49	$3.29	$5.49	$1.99
Crusoe (reserved)	$2.45	$3.40	$5.85	$1.95

Who's Winning the AI Hardware War: The Honest Call

Hyperscaler internal silicon

AMD ROCm catching up

Inference moving to commodity

Nvidia wins the 2026 AI hardware war on revenue, margin, and training workloads.

But the 92% share is the peak. AMD, hyperscaler silicon, and inference specialists are about to pull 15-20 points away by 2028 — and that is where the actual investable trade lives.

Track AI capex and chip vendor revenue on the AI Spending Dashboard and AI Chip Wars at Value Add VC. Originally published in the Trace Cohen newsletter.

Get VC data most people never see

— 100% free

Weekly benchmarks, valuations, and fund data. Join 5,000+ investors. No spam.

AI Hardware Wars 2026: Nvidia H200 vs AMD MI300 vs Google TPU v5

AI Hardware Wars: Nvidia Nvidia's latest GPUs vs AMD AMD Instinct vs Google TPU v5 in 2026

The AI Chip Market Share Picture in 2026

Total Cost of Ownership: Nvidia's latest GPUs vs AMD Instinct vs TPU v5p

Training vs Inference: Why the Hardware Question Splits in 2026

AI Hardware Pricing in 2026: What Each GPU Actually Costs Today

Who's Winning the AI Hardware War: The Honest Call

Frequently Asked Questions

Related Tools & Dashboards

Keep Reading

AI Hardware Wars 2026: Nvidia H200 vs AMD MI300 vs Google TPU v5

AI Hardware Wars: Nvidia Nvidia's latest GPUs vs AMD AMD Instinct vs Google TPU v5 in 2026

The AI Chip Market Share Picture in 2026

Total Cost of Ownership: Nvidia's latest GPUs vs AMD Instinct vs TPU v5p

Training vs Inference: Why the Hardware Question Splits in 2026

AI Hardware Pricing in 2026: What Each GPU Actually Costs Today

Who's Winning the AI Hardware War: The Honest Call

Frequently Asked Questions

Related Tools & Dashboards

Keep Reading