AI & Technology · June 6, 2026 · 9 min read · Last updated: June 6, 2026

Cooling Technology for AI Data Centers: The $50B Market Nobody Is Talking About

Everyone is debating which AI models win. Nobody is talking about the thermal problem that determines whether those models can run at scale. Cooling is the $50B picks-and-shovels play hiding in plain sight.

Trace Cohen
3x founder, 65+ investments, building Value Add VC

Quick Answer

AI data center cooling technology is projected to be a $50B+ market by 2030, driven by NVIDIA H100/H200/B200 GPUs generating 700W–1000W+ per chip, roughly 3–4x the ~250W of a typical server CPU. Air cooling fails above 30–40kW per rack. Liquid cooling (direct-to-chip, immersion, rear-door heat exchangers) is the only viable path for hyperscale AI. Key players: Vertiv, nVent, Asetek, CoolIT Systems, and GRC.

An NVIDIA H100 GPU generates 700 watts of heat. A B200 generates 1,000 watts. Pack 72 of them into a single GB200 NVL72 rack and you have a 120kW thermal problem that no air handler in any existing data center was designed to solve.

This isn't a future problem. It's today's problem. Microsoft, Google, Meta, and Amazon are collectively spending over $300B on AI infrastructure in 2025 — and every dollar they spend on GPUs creates a proportional spending requirement on cooling. The market for AI data center cooling technology is projected to reach $50B+ annually by 2030, growing at a 25–30% CAGR from roughly $15B in 2024.
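The growth figures above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch that assumes a six-year horizon (2024 to 2030) and uses the article's round numbers; the function names are illustrative, not from any library.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate needed to go from start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding start_value at `rate` for `years` years."""
    return start_value * (1 + rate) ** years


START_BN = 15.0   # roughly $15B in 2024 (from the article)
TARGET_BN = 50.0  # $50B by 2030 (from the article)
YEARS = 6         # assumption: 2024 -> 2030

print(f"CAGR implied by $15B -> $50B: {implied_cagr(START_BN, TARGET_BN, YEARS):.1%}")
for rate in (0.25, 0.30):
    print(f"$15B compounding at {rate:.0%}: ${project(START_BN, rate, YEARS):.0f}B by 2030")
```

Under these assumptions, hitting exactly $50B needs about 22% a year, while 25–30% compounding lands somewhat above $50B, which is consistent with the "$50B+" framing.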

Yet almost no one in VC is talking about it. Everyone is obsessing over model benchmarks and foundation model funding rounds. The thermal infrastructure enabling all of it is the most durable picks-and-shovels opportunity in the AI buildout — and it's hiding in plain sight.

Why Air Cooling Has Already Failed for AI

Traditional data center air cooling was designed for racks drawing 5–15kW. A typical enterprise server rack in 2015 consumed 8–12kW. Computer room air conditioners (CRACs) and raised-floor plenum systems work fine at that density.

AI GPU clusters broke that model completely. Here's what the power density numbers look like across generations:

| Hardware | TDP per chip | Rack power (typical) | Cooling required |
| --- | --- | --- | --- |
| Intel Xeon (2020) | 250W | ~8kW | Air (CRAC) |
| NVIDIA A100 | 400W | ~25kW | Air (high-density) |
| NVIDIA H100 SXM5 | 700W | ~40–60kW | Liquid (DLC required) |
| NVIDIA B200 | 1,000W | ~80–100kW | Liquid (immersion viable) |
| NVIDIA GB200 NVL72 | system-level | ~120kW | Liquid (mandatory) |

Source: NVIDIA product specs; data center engineering literature. GB200 NVL72 power assumes full cluster configuration.
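To see why air runs out of headroom at these densities, it helps to compare how much air versus water must move through a rack to carry the same heat. The sketch below uses the standard relation flow = heat / (density × specific heat × temperature rise). The fluid properties are approximate room-temperature values, and the 15 K air rise and 10 K water rise are assumptions chosen for illustration, not figures from the article.

```python
# Approximate fluid properties near room temperature
AIR_DENSITY = 1.2       # kg/m^3
AIR_CP = 1005.0         # J/(kg*K)
WATER_DENSITY = 997.0   # kg/m^3
WATER_CP = 4186.0       # J/(kg*K)


def volumetric_flow_m3s(heat_w: float, density: float, cp: float, delta_t_k: float) -> float:
    """Volume of fluid per second needed to absorb heat_w with a delta_t_k temperature rise."""
    return heat_w / (density * cp * delta_t_k)


rack_heat_w = 120_000  # a ~120kW rack, as in the table above

# Assumed temperature rises: 15 K for air, 10 K for water (illustrative only)
air_flow = volumetric_flow_m3s(rack_heat_w, AIR_DENSITY, AIR_CP, delta_t_k=15.0)
water_flow = volumetric_flow_m3s(rack_heat_w, WATER_DENSITY, WATER_CP, delta_t_k=10.0)

print(f"Air:   {air_flow:.1f} m^3/s")
print(f"Water: {water_flow * 1000:.1f} L/s")
print(f"Air needs about {air_flow / water_flow:,.0f}x the volume of water")
```

With these inputs a 120kW rack needs several cubic meters of air per second but only a few liters of water per second, which is the physical reason the table shifts from air to liquid as rack power climbs.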

The Three Technologies Competing for the AI Cooling Market

Not all liquid cooling is created equal. Three distinct approaches are competing — each with different economics, compatibility profiles, and market penetration timelines.

Direct Liquid Cooling (DLC)
Fastest growing — ~60% of new AI installations

Cold plates attach directly to CPUs, GPUs, and memory. Coolant (typically water or a water-glycol mix) circulates through the plate and carries heat to a facility-level chiller. NVIDIA designs GB200 systems with DLC as the default. Dell, HPE, and Supermicro all offer DLC server lines. DLC is retrofit-compatible with most data center buildings, which is its biggest go-to-market advantage.

Best for: New hyperscale AI clusters, HPC retrofits, colocation providers adding high-density capacity

Rear-Door Heat Exchangers (RDHx)
Mature technology — bridge for existing air-cooled facilities

A chilled water coil replaces the standard rack rear door. Hot air exiting the rack passes through the coil and is cooled before mixing with room air. Handles 15–40kW per rack without facility-level plumbing changes. Popular for organizations upgrading existing data centers that can't justify full liquid cooling infrastructure overhaul.

Best for: Enterprise data centers upgrading existing infrastructure, colocation retrofits

Immersion Cooling (Single-Phase & Two-Phase)
High growth from small base — ~10% of AI installations today

Servers are submerged in dielectric fluid (single-phase: mineral oil or synthetic; two-phase: Fluorinert or Novec). Heat transfers directly from chips to fluid, achieving a PUE of 1.02–1.05, better than any other approach. However, immersion requires purpose-built facilities, is incompatible with most standard server hardware out of the box, and carries significant fluid cost and recycling requirements.

Best for: Purpose-built AI training facilities, hyperscale clusters with 10+ year horizons
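The density ranges quoted for the three approaches can be turned into a rough screening rule. This is only a sketch: the thresholds are taken from the article's own figures (air up to the low end of 30–40kW, RDHx at 15–40kW, liquid above that), the function name is hypothetical, and a real design decision would also weigh cost, water, and facility constraints.

```python
from typing import List

AIR_LIMIT_KW = 30.0            # conservative end of the article's 30-40kW air ceiling
RDHX_RANGE_KW = (15.0, 40.0)   # article's stated range for rear-door heat exchangers


def candidate_cooling(rack_kw: float, purpose_built: bool = False) -> List[str]:
    """Return the cooling approaches worth evaluating for a given rack density."""
    options: List[str] = []
    if rack_kw <= AIR_LIMIT_KW:
        options.append("air")
    if RDHX_RANGE_KW[0] <= rack_kw <= RDHX_RANGE_KW[1]:
        options.append("rear-door heat exchanger")
    if rack_kw > AIR_LIMIT_KW:
        options.append("direct liquid cooling")
        # Immersion is only listed for purpose-built facilities, per the article
        if purpose_built:
            options.append("immersion")
    return options


for kw in (8, 25, 50, 120):
    print(f"{kw:>4} kW -> {candidate_cooling(kw, purpose_built=True)}")
```

The overlap between 15kW and 40kW is deliberate: in that band more than one approach is plausible, which is where retrofit economics tend to decide the outcome.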

Who Is Winning the AI Data Center Cooling Market

The market is splitting into public infrastructure plays and private startups. On the public side, Vertiv (VRT) is the clearest beneficiary: the company reported $7.35B in 2024 revenue with AI-driven thermal management as a core growth driver. Its stock is up over 300% since early 2023, and management has guided to 15–20% organic growth through 2027 as liquid cooling capacity scales.

nVent Electric (NVT) competes in enclosures and thermal management and has a smaller but growing AI exposure. Schneider Electric and Eaton have data center cooling divisions, though AI is a smaller share of their total revenue.

Among smaller specialists, most of them private, several companies are worth watching:

CoolIT Systems: OEM partnerships with Dell, HPE, and Lenovo; DLC leader for enterprise GPU servers

Asetek: HPC and cloud cooling; publicly traded in Norway; growing AI contract backlog

GRC (Green Revolution Cooling): Single-phase immersion for hyperscale; raised $65M Series C from 40North

LiquidStack: Two-phase immersion; partnerships with Intel and AMD; Shell-backed

Submer: European immersion cooling startup; growing in EMEA hyperscale

ZutaCore: Waterless direct liquid cooling; unique for water-constrained markets

The Water Problem Nobody Wants to Talk About

Liquid cooling shifts the heat problem — it doesn't eliminate it. Water-cooled facilities ultimately reject heat into cooling towers, which evaporate water into the atmosphere. A 100MW data center using evaporative cooling can consume 3–5 million gallons of water per day. At scale, Microsoft's global data center fleet consumes more water than some medium-sized cities.

This is creating a secondary market in water-efficient cooling: dry cooling (air-cooled chillers that use no water), adiabatic cooling (evaporative with smart controls that minimize water use), and closed-loop geothermal systems. Arizona and Nevada — major data center markets — are already water-stressed, pushing hyperscalers toward facilities in cooler, water-abundant regions or toward waterless cooling systems.

ZutaCore's waterless DLC approach addresses this directly. Their system uses a refrigerant that evaporates inside the cold plate and condenses externally, removing heat without any water consumption. It's a niche but growing wedge in markets where water permits are becoming as scarce as power permits.

The Investment Thesis: Where the Real Money Is

The GPU spending is concentrated at the hyperscaler level, and NVIDIA wins most of that. But cooling is fragmented across component suppliers, OEM integrators, facility contractors, and facility operators. That fragmentation creates multiple entry points for capital.

The most durable positions are in companies that have OEM integration agreements with server manufacturers. Once CoolIT or Asetek systems are designed into Dell or HPE server lines, every AI server sale pulls through a cooling component. That's recurring revenue with switching costs — the best kind of infrastructure business model.

Track the AI buildout spending and the players benefiting from it on the AI Spending Dashboard and the AI Buildout Tracker at Value Add VC.

Highest conviction: OEM-integrated DLC suppliers with NVIDIA/Dell/HPE design wins (sticky, recurring)

High growth: Immersion cooling operators targeting greenfield hyperscale AI campuses (lumpy but large)

Hedge play: Public infra (Vertiv, nVent), which offers liquid exposure to the GPU arms race without single-name model risk

Every dollar spent on AI compute creates a proportional spend on thermal infrastructure.

The $300B AI capex supercycle has a $50B cooling subsector attached to it — and most investors haven't noticed yet.

Originally published in the Trace Cohen newsletter.

Frequently Asked Questions

What is AI data center cooling technology and why does it matter for AI?

AI data center cooling technology refers to systems that manage the heat generated by GPU clusters training and running large language models. NVIDIA H100 GPUs produce 700W each, so a single 8-GPU server generates 5.6kW+ from the GPUs alone, and a full DGX H100 system draws up to 10.2kW. Stack several of those systems in one rack and the load passes the 30–40kW at which air cooling fails. At the density AI requires, hardware without liquid cooling simply throttles or fails.

What are the main types of liquid cooling for AI data centers?

The three dominant approaches are: direct liquid cooling (DLC), which runs coolant through cold plates attached directly to chips; rear-door heat exchangers (RDHx), which cool exhaust air as it leaves the rack and before it mixes with room air; and immersion cooling, which submerges entire servers in dielectric fluid. DLC is the fastest-growing because it is compatible with existing infrastructure and because NVIDIA designs its GB200 systems with DLC as the default.

Who are the major companies in AI data center cooling?

Vertiv (VRT) is the publicly traded leader with $7B+ in 2024 revenue and significant liquid cooling exposure. nVent (NVT) focuses on thermal management enclosures. Smaller specialists include Asetek (liquid cooling for HPC, publicly traded in Norway), CoolIT Systems (OEM partnerships with Dell and HPE), and GRC (immersion cooling). Hyperscalers like Google and Meta are also building proprietary liquid cooling systems in-house.

How much does liquid cooling cost versus air cooling for data centers?

Liquid cooling systems cost 2–4x more upfront than equivalent air cooling capacity. However, liquid cooling delivers 30–50% lower energy consumption and enables rack densities 3–5x higher — meaning fewer total racks, less physical space, and lower long-term power bills. For AI workloads where GPUs run at 70–100% utilization continuously, the TCO math favors liquid cooling within 18–24 months.
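The payback claim above reduces to simple arithmetic: payback in months is the upfront premium divided by annual savings, times twelve. The sketch below inverts that to show what an 18–24 month payback implies. It is a simple undiscounted payback, the function names are illustrative, and the $10M premium in the usage lines is a placeholder rather than a figure from the article.

```python
def payback_months(capex_premium: float, annual_savings: float) -> float:
    """Simple (undiscounted) payback period in months."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return 12 * capex_premium / annual_savings


def required_savings_fraction(target_months: float) -> float:
    """Annual savings, as a fraction of the capex premium, needed to hit target_months."""
    if target_months <= 0:
        raise ValueError("target_months must be positive")
    return 12 / target_months


for months in (18, 24):
    frac = required_savings_fraction(months)
    print(f"{months}-month payback needs annual savings of {frac:.0%} of the upfront premium")

# Placeholder example: a hypothetical $10M premium recovered by $5M/year of savings
print(f"Payback: {payback_months(10.0, 5.0):.0f} months")
```

In other words, an 18–24 month payback means energy and density savings worth roughly half to two-thirds of the extra upfront cost every year, which is plausible mainly for workloads that run near full utilization.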

What is PUE and how does cooling affect AI data center efficiency?

Power Usage Effectiveness (PUE) is total facility power divided by IT equipment power. A PUE of 1.0 is theoretical perfection — every watt goes to compute. The industry average is 1.58; hyperscalers average 1.10–1.15 with advanced cooling. For AI-specific facilities, Google reports a 1.08 average PUE across their fleet. Every 0.1 improvement in PUE at 100MW of compute saves roughly $5–10M per year in energy costs.
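The PUE definition and the savings estimate above can be reproduced directly. This is a minimal sketch assuming the facility runs at constant load all year and electricity costs $0.06–$0.11 per kWh; both the price range and the function names are assumptions for illustration.

```python
HOURS_PER_YEAR = 8760


def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT equipment power."""
    return total_facility_kw / it_equipment_kw


def annual_savings_usd(it_load_mw: float, pue_before: float, pue_after: float,
                       usd_per_kwh: float) -> float:
    """Yearly energy cost saved by lowering PUE at a constant IT load."""
    overhead_saved_kw = it_load_mw * 1000 * (pue_before - pue_after)
    return overhead_saved_kw * HOURS_PER_YEAR * usd_per_kwh


# A facility drawing 158 MW in total for 100 MW of IT load sits at the industry-average PUE
print(f"PUE: {pue(158_000, 100_000):.2f}")

# A 0.1 PUE improvement at 100 MW of compute, at two assumed electricity prices
for price in (0.06, 0.11):
    saved = annual_savings_usd(100, 1.58, 1.48, price)
    print(f"At ${price:.2f}/kWh: ${saved / 1e6:.1f}M saved per year")
```

At those assumed prices the result falls inside the $5–10M per year range quoted above; cheaper or more expensive power scales the savings proportionally.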
