Groq Raises $650M to Scale Its AI Inference Cloud and LPU Chips

AI chip and inference-cloud company Groq raised $650 million in a round led by Infinitum and Disruptive, fueling its bid to be the fast, low-cost alternative for running AI models. Founded by a former Google TPU architect, Groq builds its own Language Processing Unit silicon and sells inference as a service -- a vertically integrated play against both Nvidia and cloud incumbents.

Raised: $650M
Lead: Infinitum, Disruptive
Product: LPU chips + inference cloud
Founder pedigree: Ex-Google TPU architect
Targets: Nvidia, cloud incumbents

Trace Cohen

Early-stage VC & angel · Founder, New York Venture Partners

June 26, 2026

Groq has raised $650 million in a financing led by Infinitum and Disruptive, the company's latest infusion as it scales an AI inference business built on its own custom silicon. Groq designs Language Processing Units (LPUs), chips purpose-built for running -- not training -- AI models with very low latency, and pairs them with a cloud service that sells that speed directly to developers and enterprises.

The company's pitch is vertical integration: by owning both the chip and the cloud that runs it, Groq argues it can deliver tokens faster and cheaper than competitors stitching together Nvidia GPUs. Founded by Jonathan Ross, a former architect of Google's TPU, Groq has leaned into inference specifically -- the high-volume, latency-sensitive workload that is becoming the dominant share of AI compute as applications move from demos into production.

The raise lands amid feverish investor appetite for the inference layer. It came the same week Baseten closed a $1.5 billion round at a $13 billion valuation, putting two large inference bets within days of each other. Together they signal that investors see the serving of AI models, regardless of which lab's model wins, as a structurally growing market. Groq competes with Nvidia's GPU stack, the hyperscalers' inference offerings, and software-layer players like Together AI and Fireworks.

The broader context is the custom-silicon wave reshaping AI hardware. As OpenAI, Google, Amazon and others build their own accelerators to escape Nvidia's margins, Groq is the merchant version of that thesis -- offering specialized inference silicon to anyone, not just the hyperscalers large enough to design their own. Its edge has to be raw performance-per-dollar on real workloads.

The bear case is formidable competition and capital intensity: building chips and a cloud simultaneously is enormously expensive, and Groq must out-execute both Nvidia's ecosystem and well-funded software rivals. What to watch: Groq's deployed capacity and customer wins, independent benchmarks of its LPU performance versus GPUs, and whether the inference market consolidates around a few platforms or stays fragmented enough for a specialist to thrive.
