OpenAI Unveils 'Jalapeño,' Its First Custom AI Chip, Built With Broadcom for Inference at Scale

OpenAI revealed Jalapeño, its first in-house silicon -- a chip designed with Broadcom and purpose-built for running AI models (inference) rather than training them. OpenAI says early results show 'significantly better performance-per-watt' than current state-of-the-art alternatives, marking its most concrete step yet to reduce a near-total dependence on Nvidia GPUs.

Chip Name: Jalapeño
Built With: Broadcom
Purpose: Inference only
Edge: Better perf-per-watt
Partnership Announced: Oct 2025

Trace Cohen

Early-stage VC & angel · Founder, New York Venture Partners

June 24, 2026

3 min read

OpenAI has unveiled Jalapeño, its first custom AI chip, designed in partnership with Broadcom and built specifically to run AI models at scale -- the inference workloads that power ChatGPT and its API -- rather than to train them. According to OpenAI, early silicon is already showing 'significantly better performance-per-watt than current state-of-the-art alternatives,' the company's most tangible move yet to lessen its reliance on Nvidia's GPUs.

The chip is the product of a partnership Broadcom and OpenAI first announced in October 2025, when the two committed to co-developing custom accelerators. OpenAI president Greg Brockman framed the effort as workload-specific: 'We have a deep understanding of the workload. We've really been looking for specific workloads that are underserved, [and asking] how can we build something that will be able to accelerate what's possible?' Notably, OpenAI says its own models helped design the chip -- AI building the hardware that will run AI.

The strategic logic is about economics, not bragging rights. Training a frontier model is a periodic, capital-intensive event; inference is the relentless, every-query cost that scales with usage and never stops. By owning the inference silicon, OpenAI attacks the single largest line in its long-run compute budget -- and the one most exposed to Nvidia's pricing power. Intensive pre-training is expected to keep running on Nvidia for now.

OpenAI is following a path Google and Amazon blazed years ago. Google's TPUs and Amazon's Trainium/Inferentia chips, built with ASIC partners such as Broadcom and Marvell, proved that hyperscalers can design narrow, efficient accelerators that beat general-purpose GPUs on cost-per-token for their own workloads. Meta's MTIA and Microsoft's Maia programs are the same bet. Jalapeño puts the most-watched AI company in the world firmly into that club.

The competitive ripples are immediate. Broadcom has emerged as the quiet kingmaker of custom AI silicon, and a marquee OpenAI chip cements the thesis that has driven its stock to historic highs. For Nvidia, every hyperscaler ASIC chips away at the most lucrative corner of its franchise -- not in training, where its lead is durable, but in high-volume inference, where 'good enough and far cheaper' is a winning pitch. The same week, Qualcomm agreed to buy Modular for ~$3.9B to attack Nvidia's CUDA software moat, underscoring a broad, industry-wide assault on Nvidia's dominance.

For founders and operators, the message is that compute is becoming a stack to be optimized, not a single vendor to be obeyed. If OpenAI can shave double-digit percentages off inference cost-per-token, it widens its margin to subsidize cheaper API pricing and squeeze rivals who still pay full Nvidia freight. That dynamic flows straight to anyone building on top of these APIs.
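To make the margin arithmetic concrete, here is a minimal sketch of how a cut in cost-per-token translates into either a wider margin or a lower price. Every number in it (the price, the serving cost, the 20% cut) is a hypothetical placeholder chosen for round arithmetic, not a figure from OpenAI, and the function names are illustrative only.

```python
def gross_margin(price_per_mtok: float, cost_per_mtok: float) -> float:
    """Gross margin as a fraction of price, per million tokens."""
    return (price_per_mtok - cost_per_mtok) / price_per_mtok


def price_holding_margin(cost_per_mtok: float, margin: float) -> float:
    """Price that preserves a given gross margin at a new serving cost."""
    return cost_per_mtok / (1.0 - margin)


# Hypothetical placeholders, NOT real OpenAI numbers.
PRICE = 10.00  # dollars charged per million tokens
COST = 6.00    # dollars of serving cost per million tokens
CUT = 0.20     # assumed 20% reduction in cost-per-token

new_cost = COST * (1.0 - CUT)
margin_before = gross_margin(PRICE, COST)
margin_after = gross_margin(PRICE, new_cost)
new_price = price_holding_margin(new_cost, margin_before)

print(f"Margin at unchanged price: {margin_before:.1%} -> {margin_after:.1%}")
print(f"Price at unchanged margin: ${PRICE:.2f} -> ${new_price:.2f}")
```

With these placeholder inputs, a 20% cost cut either lifts gross margin from 40% to 52% at the same price, or lets the price fall from $10 to $8 per million tokens at the same margin. A rival paying the old cost cannot match that lower price without giving up margin.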

The bear case is real: custom silicon is brutally hard, schedules slip, and 'in testing' is not 'in production.' Google spent the better part of a decade maturing TPUs; OpenAI is attempting this while also building data centers, raising tens of billions, and racing to an IPO. Yield, packaging, and the software stack to actually use the chip are where ASIC dreams go to die.

What to watch: a production timeline and volume, whether OpenAI discloses real cost-per-token improvements, how much Nvidia capacity it can actually displace, and whether Jalapeño stays internal or -- like TPUs -- eventually gets rented to others. If it ships at scale, it reprices the entire inference market.
