OpenAI Unveils 'Jalapeño,' Its First Custom AI Chip, Built With Broadcom for Inference at Scale

OpenAI revealed Jalapeño, its first in-house silicon -- a chip designed with Broadcom and purpose-built for running AI models (inference) rather than training them. OpenAI says early results show 'significantly better performance-per-watt' than current state-of-the-art alternatives, marking its most concrete step yet to reduce a near-total dependence on Nvidia GPUs.

Chip Name: Jalapeño
Built With: Broadcom
Purpose: Inference only
Edge: Better perf-per-watt
Partnership Announced: Oct 2025
Trace Cohen
Early-stage VC & angel · Founder, New York Venture Partners
June 24, 2026
3 min read
KEY TAKEAWAYS FOR VCs & FOUNDERS
1. OpenAI building its own inference chip is the clearest signal yet that the largest AI buyers want off the Nvidia tax
2. Inference -- not training -- is where the recurring, at-scale compute bill lives, so owning that silicon compounds
3. Performance-per-watt is the metric that matters as power, not chips, becomes the binding constraint on AI
4. A credible custom-silicon path from OpenAI pressures Nvidia's margins and validates the Broadcom-ASIC playbook

The VC Read · Trace's Take

This is the most important number in AI that nobody quotes: inference cost-per-token, and OpenAI just declared war on it. Training is a vanity event; inference is the bill that scales with every user forever, and owning that silicon is how you turn a gross-margin problem into a gross-margin moat. The smartest tell is that OpenAI used its own models to help design the chip -- the flywheel where AI improves the hardware that runs AI is the real story under the story. For founders building on these APIs, cheaper inference flows straight to you; for Nvidia bulls, the question is no longer if the biggest buyers defect on inference, but how fast. Watch for a production date and a real cost-per-token number -- everything else is a press release.

OpenAI has unveiled Jalapeño, its first custom AI chip, designed in partnership with Broadcom and built specifically to run AI models at scale -- the inference workloads that power ChatGPT and its API -- rather than to train them. According to OpenAI, early silicon is already showing 'significantly better performance-per-watt than current state-of-the-art alternatives,' the company's most tangible move yet to lessen its reliance on Nvidia's GPUs.

The chip is the product of a partnership Broadcom and OpenAI first announced in October 2025, when the two committed to co-developing custom accelerators. OpenAI president Greg Brockman framed the effort as workload-specific: 'We have a deep understanding of the workload. We've really been looking for specific workloads that are underserved, [and asking] how can we build something that will be able to accelerate what's possible?' Notably, OpenAI says its own models helped design the chip -- AI building the hardware that will run AI.

The strategic logic is about economics, not bragging rights. Training a frontier model is a periodic, capital-intensive event; inference is the relentless, every-query cost that scales with usage and never stops. By owning the inference silicon, OpenAI attacks the single largest line in its long-run compute budget -- and the one most exposed to Nvidia's pricing power. Intensive pre-training is expected to keep running on Nvidia for now.
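The one-off versus recurring distinction above can be made concrete with a small sketch. This is an illustrative model only: the function name, the idea of a constant monthly growth rate, and every number passed to it are assumptions, not figures from OpenAI or from this article.

```python
def months_until_inference_exceeds_training(
    training_cost: float,
    monthly_inference_cost: float,
    monthly_growth: float = 0.0,
) -> int:
    """Return the first month in which cumulative inference spend
    exceeds a one-off training cost.

    All inputs are hypothetical; units just need to match
    (for example, millions of dollars).
    """
    if monthly_inference_cost <= 0:
        raise ValueError("monthly_inference_cost must be positive")
    if monthly_growth < 0:
        raise ValueError("monthly_growth must be non-negative")

    cumulative = 0.0
    month = 0
    cost = monthly_inference_cost
    while cumulative <= training_cost:
        month += 1
        cumulative += cost           # inference bill recurs every month
        cost *= 1.0 + monthly_growth  # and grows with usage
    return month


if __name__ == "__main__":
    # Made-up inputs: training costs 100 units once, inference starts at 30 units/month.
    print(months_until_inference_exceeds_training(100.0, 30.0))
    print(months_until_inference_exceeds_training(100.0, 10.0, monthly_growth=0.10))
```

Under these made-up inputs the recurring bill overtakes the one-time training cost within months, and usage growth only shortens that window, which is the intuition behind targeting inference silicon first.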

OpenAI is following a path Google and Amazon blazed years ago. Google's TPUs and Amazon's Trainium/Inferentia chips, likewise built with ASIC partners such as Broadcom and Marvell, proved that hyperscalers can design narrow, efficient accelerators that beat general-purpose GPUs on cost-per-token for their own workloads. Meta's MTIA and Microsoft's Maia program are the same bet. Jalapeño puts the most-watched AI company in the world firmly into that club.

The competitive ripples are immediate. Broadcom has emerged as the quiet kingmaker of custom AI silicon, and a marquee OpenAI chip cements the thesis that has driven its stock to historic highs. For Nvidia, every hyperscaler ASIC chips away at the most lucrative corner of its franchise -- not in training, where its lead is durable, but in high-volume inference, where 'good enough and far cheaper' is a winning pitch. The same week, Qualcomm agreed to buy Modular for ~$3.9B to attack Nvidia's CUDA software moat, underscoring a coordinated, industry-wide assault on Nvidia's dominance.

For founders and operators, the message is that compute is becoming a stack to be optimized, not a single vendor to be obeyed. If OpenAI can shave double-digit percentages off inference cost-per-token, it widens its margin to subsidize cheaper API pricing and squeeze rivals who still pay full Nvidia freight. That dynamic flows straight to anyone building on top of these APIs.
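A rough way to see how a cost-per-token cut translates into margin or pricing room is sketched below. The helper names and all prices, costs, and percentages are hypothetical, chosen for easy arithmetic; OpenAI has not disclosed cost-per-token figures.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - cost) / price


def price_at_margin(cost: float, margin: float) -> float:
    """Price that yields the given gross margin at the given cost."""
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in [0, 1)")
    return cost / (1.0 - margin)


def effect_of_cost_cut(price: float, cost: float, cut: float) -> dict:
    """Compare two ways to spend a fractional cost reduction `cut`:
    hold price and widen margin, or hold margin and lower price."""
    if not 0.0 <= cut <= 1.0:
        raise ValueError("cut must be in [0, 1]")
    margin_before = gross_margin(price, cost)
    new_cost = cost * (1.0 - cut)
    return {
        "margin_before": margin_before,
        "margin_if_price_held": gross_margin(price, new_cost),
        "price_if_margin_held": price_at_margin(new_cost, margin_before),
    }


if __name__ == "__main__":
    # Hypothetical: $10 price and $6 cost per million tokens, 20% cost reduction.
    for name, value in effect_of_cost_cut(10.0, 6.0, 0.20).items():
        print(f"{name}: {value:.2f}")
```

With these illustrative inputs, a 20% cost cut either lifts gross margin from 40% to about 52% at the same price, or allows a price of about $8 at the same margin. A rival paying the original cost would see its margin shrink if it matched that lower price.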

The bear case is real: custom silicon is brutally hard, schedules slip, and 'in testing' is not 'in production.' Google spent the better part of a decade maturing TPUs; OpenAI is attempting this while also building data centers, raising tens of billions, and racing to an IPO. Yield, packaging, and the software stack to actually use the chip are where ASIC dreams go to die.

What to watch: a production timeline and volume, whether OpenAI discloses real cost-per-token improvements, how much Nvidia capacity it can actually displace, and whether Jalapeño stays internal or -- like TPUs -- eventually gets rented to others. If it ships at scale, it reprices the entire inference market.

Originally reported by Ars Technica. Analysis and editorial commentary by Value Add Pulse.
