Baseten is valued at $13B after a $1.5B Series F in June 2026, and its annualized revenue run-rate hit roughly $600M by March 2026 — up around 1,900% year-over-year, with the platform now processing more than 1 billion inference calls a day.
That's the short answer. The longer one is a story about where the value in AI is moving. Training got the headlines and the capex. But as workloads shift to open-source and custom models, the recurring spend lands on inference — actually running those models in production, reliably, at scale. Baseten is one of the cleanest pure-play bets on that shift, and the numbers show it.
Baseten ARR and Valuation 2026: The Headline Numbers
The numbers are extreme even by 2026 AI standards. Baseten's annualized revenue run-rate tripled from $200M in December 2025 to roughly $600M by March 2026, which puts it about 1,900% above its level a year earlier. The valuation moved quickly too: $5B in January, $13B by June. Compare the trajectory against peers on the AI Valuations dashboard.
| Date | Figure | Metric |
|---|---|---|
| Dec 2025 | $200M | Run-rate |
| Mar 2026 | ~$600M | Run-rate (+1,900% YoY) |
| Jan 2026 | $5B | Series E valuation |
| Jun 2026 | $13B | Series F valuation |
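The year-over-year figure and the December-to-March jump measure different windows, which is easy to miss. A quick arithmetic sketch, using only the approximate figures quoted above, backs out the year-earlier run-rate that a 1,900% increase would imply. The March 2025 number is derived here, not reported.

```python
# Figures quoted in the article, in millions of USD (approximate).
run_rate_dec_2025 = 200.0
run_rate_mar_2026 = 600.0
yoy_growth_pct = 1900.0  # "+1,900% YoY"

# A +1,900% increase means the new value is 20x the old one.
yoy_multiple = 1 + yoy_growth_pct / 100

# Implied run-rate one year earlier (derived, not a reported figure).
implied_mar_2025 = run_rate_mar_2026 / yoy_multiple

# Growth over the three months from December to March.
quarter_multiple = run_rate_mar_2026 / run_rate_dec_2025

print(f"YoY multiple: {yoy_multiple:.0f}x")
print(f"Implied Mar 2025 run-rate: ~${implied_mar_2025:.0f}M")
print(f"Dec 2025 -> Mar 2026: {quarter_multiple:.1f}x")
```

Read this way, the headline growth rate says as much about how small the base was a year ago as about the size of the business today.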
The operational scale behind the revenue: Baseten processes more than 1 billion inference calls a day across 87 clusters and 18 clouds. That multi-cloud, multi-cluster footprint is the moat — it's what lets the platform route workloads to available capacity and stay up when any single provider gets tight. Full detail on the Baseten company profile.
Baseten Funding History: $5B to $13B in Five Months
Baseten's recent funding cadence is a case study in how fast 2026's AI infrastructure market reprices a winner. In January 2026, the $300M Series E set a $5B valuation, led by IVP, CapitalG, and Nvidia — the Nvidia participation a notable signal given that Baseten is fundamentally a buyer and orchestrator of GPU compute.
Just five months later, the $1.5B Series F in June 2026 lifted the valuation to $13B, led by Altimeter, Conviction, and Spark. The round was structured in two tranches priced at $11B and $13B — a sign of how aggressively investors were chasing the revenue curve as it kept compounding mid-round.
A 2.6x valuation step in five months is only defensible if revenue moves with it, and here it did: the run-rate roughly tripled in the three months from December 2025 to March 2026. Set against broader AI infrastructure spend, Baseten is a direct beneficiary of the dollars flowing into production AI workloads.
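One way to check whether the valuation outran the revenue is to compare the implied revenue multiple at each round. This is a rough sketch: it pairs each valuation with the nearest run-rate quoted in this article, and those dates do not line up exactly with the round dates.

```python
# Figures quoted in the article, in millions of USD (approximate).
series_e_valuation = 5_000    # Jan 2026
series_f_valuation = 13_000   # Jun 2026
run_rate_dec_2025 = 200       # nearest quoted run-rate to the Series E
run_rate_mar_2026 = 600       # nearest quoted run-rate to the Series F

valuation_step = series_f_valuation / series_e_valuation
revenue_step = run_rate_mar_2026 / run_rate_dec_2025

# Valuation divided by run-rate. The pairing of dates is an assumption.
multiple_at_e = series_e_valuation / run_rate_dec_2025
multiple_at_f = series_f_valuation / run_rate_mar_2026

print(f"Valuation step: {valuation_step:.1f}x, revenue step: {revenue_step:.1f}x")
print(f"Run-rate multiple at Series E: {multiple_at_e:.1f}x")
print(f"Run-rate multiple at Series F: {multiple_at_f:.1f}x")
```

On these pairings the multiple compresses from about 25x to under 22x, which means revenue grew slightly faster than the price investors paid for it.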
How Baseten Makes Money: Usage-Based Inference
Baseten runs a usage-based AI inference platform. Customers pay for what they consume — API calls against open-source models, or GPU minutes and hours for custom and self-hosted models. Baseten's job is to make that consumption painless: it packages autoscaling, observability, billing, and developer tooling so teams can ship a model to production without building their own serving infrastructure.
- Consumption pricing: revenue scales directly with inference volume, so growth tracks customer usage.
- Autoscaling: capacity flexes with traffic across 87 clusters and 18 clouds.
- Observability and billing: the operational plumbing that makes production inference manageable.
The model's appeal is that it converts a brutal infrastructure problem — running models fast, cheaply, and reliably at scale — into a metered API. The risk, as with any consumption business, is that revenue is only as durable as the underlying usage.
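A minimal sketch of what metered billing looks like, and why revenue tracks usage so tightly. The rates, the `UsageRecord` type, and the `monthly_invoice` function are hypothetical illustrations, not Baseten's actual pricing or API.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    api_calls: int       # calls against shared open-source model endpoints
    gpu_minutes: float   # dedicated GPU time for custom or self-hosted models


# Hypothetical rates in USD, chosen only to keep the arithmetic simple.
PRICE_PER_1K_CALLS = 0.50
PRICE_PER_GPU_MINUTE = 0.25


def monthly_invoice(usage: UsageRecord) -> float:
    """Bill purely on consumption: no usage means no revenue."""
    call_charge = usage.api_calls / 1000 * PRICE_PER_1K_CALLS
    gpu_charge = usage.gpu_minutes * PRICE_PER_GPU_MINUTE
    return round(call_charge + gpu_charge, 2)


baseline = UsageRecord(api_calls=2_000_000, gpu_minutes=30_000)
pullback = UsageRecord(api_calls=1_000_000, gpu_minutes=15_000)

print(monthly_invoice(baseline))
print(monthly_invoice(pullback))
```

With no seat licenses or minimum commitments in this toy model, halving a customer's usage halves the invoice. That is the durability risk in one line of arithmetic.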
The Bull Case: Inference Is Its Own Infrastructure War
The core thesis is simple: training is a one-time, lumpy cost, but inference is the recurring bill — and the bill grows every time a model gets used. As AI workloads shift toward open-source and custom models that companies want to run themselves rather than calling a closed API, the demand for a neutral, high-performance serving layer compounds. Baseten sits exactly there.
The multi-cloud footprint — 18 clouds, 87 clusters — is the structural advantage. It lets Baseten arbitrage GPU availability and price across providers, deliver reliability no single cloud can match, and avoid lock-in to any one hyperscaler. If inference becomes its own infrastructure category the way databases or CDNs did, the company that owns the neutral serving layer captures enormous recurring spend. That's the same dynamic reshaping the entire AI compute stack.
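The arbitrage idea can be sketched as a placement rule: send each workload to the cheapest cluster that currently has room. This is an illustrative toy, not a description of Baseten's scheduler. The `Cluster` type, the cluster names, and the prices are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cluster:
    name: str
    cloud: str
    free_gpus: int
    price_per_gpu_hour: float  # USD, hypothetical


def pick_cluster(clusters: list[Cluster], gpus_needed: int) -> Optional[Cluster]:
    """Return the cheapest cluster that can fit the request, or None."""
    candidates = [c for c in clusters if c.free_gpus >= gpus_needed]
    return min(candidates, key=lambda c: c.price_per_gpu_hour, default=None)


fleet = [
    Cluster("cluster-a", "cloud-1", free_gpus=0, price_per_gpu_hour=2.00),
    Cluster("cluster-b", "cloud-2", free_gpus=16, price_per_gpu_hour=2.50),
    Cluster("cluster-c", "cloud-3", free_gpus=8, price_per_gpu_hour=3.00),
]

choice = pick_cluster(fleet, gpus_needed=8)
print(choice.name if choice else "no capacity")
```

The cheapest cluster here is full, so the request lands on the next-cheapest one on a different cloud. A single-cloud operator facing the same shortage would have nowhere else to go.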
The Bear Case: Hyperscalers, Margins, and Consumption Risk
The obvious risk is the hyperscalers. AWS (through Bedrock), Microsoft Azure, and Google Cloud all want to own inference, and they control underlying GPU capacity that Baseten rents. If they bundle competitive inference into the cloud commitments enterprises already sign, Baseten has to keep being faster, cheaper, or more reliable to justify a separate vendor. That is a perpetual race against companies with effectively unlimited balance sheets.
The second risk is margin and consumption volatility. Because Baseten resells compute it buys, gross margins depend on how well it optimizes utilization — and a usage-based business is fully exposed to customer concentration and spend pullbacks. A run-rate that grew 1,900% on the way up can also decelerate quickly if a few large customers shift workloads or a budget cycle tightens.
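The link between utilization and margin can be made concrete with a simplified reseller model. It assumes the platform pays for every GPU hour it rents but bills only the hours customers actually use. The prices are hypothetical, and real cost structures are more complicated.

```python
def gross_margin(price_per_gpu_hour: float,
                 cost_per_gpu_hour: float,
                 utilization: float) -> float:
    """Gross margin when every rented hour is paid for but only
    the utilized fraction of those hours is billed to customers."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    revenue_per_rented_hour = price_per_gpu_hour * utilization
    return (revenue_per_rented_hour - cost_per_gpu_hour) / revenue_per_rented_hour


# Hypothetical prices: bill $4.00 per GPU hour, rent at $2.00 per GPU hour.
for utilization in (1.0, 0.8, 0.5):
    margin = gross_margin(4.00, 2.00, utilization)
    print(f"utilization {utilization:.0%}: gross margin {margin:.1%}")
```

Under these assumptions the margin is 50% at full utilization, 37.5% at 80%, and zero at 50%. Idle capacity eats margin faster than a headline markup suggests.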
Baseten vs Together AI, Fireworks, Modal, and the Hyperscalers
The inference market is crowded with credible players attacking it from different angles.
| Competitor | Angle | Baseten's Position |
|---|---|---|
| Together AI | Open-model inference + training | Production reliability across 18 clouds |
| Fireworks AI | Fast, low-cost model serving | Multi-cluster autoscaling at 1B+ calls/day |
| Modal / Replicate | Developer-friendly compute | Enterprise-grade observability + billing |
| AWS / Azure / GCP | Bundled cloud inference | Neutral, cross-cloud serving layer |
| Nvidia stack | Owns the silicon + software | Orchestration above the hardware layer |
Baseten's differentiation is operational reliability and cross-cloud reach rather than the lowest sticker price on any single model.
What to Watch: Baseten's IPO Odds and the Key Metrics
With a fresh $1.5B Series F, Baseten is fully funded and unlikely to need public markets soon — a realistic IPO window is 2027 or later, and only if the run-rate sustains and margins prove out. Track the timeline against the rest of the field on the AI IPO Pipeline.
Three metrics decide the story: gross margin (can Baseten profitably resell compute it doesn't own), customer concentration (how much of the run-rate sits with a handful of large accounts), and net retention (does inference spend expand inside existing customers). If margins hold and the base diversifies, the $13B mark looks early. If a hyperscaler undercuts it or a few whales pull back, the multiple is exposed.
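Two of these metrics are simple to compute once per-customer revenue is known. The sketch below uses made-up customer data. Baseten has not disclosed these figures, and definitions of net retention vary between companies.

```python
def top_n_share(revenue_by_customer: dict[str, float], n: int) -> float:
    """Fraction of total revenue that comes from the n largest customers."""
    total = sum(revenue_by_customer.values())
    largest = sorted(revenue_by_customer.values(), reverse=True)[:n]
    return sum(largest) / total


def net_revenue_retention(prior: dict[str, float],
                          current: dict[str, float]) -> float:
    """Current revenue from customers that existed in the prior period,
    divided by what those same customers paid in the prior period."""
    start = sum(prior.values())
    end = sum(current.get(name, 0.0) for name in prior)
    return end / start


# Hypothetical run-rate by customer, in millions of USD.
prior = {"a": 50, "b": 30, "c": 20}
current = {"a": 80, "b": 30, "c": 10, "d": 40}  # "d" is a new customer

print(f"Top-1 concentration: {top_n_share(current, 1):.0%}")
print(f"Net revenue retention: {net_revenue_retention(prior, current):.0%}")
```

In this toy data, retention looks healthy at 120%, yet one account is half the revenue. Both readings can be true at once, which is why the two metrics have to be watched together.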
The single most important Baseten number isn't the $600M run-rate or the $13B valuation.
It's the gross margin on every inference call it resells.
If Baseten can run a neutral, cross-cloud serving layer at durable margins while inference demand compounds, it owns an entire infrastructure category. If the hyperscalers compress that margin to near zero, hypergrowth revenue still doesn't make a business. That's the whole bet.
Track Baseten and the rest of the AI infrastructure landscape — revenue, valuations, and IPO odds — on the AI Valuations Dashboard and the AI Spending Dashboard at Value Add VC. Originally published in the Trace Cohen newsletter.