VC
Value Add VC
AI & Technology · June 30, 2026 · 9 min read · Last updated: June 30, 2026

Baseten's annualized revenue run-rate hit roughly $600M by March 2026 — up around 1,900% year-over-year — and its valuation jumped to $13B in June, up from just $5B five months earlier. Inference is becoming its own infrastructure war, and Baseten is one of the clearest pure-play bets on it.


Trace Cohen
Co-Founder & GP at Six Point Ventures · 3x founder (BrandYourself, Launch.it, SPOT) · 65+ investments · Based in Boca Raton, FL
@Trace_Cohen·t@nyvp.com·South Florida Advisory

Quick Answer

Baseten is valued at $13B after a $1.5B Series F in June 2026 (raised in two tranches at $11B and $13B), up from just $5B at its $300M Series E in January 2026. Its annualized revenue run-rate reached roughly $600M by March 2026, up about 1,900% year-over-year and triple the $200M run-rate of December 2025. The platform now processes more than 1 billion inference calls a day across 87 clusters and 18 clouds. Baseten runs a usage-based AI inference platform: customers pay for API consumption on open-source models or for GPU minutes, and Baseten handles autoscaling, observability, billing, and tooling. The thesis is that inference is becoming its own infrastructure war as workloads shift to open-source and custom models.


That's the short answer. The longer one is a story about where the value in AI is moving. Training got the headlines and the capex. But as workloads shift to open-source and custom models, the recurring spend lands on inference — actually running those models in production, reliably, at scale. Baseten is one of the cleanest pure-play bets on that shift, and the numbers show it.

Baseten ARR and Valuation 2026: The Headline Numbers

The numbers are extreme even by 2026 AI standards. Baseten's annualized revenue run-rate went from $200M in December 2025 to roughly $600M by March 2026, tripling in about three months and rising about 1,900% year-over-year. The valuation moved just as fast: $5B in January, $13B by June. Compare the trajectory against peers on the AI Valuations dashboard.

Dec 2025: $200M run-rate
Mar 2026: ~$600M run-rate (+1,900% YoY)
Jan 2026: $5B Series E valuation
Jun 2026: $13B Series F valuation
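To make the growth figures easier to sanity-check, here is a minimal Python sketch of the arithmetic. It uses only the numbers quoted above. The year-earlier run-rate it prints is not a reported figure: it is simply what a 1,900% increase to $600M would imply. The function names are illustrative.

```python
def growth_multiple(pct_increase: float) -> float:
    """Convert a percentage increase into a multiple (a 100% increase is 2x)."""
    return 1 + pct_increase / 100


def implied_prior_value(current: float, pct_increase: float) -> float:
    """Back out the starting value implied by an ending value and a percentage increase."""
    return current / growth_multiple(pct_increase)


# Figures quoted in the article, in $M of annualized run-rate.
RUN_RATE_DEC_2025 = 200.0
RUN_RATE_MAR_2026 = 600.0
YOY_INCREASE_PCT = 1900.0

yoy_multiple = growth_multiple(YOY_INCREASE_PCT)
# Implied, not reported: the run-rate a year earlier if the YoY figure holds.
implied_mar_2025 = implied_prior_value(RUN_RATE_MAR_2026, YOY_INCREASE_PCT)
three_month_multiple = RUN_RATE_MAR_2026 / RUN_RATE_DEC_2025

print(f"YoY multiple: {yoy_multiple:.0f}x")
print(f"Implied run-rate a year earlier: ${implied_mar_2025:.0f}M")
print(f"Dec 2025 to Mar 2026 multiple: {three_month_multiple:.1f}x")
```

The point of separating the two multiples is that the 1,900% figure describes a full year, while the $200M to $600M move describes roughly one quarter.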

The operational scale behind the revenue: Baseten processes more than 1 billion inference calls a day across 87 clusters and 18 clouds. That multi-cloud, multi-cluster footprint is the moat — it's what lets the platform route workloads to available capacity and stay up when any single provider gets tight. Full detail on the Baseten company profile.

Baseten Funding History: $5B to $13B in Five Months

Baseten's recent funding cadence is a case study in how fast 2026's AI infrastructure market reprices a winner. In January 2026, the $300M Series E set a $5B valuation, led by IVP, CapitalG, and Nvidia. Nvidia's participation is a notable signal, given that Baseten is fundamentally a buyer and orchestrator of GPU compute.

Just five months later, the $1.5B Series F in June 2026 lifted the valuation to $13B, led by Altimeter, Conviction, and Spark. The round was structured in two tranches priced at $11B and $13B — a sign of how aggressively investors were chasing the revenue curve as it kept compounding mid-round.

A 2.6x valuation step in five months is only defensible if revenue moves with it, and here it did: the run-rate roughly tripled between December 2025 and March 2026. Put against the broader AI infrastructure spend, Baseten is a direct beneficiary of every dollar flowing into production AI workloads.
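The step-up math can be sketched in a few lines of Python. One assumption to flag: the source gives no run-rate for June 2026, so this sketch pairs each valuation with the closest run-rate figure available (December 2025 for the Series E, March 2026 for the Series F). The resulting multiples are rough, not reported numbers.

```python
# Figures quoted in the article, in $M.
SERIES_E_VALUATION = 5_000             # January 2026
SERIES_F_TRANCHES = (11_000, 13_000)   # June 2026, priced in two tranches
RUN_RATE_DEC_2025 = 200
RUN_RATE_MAR_2026 = 600


def step_up(old: float, new: float) -> float:
    """Ratio of a new value to an old one, e.g. a round-over-round valuation step."""
    return new / old


def run_rate_multiple(valuation: float, run_rate: float) -> float:
    """Valuation expressed as a multiple of annualized run-rate."""
    return valuation / run_rate


valuation_step = step_up(SERIES_E_VALUATION, SERIES_F_TRANCHES[-1])
run_rate_step = step_up(RUN_RATE_DEC_2025, RUN_RATE_MAR_2026)

# Assumption: nearest available run-rate stands in for the run-rate at each round.
multiple_at_series_e = run_rate_multiple(SERIES_E_VALUATION, RUN_RATE_DEC_2025)
multiple_at_series_f = run_rate_multiple(SERIES_F_TRANCHES[-1], RUN_RATE_MAR_2026)

print(f"Valuation step: {valuation_step:.1f}x, run-rate step: {run_rate_step:.1f}x")
print(f"Approx. multiple at Series E: {multiple_at_series_e:.1f}x")
print(f"Approx. multiple at Series F: {multiple_at_series_f:.1f}x")
```

Under these pairings the revenue step is slightly larger than the valuation step, which is why the multiple does not expand between rounds.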

How Baseten Makes Money: Usage-Based Inference

Baseten runs a usage-based AI inference platform. Customers pay for what they consume — API calls against open-source models, or GPU minutes and hours for custom and self-hosted models. Baseten's job is to make that consumption painless: it packages autoscaling, observability, billing, and developer tooling so teams can ship a model to production without building their own serving infrastructure.

  • Consumption pricing: revenue scales directly with inference volume, so growth tracks customer usage.
  • Autoscaling: capacity flexes with traffic across 87 clusters and 18 clouds.
  • Observability and billing: the operational plumbing that makes production inference manageable.

The model's appeal is that it converts a brutal infrastructure problem — running models fast, cheaply, and reliably at scale — into a metered API. The risk, as with any consumption business, is that revenue is only as durable as the underlying usage.
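As a rough illustration of what "metered" means here, the sketch below prices one customer's month from the two usage dimensions described above: API calls and GPU minutes. The price sheet and usage numbers are hypothetical placeholders, not Baseten's actual pricing, and the class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSheet:
    per_1k_api_calls: float  # hypothetical price, in dollars
    per_gpu_minute: float    # hypothetical price, in dollars


@dataclass(frozen=True)
class MonthlyUsage:
    api_calls: int        # calls against hosted open-source models
    gpu_minutes: float    # dedicated GPU time for custom or self-hosted models


def monthly_invoice(usage: MonthlyUsage, prices: PriceSheet) -> float:
    """Sum the two metered charges; there is no fixed fee in this sketch."""
    call_charge = usage.api_calls / 1000 * prices.per_1k_api_calls
    gpu_charge = usage.gpu_minutes * prices.per_gpu_minute
    return round(call_charge + gpu_charge, 2)


prices = PriceSheet(per_1k_api_calls=0.50, per_gpu_minute=0.10)
base_month = MonthlyUsage(api_calls=2_000_000, gpu_minutes=30_000)
busy_month = MonthlyUsage(api_calls=4_000_000, gpu_minutes=60_000)

print(monthly_invoice(base_month, prices))
print(monthly_invoice(busy_month, prices))
```

Because nothing in the invoice is fixed, doubling usage doubles the bill, and halving usage halves it. That symmetry is the durability risk described above.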

The Bull Case: Inference Is Its Own Infrastructure War

The core thesis is simple: training is a one-time, lumpy cost, but inference is the recurring bill — and the bill grows every time a model gets used. As AI workloads shift toward open-source and custom models that companies want to run themselves rather than calling a closed API, the demand for a neutral, high-performance serving layer compounds. Baseten sits exactly there.

The multi-cloud footprint — 18 clouds, 87 clusters — is the structural advantage. It lets Baseten arbitrage GPU availability and price across providers, deliver reliability no single cloud can match, and avoid lock-in to any one hyperscaler. If inference becomes its own infrastructure category the way databases or CDNs did, the company that owns the neutral serving layer captures enormous recurring spend. That's the same dynamic reshaping the entire AI compute stack.
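The availability-and-price arbitrage can be pictured as a simple placement rule: among clusters with enough free GPUs, pick the cheapest. The Python sketch below is a toy version of that idea, not a description of Baseten's actual scheduler. The cluster names, clouds, capacities, and prices are all made up.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    name: str                  # hypothetical cluster identifier
    cloud: str                 # hypothetical provider label
    free_gpus: int
    price_per_gpu_hour: float  # hypothetical price, in dollars


def pick_cluster(clusters: Sequence[Cluster], gpus_needed: int) -> Optional[Cluster]:
    """Return the cheapest cluster that can fit the workload, or None if none can."""
    candidates = [c for c in clusters if c.free_gpus >= gpus_needed]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.price_per_gpu_hour)


fleet = [
    Cluster("cluster-a", "cloud-1", free_gpus=4, price_per_gpu_hour=2.10),
    Cluster("cluster-b", "cloud-2", free_gpus=16, price_per_gpu_hour=2.60),
    Cluster("cluster-c", "cloud-3", free_gpus=32, price_per_gpu_hour=3.00),
]

small_job = pick_cluster(fleet, gpus_needed=2)
large_job = pick_cluster(fleet, gpus_needed=8)
huge_job = pick_cluster(fleet, gpus_needed=64)

print(small_job.name, large_job.name, huge_job)
```

The larger job skips the cheapest cluster because it lacks capacity, which is the case where having more than one provider matters. A real placement decision would also weigh latency, data locality, and committed-spend contracts, none of which are modeled here.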

The Bear Case: Hyperscalers, Margins, and Consumption Risk

The obvious risk is the hyperscalers. AWS Bedrock, Azure, and Google Cloud all want to own inference, and they own the underlying GPUs Baseten rents. If they bundle competitive inference into the cloud commitments enterprises already sign, Baseten has to keep being faster, cheaper, or more reliable to justify a separate vendor. That is a perpetual race against companies with effectively unlimited balance sheets.

The second risk is margin and consumption volatility. Because Baseten resells compute it buys, gross margins depend on how well it optimizes utilization — and a usage-based business is fully exposed to customer concentration and spend pullbacks. A run-rate that grew 1,900% on the way up can also decelerate quickly if a few large customers shift workloads or a budget cycle tightens.
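To see why utilization drives margin for a compute reseller, consider a deliberately simplified model: every rented GPU hour costs a fixed amount, but only the utilized fraction of those hours is billed to customers. The prices below are hypothetical and the formula is an illustration under that assumption, not Baseten's reported economics.

```python
def gross_margin(price_per_gpu_hour: float,
                 cost_per_gpu_hour: float,
                 utilization: float) -> float:
    """Gross margin when only a fraction of rented GPU hours is billed.

    Revenue per rented hour is price * utilization; cost per rented hour is fixed.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    revenue_per_rented_hour = price_per_gpu_hour * utilization
    return (revenue_per_rented_hour - cost_per_gpu_hour) / revenue_per_rented_hour


def break_even_utilization(price_per_gpu_hour: float, cost_per_gpu_hour: float) -> float:
    """Utilization at which billed revenue exactly covers rented compute."""
    return cost_per_gpu_hour / price_per_gpu_hour


PRICE = 4.00  # hypothetical price charged per billed GPU hour
COST = 2.00   # hypothetical cost paid per rented GPU hour

for utilization in (1.0, 0.8, 0.6, 0.5):
    margin = gross_margin(PRICE, COST, utilization)
    print(f"utilization {utilization:.0%}: gross margin {margin:.1%}")

print(f"break-even utilization: {break_even_utilization(PRICE, COST):.0%}")
```

In this toy setup, a 2x markup yields a 50% margin only at full utilization and nothing at half utilization, which is why idle capacity is the number to watch in a resale model.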

Baseten vs Together AI, Fireworks, Modal, and the Hyperscalers

The inference market is crowded with credible players attacking it from different angles.

Competitor | Angle | Baseten's Position
Together AI | Open-model inference + training | Production reliability across 18 clouds
Fireworks AI | Fast, low-cost model serving | Multi-cluster autoscaling at 1B+ calls/day
Modal / Replicate | Developer-friendly compute | Enterprise-grade observability + billing
AWS / Azure / GCP | Bundled cloud inference | Neutral, cross-cloud serving layer
Nvidia stack | Owns the silicon + software | Orchestration above the hardware layer

Baseten's differentiation is operational reliability and cross-cloud reach rather than the lowest sticker price on any single model.

What to Watch: Baseten's IPO Odds and the Key Metrics

With a fresh $1.5B Series F, Baseten is fully funded and unlikely to need public markets soon — a realistic IPO window is 2027 or later, and only if the run-rate sustains and margins prove out. Track the timeline against the rest of the field on the AI IPO Pipeline.

Three metrics decide the story: gross margin (can Baseten profitably resell compute it doesn't own), customer concentration (how much of the run-rate sits with a handful of large accounts), and net retention (does inference spend expand inside existing customers). If margins hold and the base diversifies, the $13B mark looks early. If a hyperscaler undercuts it or a few whales pull back, the multiple is exposed.
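Two of those metrics are easy to compute once revenue is broken out by customer. The Python sketch below shows one common way to define them: top-N revenue share for concentration, and prior-cohort revenue now versus then for net retention. The customer names and revenue figures are invented for illustration, since the source discloses neither.

```python
from typing import Mapping


def top_n_share(revenue_by_customer: Mapping[str, float], n: int) -> float:
    """Fraction of total revenue coming from the n largest customers."""
    amounts = sorted(revenue_by_customer.values(), reverse=True)
    return sum(amounts[:n]) / sum(amounts)


def net_revenue_retention(prior: Mapping[str, float],
                          current: Mapping[str, float]) -> float:
    """Current revenue from last period's customers divided by their prior revenue.

    Customers that churned count as zero; customers that are new this period are ignored.
    """
    prior_total = sum(prior.values())
    retained_total = sum(current.get(customer, 0.0) for customer in prior)
    return retained_total / prior_total


# Hypothetical revenue by customer, in $M of annualized run-rate.
last_year = {"customer-a": 100.0, "customer-b": 50.0, "customer-c": 50.0}
this_year = {"customer-a": 160.0, "customer-b": 50.0, "customer-d": 40.0}

print(f"Top-1 share: {top_n_share(this_year, 1):.0%}")
print(f"Net revenue retention: {net_revenue_retention(last_year, this_year):.0%}")
```

In this made-up example, retention is above 100% even though one customer churned, because a single large account expanded. That is exactly the pattern where strong retention and high concentration show up together.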

The single most important Baseten number isn't the $600M run-rate or the $13B valuation.

It's the gross margin on every inference call it resells.

If Baseten can run a neutral, cross-cloud serving layer at durable margins while inference demand compounds, it owns an entire infrastructure category. If the hyperscalers compress that margin to near zero, hypergrowth revenue still doesn't make a business. That's the whole bet.


Track Baseten and the rest of the AI infrastructure landscape — revenue, valuations, and IPO odds — on the AI Valuations Dashboard and the AI Spending Dashboard at Value Add VC. Originally published in the Trace Cohen newsletter.


Frequently Asked Questions

What is Baseten's ARR in 2026?

Baseten reached a roughly $600M annualized revenue run-rate by March 2026, up about 1,900% year-over-year and triple the $200M run-rate of December 2025. That run-rate is driven by usage-based consumption (customers paying for inference API calls and GPU time), so it scales directly with how much AI compute its customers run through the platform.

What is Baseten's valuation?

Baseten is valued at $13B after a $1.5B Series F in June 2026, led by Altimeter, Conviction, and Spark. The round was structured in two tranches priced at $11B and $13B. That is up from $5B just five months earlier, when its $300M Series E in January 2026 was led by IVP, CapitalG, and Nvidia.

How does Baseten make money?

Baseten runs a usage-based AI inference platform. Customers pay for API consumption on open-source models or for GPU minutes and hours, and Baseten packages the hard parts around it — autoscaling, observability, billing, and developer tooling. Revenue scales with consumption, so as customers push more inference workloads through Baseten, revenue grows with them.

Who are Baseten's competitors?

Baseten competes with other inference specialists like Together AI, Fireworks AI, Modal, and Replicate, as well as the hyperscalers' inference offerings (AWS Bedrock, Azure, and Google Cloud) and Nvidia's software stack. Its pitch is a faster, more reliable, multi-cloud inference layer optimized for open-source and custom models.

