In early 2023, GPT-4 input tokens cost $30 per million. By late 2025, GPT-4o had fallen to under $1.25. That's a 95%+ price collapse in under 30 months — and it's not slowing down.
The commoditization of foundation models is real and irreversible. But the conclusion most people draw from it — that AI application companies are therefore worthless wrappers — is one of the most persistent wrong takes in tech investing right now.
The Commoditization Numbers Are Real
The race to the bottom on inference pricing is accelerating from multiple directions simultaneously:
- GPT-4 → GPT-4o input pricing: ~$30/M → ~$1.25/M tokens
- Claude 3 Sonnet → Claude 3.5 Haiku, typical cost per task: ~$0.015 → ~$0.001 per call
- Llama 3.1 70B vs. GPT-4: benchmark parity on most enterprise tasks
- Gemini 1.5 Flash → Gemini 2.0 Flash: 4× faster, 50% cheaper, better performance
Open-source models from Meta, Mistral, and Alibaba are closing the quality gap with frontier proprietary models on most enterprise-relevant tasks. The infrastructure layer is commoditizing — that part is not in dispute.
Why "Wrapper" Became a Slur — and Why It's Misapplied
The wrapper dismissal emerged from a legitimate concern: in 2022 and 2023, hundreds of companies launched with thin ChatGPT integrations, no proprietary data, and no workflow depth. A prompt and a UI are not a business. That critique was correct for those companies.
But the same label got applied carelessly to companies that are doing something entirely different — building vertical-specific AI systems that use foundation models as a component, not as the product. The distinction matters enormously:
- A thin wrapper passes user input to GPT and returns the output: no proprietary data, no workflow integration, replaceable in a weekend.
- A vertical AI system is fine-tuned on domain-specific data, embedded in existing workflows, and owns the customer relationship and context layer.
- What gets commoditized: the model itself, the reasoning engine under the hood that anyone can now access for pennies.
- What does not get commoditized: the proprietary training data, the trust earned inside a regulated vertical, and the switching cost of a deeply integrated workflow.
The Companies Proving the Point
The most valuable AI companies of 2026 are all technically "wrappers" — they call foundation model APIs. But they are worth billions because of what they wrapped around the model:
Harvey ($3B+, legal AI): proprietary legal case data, BigLaw partnerships, compliance-grade audit trails, and 3+ years of fine-tuning on legal reasoning tasks that no public dataset covers.
Ambience Healthcare ($1B+, clinical documentation AI): real-time ambient recording trained on millions of clinical encounters, integrated into Epic EHR workflows, with HIPAA compliance and hospital network trust built over years.
Glean ($4.6B, enterprise search and knowledge AI): deep integrations with 100+ enterprise SaaS tools, an org-level knowledge graph, and permissions infrastructure that takes 6–12 months per enterprise to deploy and cannot be ripped out cheaply.
Observe.AI ($850M+, contact center intelligence): a training corpus of hundreds of millions of real customer service calls, QA automation embedded in agent workflows, and a coaching layer that improves performance over time.
The Four Layers That Actually Create Value
From 65+ investments across AI-native and AI-enabled companies, the durable value in application-layer AI concentrates in four places — none of which involve building a better model:
1. Proprietary training data
Not just scale — domain-specific data that no public dataset contains. Clinical encounter recordings. Legal case outcomes with attorney annotations. Manufacturing defect images from real production lines. This data took years to generate and cannot be bought or scraped.
2. Workflow ownership
AI that lives inside a customer's existing workflow — not as a tab they switch to, but as a layer embedded in the tools they already use. The more deeply integrated, the higher the switching cost. This is distribution advantage dressed up as a technical feature.
3. Trust and compliance infrastructure
In healthcare, legal, finance, and government, the approval to deploy AI in sensitive workflows takes years of relationship-building, audits, and compliance investment. That trust is a moat that a cheaper model cannot erode overnight.
4. The compounding context layer
Systems that get smarter with every customer interaction — because they store, index, and learn from domain-specific usage patterns. An AI that has processed 10 million clinical notes is qualitatively different from one that has processed zero, even if both call the same underlying model.
What This Means for Founders and Investors
The commoditization of foundation models is the best thing that has happened to application-layer startups. Here's why:
Margins expand as inference costs fall
If you charge $50k per enterprise seat per year at a 60% gross margin, your cost of revenue is about $20k. If inference is the bulk of that cost and its price drops 90%, your cost of revenue falls toward $2k and your gross margin climbs past 90%. The value you deliver didn't change; your cost structure did.
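To make the arithmetic concrete, here is a minimal sketch. The numbers are illustrative assumptions only (a $50k seat whose cost of revenue is entirely inference), not figures from any real company:

```python
def gross_margin(price: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (price - cogs) / price

# Assumed, illustrative numbers: a $50k/year enterprise seat
# whose cost of revenue is dominated by model inference.
price = 50_000
cogs_before = 20_000             # implies a 60% gross margin
cogs_after = cogs_before * 0.1   # inference pricing falls 90%

print(f"before: {gross_margin(price, cogs_before):.0%}")  # 60%
print(f"after:  {gross_margin(price, cogs_after):.0%}")   # 96%
```

The point of the sketch: a 90% drop in inference pricing does not just trim costs, it moves a software-typical 60% margin into the mid-90s, because the fixed price the customer pays is untouched.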
Customer adoption accelerates
Enterprises were skeptical about AI ROI when a pilot cost $500k in compute. At current inference pricing, a meaningful proof of concept costs under $5k. The sales cycle shortens dramatically.
Open-source doesn't hurt you — it helps you
Running Llama 3.1 70B on your own infrastructure instead of paying OpenAI improves unit economics and lets you fine-tune on proprietary data without sending it to a third party. Open-source is a cost advantage for the application layer.
The real competition is workflow, not model
Your competitor is not OpenAI. It is the other vertical AI startup going after the same buyer with the same workflow problem. That fight is won on distribution, trust, data depth, and product — not on model quality.
The model is the commodity. The wrapper is the business.
Companies that own the data, the workflow, and the customer trust are not wrappers — they are the next generation of enterprise software.
Explore AI company valuations and the application-layer landscape on the AI Valuations Dashboard at Value Add VC. Originally published in the Trace Cohen newsletter.