There are now over 4,000 AI agent startups globally. Fewer than 50 will build something that actually survives the next foundation model release cycle.
The reason is simple: most of them are building products on top of infrastructure they don't own. When the infrastructure shifts — and it will, on a 6-to-12 month cycle — they shift with it. Down.
The Agent Boom Is Real. The Product Framing Is Wrong.
In 2025, enterprise spending on AI agents crossed $12B, according to IDC estimates, and that number is expected to reach $47B by 2028. The demand is genuine: companies want autonomous systems that can handle customer onboarding, contract review, financial reconciliation, and support triage without human operators in the loop.
But here's what most founders miss: enterprises are not buying "agents." They are buying reliable workflow execution. The agent is the interface. The reliability layer is the product.
Think of it like TCP/IP versus email apps. The email app is the interface users see. TCP/IP is the infrastructure that makes it work at scale. Nobody pays for TCP/IP directly, but every email business depends on it completely. In agentic AI, the orchestration layer is TCP/IP. The "AI agent for legal review" is just the email app.
What Infrastructure Actually Means in the Agentic Era
When I talk to enterprise buyers — and I talk to a lot of them across my 65+ portfolio companies — the questions they ask about agent deployments are never "which LLM does it use?" The questions are:
"What happens when the agent fails mid-task?"
Enterprise workflows can't tolerate silent failures or half-executed operations.
"How does it handle context across a 30-step process?"
Most agent demos work on 3-step tasks. Real enterprise workflows are 30+ steps spanning hours.
"Where is the audit trail?"
Compliance and legal teams require every action to be logged, attributable, and reversible.
"How do we put a human back in the loop when confidence is low?"
Fully autonomous agents are a liability without escalation mechanisms.
"How does it integrate with our existing systems?"
Agents need to read and write to ERP, CRM, and custom databases — not just chat interfaces.
None of those questions are answered by the LLM. They are all answered by the infrastructure layer beneath it.
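To make that concrete, here is a minimal sketch of the kind of code that answers three of those questions (mid-task failure, audit trail, human escalation) without touching the model at all. Everything in it is an illustrative assumption rather than any vendor's API: the `Step`, `AuditEntry`, and `run_workflow` names, the 0.8 confidence threshold, the three-attempt retry limit, and the `approve` callback standing in for a human reviewer. It also assumes each step's `apply` either fully succeeds or raises without side effects.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    confidence: Callable[[dict], float]  # agent's self-reported confidence, 0..1
    apply: Callable[[dict], None]        # performs the action (assumed atomic)
    undo: Callable[[dict], None]         # compensating action used on rollback


@dataclass
class AuditEntry:
    step: str
    event: str  # "applied", "error", "escalated", "rejected", "rolled_back"
    detail: str = ""


def run_workflow(
    steps: List[Step],
    ctx: dict,
    audit: List[AuditEntry],
    approve: Callable[[Step, dict], bool],  # hypothetical human-review hook
    max_attempts: int = 3,
    min_confidence: float = 0.8,
) -> bool:
    """Run steps in order; return True only if every step was applied."""
    completed: List[Step] = []

    def roll_back() -> None:
        # Undo finished steps in reverse order so nothing is left half-executed.
        for prev in reversed(completed):
            prev.undo(ctx)
            audit.append(AuditEntry(prev.name, "rolled_back"))

    for step in steps:
        # Human-in-the-loop gate: low confidence pauses for approval.
        score = step.confidence(ctx)
        if score < min_confidence:
            audit.append(AuditEntry(step.name, "escalated", f"confidence={score:.2f}"))
            if not approve(step, ctx):
                audit.append(AuditEntry(step.name, "rejected"))
                roll_back()
                return False

        # Retry loop: every failure is logged, never swallowed silently.
        for attempt in range(1, max_attempts + 1):
            try:
                step.apply(ctx)
            except Exception as exc:
                audit.append(AuditEntry(step.name, "error", f"attempt {attempt}: {exc}"))
            else:
                audit.append(AuditEntry(step.name, "applied"))
                completed.append(step)
                break
        else:
            # All attempts failed: compensate for the steps already applied.
            roll_back()
            return False
    return True
```

Note what is absent: there is no model call anywhere in this function. A real system would also need durable storage for the audit log and idempotent steps, which this sketch leaves out.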
The Stack That Actually Matters
The defensible agentic AI stack has four layers. Most startups only build one of them.
Orchestration
Manages multi-step planning, tool selection, and task decomposition. The brain behind how an agent decides what to do next.
Memory
Stores and retrieves context across long-horizon tasks and sessions. Without persistent memory, every agent conversation starts from zero.
Tool Integration
Pre-built connectors to enterprise systems — Salesforce, SAP, Workday, Jira. Each integration is months of hardening work.
Reliability & Compliance
Audit logs, retry logic, human-in-the-loop gates, PII handling, and SOC 2 compliance. The layer that makes agents enterprise-deployable.
The LLM call itself — the part most "agent startups" are actually building — sits on top of all four of these layers. It is the least defensible part of the stack.
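One way to picture why the model call is the least defensible part is to sketch the stack with the model as a plain injected function. This is a toy illustration under stated assumptions, not a reference architecture: `AgentStack`, `TaskMemory`, the tool names `crm_lookup` and `erp_update`, and the contract that a model takes a prompt and returns a tool name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical contract: the model reads a prompt and returns a tool name.
ModelFn = Callable[[str], str]
ToolFn = Callable[[dict], dict]


@dataclass
class TaskMemory:
    """Memory layer: per-task notes that persist across steps."""
    notes: Dict[str, List[str]] = field(default_factory=dict)

    def recall(self, task_id: str) -> List[str]:
        return list(self.notes.get(task_id, []))

    def remember(self, task_id: str, note: str) -> None:
        self.notes.setdefault(task_id, []).append(note)


@dataclass
class AgentStack:
    model: ModelFn                 # the swappable part
    memory: TaskMemory             # memory layer
    tools: Dict[str, ToolFn]       # tool integration layer
    audit_log: List[str] = field(default_factory=list)  # reliability layer

    def step(self, task_id: str, request: str, payload: dict) -> dict:
        # Orchestration layer: assemble context, then ask the model what to do.
        history = self.memory.recall(task_id)
        prompt = f"request={request!r} history={history} tools={sorted(self.tools)}"
        choice = self.model(prompt)

        # Reliability layer: never execute a tool the stack does not know.
        if choice not in self.tools:
            self.audit_log.append(f"{task_id}: rejected unknown tool {choice!r}")
            raise ValueError(f"unknown tool: {choice}")

        result = self.tools[choice](payload)
        self.memory.remember(task_id, f"{choice} -> {result}")
        self.audit_log.append(f"{task_id}: called {choice}")
        return result


# Two stand-in "models". Swapping one for the other is a one-line change.
def model_a(prompt: str) -> str:
    return "crm_lookup"


def model_b(prompt: str) -> str:
    return "erp_update"


stack = AgentStack(
    model=model_a,
    memory=TaskMemory(),
    tools={
        "crm_lookup": lambda p: {"customer": p["id"]},
        "erp_update": lambda p: {"updated": p["id"]},
    },
)
stack.step("task-1", "onboard customer", {"id": 42})
stack.model = model_b  # new model release: memory, tools, and audit log stay put
stack.step("task-1", "onboard customer", {"id": 42})
```

After the swap, the task's memory and audit log still hold the history produced under the first model. In this sketch, at least, the accumulated state lives in the layers around the model rather than in the model itself.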
What Investors Are Actually Funding (and What They're Skipping)
I review roughly 20 agent company decks per month. The pattern is consistent: 85% of them lead with the demo, not the infrastructure. They show a 3-minute video of an agent completing a task that took a human 45 minutes. That demo is compelling for 30 days — until three competitors ship the same demo.
The 15% that get funded by serious investors lead with different slides:
- → Their proprietary workflow data — the accumulated context from 10,000+ completed agent tasks in a vertical
- → Their system integrations — 40+ pre-built enterprise connectors that took 18 months to harden
- → Their reliability metrics — 99.2% task completion rate vs 73% industry average
- → Their compliance architecture — the reason a Fortune 500 legal team approved them in 6 weeks
- → Their orchestration engine — a multi-agent coordination system that handles tasks humans take 4 hours to complete
Those are infrastructure assets, not product features. They take years to build and cannot be replicated by pointing GPT-5 at a new prompt template.
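A bit of back-of-the-envelope arithmetic shows why a reliability number is hard to copy from a demo. The sketch below assumes, purely for illustration, that each step in a workflow succeeds independently with the same probability; real failures are often correlated, and the 99% per-step figure is my assumption, not a measured one. The function names are hypothetical.

```python
def workflow_success_rate(per_step: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent steps."""
    return per_step ** steps


def required_per_step_rate(target: float, steps: int) -> float:
    """Per-step success rate needed to hit a target end-to-end rate."""
    return target ** (1 / steps)


# A 3-step demo versus a 30-step enterprise workflow at 99% per step.
for steps in (3, 30):
    print(steps, round(workflow_success_rate(0.99, steps), 3))

# Per-step reliability needed for 99.2% completion over 30 steps.
print(round(required_per_step_rate(0.992, 30), 5))
```

Under these assumptions, a 99%-reliable step gives roughly 97% end-to-end success on a 3-step demo but only about 74% on a 30-step workflow. Hitting 99.2% over 30 steps would take roughly 99.97% per step, which is the territory of retries, validation, and escalation paths rather than a better prompt.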
What to Build If You Believe This
What Wins Long-Term
- ✓ Vertical orchestration engines with accumulated workflow data
- ✓ Deep system integrations that took years to harden
- ✓ Compliance layers that clear enterprise legal review
- ✓ Memory systems that improve with every completed task
- ✓ Human-in-the-loop frameworks for low-confidence actions
What Gets Commoditized Fast
- ✕ Thin wrappers around a single LLM's function calling
- ✕ Agent demos with no reliability or audit infrastructure
- ✕ Horizontal "build any agent" platforms with no vertical depth
- ✕ Products where the entire moat is a well-crafted system prompt
- ✕ Anything that can be replicated in a weekend by an OpenAI engineer
The agents are not the product.
The infrastructure that makes agents enterprise-deployable, auditable, and reliable is the product. Build the plumbing, not just the faucet.
Tracking agentic AI investment trends at Value Add VC. Originally published in the Trace Cohen newsletter.