Value Add VC
AI & Technology · June 23, 2026 · 9 min read · Last updated: June 23, 2026

OpenAI Deep Research: The Feature That's Replacing Junior Analyst Work

Launched February 2, 2025, OpenAI Deep Research turns one prompt into a 10-to-30-page cited report in 5 to 30 minutes. It scores 26.6% on Humanity's Last Exam — roughly 8x GPT-4o — and costs $20 to $200 a month. Here's what it does, what it can't, and why it's the first AI feature that replaces a task, not a keystroke.

Trace Cohen
Co-Founder & GP at Six Point Ventures · 3x founder (BrandYourself, Launch.it, SPOT) · 65+ investments · Based in Boca Raton, FL
@Trace_Cohen · t@nyvp.com · South Florida Advisory

Quick Answer

$20 to $200 a month gets you OpenAI Deep Research, an agentic ChatGPT feature launched February 2, 2025 that runs 5 to 30 minutes of autonomous web research and returns a fully cited report. Powered by a version of o3, it scored 26.6% on Humanity's Last Exam — roughly 8x GPT-4o's 3.3% — and compresses hours of junior-analyst work into minutes.

OpenAI Deep Research turns a single prompt into a 10-to-30-page cited report in 5 to 30 minutes — work that used to take a junior analyst a full day or two. That's the short answer. The longer answer is why it's the first AI feature that replaces a task, not just a keystroke.

OpenAI shipped Deep Research on February 2, 2025, and it landed differently from the chatbot updates that came before it. A normal ChatGPT reply answers in seconds from what the model already knows. Deep Research does something closer to what an analyst does: it plans a research approach, opens and reads dozens of web pages, revises its plan as it learns, and then writes a structured report with inline citations. You don't watch it type — you give it a brief and come back to a finished document.

What the OpenAI Deep Research Feature Actually Is

The OpenAI Deep Research feature is an agentic mode inside ChatGPT, launched February 2, 2025, that autonomously browses the web for 5 to 30 minutes, reads dozens of sources, and writes a structured, fully cited report. It runs on a version of OpenAI's o3 reasoning model and is built for multi-step research, not quick chat answers.

The mechanical difference matters. Standard ChatGPT is reactive — one prompt, one answer. Deep Research is a loop: it decomposes your question, searches, reads, notices gaps, searches again, and only stops when it has enough to write. A single run can touch 30 to 100+ web pages and cite dozens of them inline, so you can click through and check the source behind any claim. OpenAI later added a faster, lighter version powered by an o4-mini-class model to handle simpler queries and raise usage limits, but the headline product is the slow, thorough one you delegate real work to.
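
To make the loop concrete, here is a minimal sketch of the search, read, check-for-gaps, search-again cycle. It is an illustration only, not OpenAI's implementation: `TOY_WEB`, `ResearchState`, `search`, and `run_research` are hypothetical names, the "web" is a three-page in-memory dictionary, and the sub-questions are passed in by hand rather than decomposed by a model.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the open web: page id -> (topics covered, text).
TOY_WEB = {
    "page-a": ({"funding"}, "Startup X raised a Series B."),
    "page-b": ({"backers"}, "Startup X is backed by Fund Y."),
    "page-c": ({"funding", "thesis"}, "Startup X bets on cheaper inference."),
}


@dataclass
class ResearchState:
    open_questions: set[str]                                    # sub-questions still unanswered
    notes: dict[str, list[str]] = field(default_factory=dict)   # sub-question -> supporting text
    citations: list[str] = field(default_factory=list)          # pages read, in order


def search(topic: str) -> list[str]:
    """Toy search: return ids of pages tagged with the topic."""
    return [pid for pid, (topics, _) in TOY_WEB.items() if topic in topics]


def run_research(sub_questions: list[str], max_rounds: int = 5) -> ResearchState:
    state = ResearchState(open_questions=set(sub_questions))
    for _ in range(max_rounds):
        if not state.open_questions:          # nothing left to answer: stop and write
            break
        for topic in sorted(state.open_questions):
            for pid in search(topic):         # search, then "read" each hit
                _, text = TOY_WEB[pid]
                state.notes.setdefault(topic, []).append(text)
                if pid not in state.citations:
                    state.citations.append(pid)
        # Gap check: any sub-question with notes counts as answered.
        state.open_questions -= set(state.notes)
    return state


state = run_research(["funding", "backers", "thesis", "revenue"])
print(state.citations)       # ['page-b', 'page-a', 'page-c']
print(state.open_questions)  # {'revenue'}
```

The toy run ends with "revenue" still open because no page covers it, which is the same situation a real run hits when the answer isn't on the public web. The `max_rounds` cap is an assumption standing in for whatever time or compute budget the real agent works under.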

OpenAI Deep Research vs Gemini, Perplexity, and the Field

"Deep research" became a category in 2025, and by 2026 every major AI lab ships a version. They cluster on the same idea — an agent that researches for minutes and returns a cited report — but differ sharply on depth, speed, and price. Here is how OpenAI Deep Research stacks up against the tools people compare it to most.

Tool | Underlying Model | Humanity's Last Exam | Time / Report | Entry Price | Best At
OpenAI Deep Research | o3 (deep research) | ~26.6% | 5–30 min | $20/mo | Deepest, best-structured long reports
Google Gemini Deep Research | Gemini 2.5 Pro | ~mid-20s% | 5–15 min | $20/mo (One AI) | Breadth + Google ecosystem integration
Perplexity Deep Research | Multi-model | ~21% | 1–3 min | Free / $20 Pro | Speed and a usable free tier
xAI Grok DeepSearch | Grok 4 | ~25–38%* | 2–10 min | $30/mo | Real-time X / social signal
Anthropic Claude Research | Claude Opus 4.x | ~mid-20s% | 3–15 min | $20/mo | Careful reasoning, fewer hallucinated cites
GPT-Researcher (open source) | Bring-your-own | n/a | Varies | Free + API | Self-hosted, scriptable pipelines

Figures are mid-2026 estimates blended from vendor announcements (OpenAI, Google, Perplexity, xAI, Anthropic) and the Humanity's Last Exam public results. *Grok scores vary widely with tool use enabled vs disabled. Benchmark numbers shift by harness and scaffolding; entry price reflects the lowest paid tier that unlocks deep research.

The honest read: OpenAI Deep Research wins on depth and structure, Perplexity wins on speed and price, and Gemini wins if you already live in Google Docs and Sheets. None of them wins outright, so serious users run two and route by task. For an investor mapping a sector, OpenAI's thoroughness is worth the wait; for a quick fact-check, Perplexity's sub-3-minute turnaround usually beats it.

OpenAI Deep Research Pricing: What $20 vs $200 a Month Buys

OpenAI Deep Research has no standalone price — it is bundled into ChatGPT subscriptions, and the tier you pick controls how many reports you can run per month rather than which features you unlock. The expensive plans buy volume and priority, not a different product.

Plan | Price | Deep Research / Month | Who It's For
ChatGPT Free | $0 | Limited lite version | Trying it out occasionally
ChatGPT Plus | $20/mo | ~25 full runs | Individuals, regular research
ChatGPT Team | $25/user/mo | ~25 + shared workspace | Small teams with admin needs
ChatGPT Pro | $200/mo | ~250 full runs | Power users, daily reports
ChatGPT Enterprise | Custom | High / negotiated | Orgs needing SSO + controls
Deep Research API | Per-token | Usage-based | Embedding research in products

Figures are mid-2026 estimates based on OpenAI's published ChatGPT pricing and stated deep research usage allowances, which OpenAI has revised upward several times since the February 2025 launch. Lite (o4-mini-class) runs are counted separately and have higher limits at every tier.

The math is what makes this disruptive. A single junior analyst costs $80,000 to $120,000 in base salary and well over $100,000 fully loaded. The $200 Pro plan, about $2,400 a year, runs roughly 250 deep reports a month, or 3,000 a year. Even if you throw away half the output as not good enough, that leaves 1,500 usable reports at about $1.60 each. That gap is exactly why operators are rethinking how many bodies a research function needs, a shift we track across our AI Valuations dashboard.
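
The arithmetic is easy to rerun with your own numbers. The sketch below uses the article's estimated allowances (about 250 runs a month on Pro, about 25 on Plus) and the 50% discard rate from the paragraph above. The function name `cost_per_usable_report` is made up for illustration, and the inputs are estimates rather than guaranteed limits.

```python
def cost_per_usable_report(monthly_price: float, runs_per_month: int, usable_share: float) -> float:
    """Annual subscription cost divided by the number of reports you actually keep."""
    annual_cost = monthly_price * 12
    usable_reports = runs_per_month * 12 * usable_share
    return annual_cost / usable_reports


# Article's estimates: Pro is $200/mo for ~250 runs, Plus is $20/mo for ~25 runs.
pro = cost_per_usable_report(200, 250, usable_share=0.5)
plus = cost_per_usable_report(20, 25, usable_share=0.5)

print(f"Pro:  ${pro:.2f} per usable report")   # Pro:  $1.60 per usable report
print(f"Plus: ${plus:.2f} per usable report")  # Plus: $1.60 per usable report
```

Under these estimates both tiers cost the same per usable report, which matches the point that the expensive plan buys volume rather than a different product.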

How Good Is It, Really: Benchmarks and the Hallucination Problem

The benchmark that made people pay attention was Humanity's Last Exam — roughly 3,000 expert-level questions across more than 100 subjects, designed to be brutally hard. OpenAI Deep Research scored about 26.6%, against 3.3% for GPT-4o and around 9% for o1 without browsing. An 8x jump over GPT-4o is real progress.

Humanity's Last Exam scores:
GPT-4o: 3.3% (no browsing/tools)
o1: ~9% (reasoning, no web)
Deep Research: 26.6% (agentic web research)
Human expert: ~90%+ (domain specialist)

But read 26.6% the other way: it still gets roughly three out of four hard questions wrong. And the failure mode is dangerous because it is confident. Deep Research can write a fluent, well-cited paragraph where the citation only loosely supports the claim — or where a real source is summarized slightly wrong. OpenAI's own launch notes flagged that it can still hallucinate facts and struggle to distinguish authoritative sources from rumor. The practical rule is simple: treat every report as a strong first draft from a fast, tireless researcher who occasionally makes things up. You verify before you act.
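
Both readings of the same number can be checked in a few lines. The scores are the ones quoted in this article; the o1 figure is approximate.

```python
# Percent correct on Humanity's Last Exam, as quoted in the article (o1 is approximate).
scores = {"GPT-4o": 3.3, "o1": 9.0, "Deep Research": 26.6}

ratio = scores["Deep Research"] / scores["GPT-4o"]
miss_rate = 100 - scores["Deep Research"]

print(f"Deep Research vs GPT-4o: {ratio:.1f}x")   # Deep Research vs GPT-4o: 8.1x
print(f"Hard questions missed: {miss_rate:.1f}%")  # Hard questions missed: 73.4%
```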

Who Should Use the OpenAI Deep Research Feature — and Who Shouldn't

Deep Research shines on questions that are wide, public, and synthesizable: market landscapes, competitive teardowns, literature reviews, regulatory summaries, and first-draft investment memos. For a VC, "map every funded company building AI inference chips, with funding, backers, and a one-line thesis on each" is a near-perfect prompt — exactly the kind of grind that used to eat an analyst's afternoon. The more specific your brief and the more the answer lives in public text, the better it does.

It is a poor fit where the answer isn't on the open web. Private market data behind paywalls, proprietary datasets, anything requiring original primary interviews, or fast-moving stories where the truth changed an hour ago — these expose its limits. It also can't be accountable: it won't get fired for a wrong number, won't sit in the partner meeting, and won't know which of two conflicting sources your firm trusts. That judgment layer is still yours.

The real shift for founders and operators is what it does to the org chart. When one prompt produces what took a junior a day, the bottleneck moves from gathering information to specifying questions well and verifying answers fast. That doesn't zero out the analyst role — it raises the floor on it, the same way coding agents raised the floor on engineers. We dig into how that capital efficiency is re-rating software margins on the AI Spending dashboard.

OpenAI Deep Research isn't a smarter search box.

It's a research analyst you delegate to — at $20 to $200 a month instead of a $100K+ salary.

It scores 26.6% on the hardest benchmark we have, writes in minutes what took a person a day, and still gets enough wrong that a human has to check it. The teams winning with it aren't the ones who trust it blindly — they're the ones who learned to ask precise questions and verify the answers fast.


Track frontier AI models, agent economics, and valuations across OpenAI, Anthropic, Google, and xAI on the AI Valuations Dashboard and AI Spending Dashboard at Value Add VC. Originally published in the Trace Cohen newsletter.


Frequently Asked Questions

What is the OpenAI Deep Research feature?

OpenAI Deep Research is an agentic mode inside ChatGPT, launched February 2, 2025, that autonomously browses the web for 5 to 30 minutes, reads dozens of sources, and writes a structured, fully cited report. It runs on a version of OpenAI's o3 reasoning model and is built for multi-step research questions, not quick chat answers. You give it one detailed prompt and walk away while it works.

How much does OpenAI Deep Research cost in 2026?

OpenAI Deep Research is bundled into ChatGPT plans, not sold separately: $20/month on Plus, $25/user on Team, and $200/month on Pro, with Enterprise priced custom. The tier determines how many deep research tasks you can run per month — roughly 25 on Plus and 250 on Pro as of 2026 — plus access to the faster, lighter version. There is no standalone Deep Research subscription.

How accurate is OpenAI Deep Research?

On Humanity's Last Exam, a benchmark of expert-level questions across more than 100 subjects, OpenAI Deep Research scored about 26.6% — versus 3.3% for GPT-4o and roughly 9% for o1 without browsing tools. That is a large jump, but it still gets roughly three of four hard questions wrong, and it can present confident citations that don't fully support its claims. Every report needs a human fact-check before you rely on it.

Is OpenAI Deep Research better than Gemini or Perplexity Deep Research?

It depends on the job. OpenAI Deep Research generally produces the most thorough, well-structured long reports and leads on Humanity's Last Exam, but it is slower (5–30 minutes) and gated behind paid ChatGPT tiers. Perplexity Deep Research is far faster (often under 3 minutes) and has a free tier; Google Gemini Deep Research is strong and bundled into Google One. For depth you pick OpenAI; for speed and cost you pick Perplexity.

Can OpenAI Deep Research replace a junior analyst?

It can replace a large share of the research tasks junior analysts do — literature scans, market sizing, competitive landscapes, and first-draft memos — at a cost of $20–$200/month versus a fully loaded junior salary often above $100,000 a year. What it cannot replace is judgment, source vetting, original primary research, and accountability. In practice it shifts the human role from gathering information to verifying and deciding on it.


Trace Cohen is a serial founder, investor, and data geek. Please feel free to reach out at t@nyvp.com.
