Cloudflare announced on July 1, 2026 that it is drawing a hard line between traditional search crawlers and AI crawlers used for training or agentic retrieval. Starting September 15, Cloudflare's default settings will block 'mixed-use' crawlers — bots that claim to serve search but also feed AI training or agent products — from accessing any page that carries advertising, unless the AI company has a compensation arrangement with the publisher. The new defaults apply automatically to all free-tier customers and every new site, meaning millions of publishers get the protection without lifting a finger.
The policy builds on Cloudflare's 2025 'Pay Per Crawl' feature, which let publishers charge a fee each time an AI bot fetched their content. That model drew criticism for paying publishers regardless of whether the crawled content actually appeared in, or influenced, an AI answer. The evolution — 'Pay Per Use' — instead compensates publishers when their content demonstrably creates value: when it shows up in AI search results or gets accessed to answer a specific query. Cloudflare is launching the model with two initial partners, Ceramic.ai and You.com.
The move is a direct response to years of publisher complaints that AI labs have scraped copyrighted content at scale without permission or payment, hollowing out the traffic and ad revenue that used to flow from search referrals. Cloudflare sits in front of a meaningful share of the web's traffic, giving it outsized leverage to force the issue where lawsuits (New York Times v. OpenAI, and others still working through courts) have moved slowly.
“The policy builds on Cloudflare's 2025 'Pay Per Crawl' feature, which let publishers charge a fee each time an AI bot fetched their content.”
The numbers in context: this is a default-on policy change affecting Cloudflare's entire free customer base — likely millions of sites — not an opt-in program that only motivated publishers would adopt. That scale is what separates it from prior individual publisher-AI licensing deals (News Corp-OpenAI, Axel Springer-OpenAI, Reddit-Google), which were negotiated one at a time and covered a tiny fraction of the open web.
For founders and operators, the implication depends entirely on which side of the transaction they sit on. AI-native startups that depend on open-web data for training or retrieval now face a real, dated cost line they didn't have to budget for six months ago, while publisher-adjacent businesses — content platforms, media companies, anyone monetizing owned content — just got a structural tailwind they didn't have to lobby for. LPs with exposure to either side of that divide should expect margin impacts to show up within two quarters of the September 15 deadline.
What to watch: how AI labs respond — whether they negotiate bulk licensing deals with Cloudflare-fronted publishers or route around the block with alternative crawling infrastructure, whether other CDNs and infrastructure providers (Akamai, Fastly) adopt similar defaults, and whether the September 15 deadline triggers a wave of last-minute publisher-AI licensing announcements.