The New York Times has moved to amend its copyright lawsuit against OpenAI and Microsoft, sharpening its central argument in a way that puts Microsoft's own conduct -- not just OpenAI's models -- squarely in the crosshairs. In the amended complaint, the Times alleges Microsoft did far more than supply generic cloud computing: it purpose-built a supercomputer, reportedly comprising on the order of 285,000 CPU cores and roughly 10,000 GPUs, specifically to enable OpenAI to train its models on vast quantities of copyrighted material, including the Times's journalism.
The suit is not new -- the Times filed its original complaint on December 27, 2023, accusing the two companies of using millions of its articles without permission to build ChatGPT and related products. What changed is the framing. After a court ruling shifted the legal standard for the case, the Times narrowed its claims, dropping a pair of allegations including trademark dilution, and refocused its contributory-infringement theory on the specific machinery and decisions that made large-scale training possible. The message is that the infrastructure provider is not a passive bystander.
This is the bellwether case for the entire generative-AI industry's relationship with the data it was built on. The Times was the first major news organization to sue, and its complaint has become the template that authors, artists, music labels and other publishers have followed. How a court treats the argument that building and operating training compute can constitute contributory infringement will ripple through dozens of parallel suits and set the terms on which models can legally ingest copyrighted work.
“The message is that the infrastructure provider is not a passive bystander.”
The stakes are existential for the economics of frontier AI. OpenAI and its peers have largely argued that training on publicly available text is transformative fair use. If courts instead find that the act of assembling the compute to copy and train on protected works exposes both the lab and its infrastructure partner to liability, the cost of training could balloon through licensing deals, settlements or damages. Microsoft, which has poured tens of billions into OpenAI and built much of its Azure AI business around the partnership, has more to lose from an adverse infrastructure ruling than almost anyone.
The competitive context matters too. Several labs have already chosen to pay rather than fight: OpenAI itself, Google, Anthropic and others have signed licensing agreements with publishers and platforms, and the Times's escalation raises the price of holding out. A win for the Times would accelerate a shift toward a licensed-data regime that favors the best-capitalized incumbents -- the very companies that can afford nine-figure content deals -- while squeezing smaller labs and open-weight challengers that rely on scraped corpora.
For founders and investors, the read is that 'data provenance' is moving from a compliance footnote to a core diligence question. Startups building on top of foundation models inherit the legal exposure of whatever those models were trained on, and acquirers are increasingly asking where the training data came from. The same week the federal government began vetting who can access the most capable models, the courts are testing who had the right to build them in the first place -- two different chokepoints tightening on the same industry.
The bear case for over-reading it: this is a single amended complaint in a years-long case, fair-use doctrine has historically given transformative uses wide latitude, and the parties may yet settle, as many AI copyright disputes have. A licensing detente that lets the Times monetize its archive and lets OpenAI keep shipping is a plausible -- even likely -- endgame. What to watch: how the judge treats the contributory-infringement theory against Microsoft specifically, whether other publishers amend their complaints to mirror it, and whether the threat of infrastructure liability pushes the remaining holdout labs toward the licensing table.