Crucial Victory for Tech Companies Using Copyrighted Works for AI Training

On June 23, 2025, the Northern District of California issued a critical decision applying the fair use doctrine to the use of copyrighted works in training generative AI. It held that the unauthorized use of plaintiffs’ books for AI training does not constitute copyright infringement under certain circumstances. Bartz v. Anthropic PBC, Case 3:24-cv-05417-WHA (N.D. Cal.).

Defendant Anthropic PBC is an artificial intelligence company whose core product Claude generates human-like text mimicking reading and writing. Claude generates over $1 billion in annual revenue. To create Claude, Anthropic acquired millions of copyrighted books to train large language models (“LLMs”). It first downloaded books in digital form from pirate sites instead of buying them to avoid the “legal/practice/business slog”, as explained by its cofounder. At some point it became concerned about continuing this practice and hired Google’s former head of book scanning to obtain “all the books in the world”. It spent millions to purchase used books, which were then stripped from their bindings, cut to size, and scanned into digital form, discarding the originals and resulting in PDF copies with machine-readable text.

Anthropic used these acquisitions to create a central research library and to train LLMs. The plaintiff authors’ books were especially valuable because of their creative expression. Claude’s customers demanded accurate and compelling writing. Each work was copied and compressed, and the LLMs memorized these works. However, Claude had a built-in filter that prevented the output of these works. Claude could help less capable writers create works as well-written as the authors’ and compete in the same categories, but Claude created no exact copy or substantial knock-off.

The three plaintiff authors filed this class action alleging that Anthropic infringed their copyrights. Anthropic moved for summary judgment only on fair use. Plaintiffs contended that Anthropic copied their books for two uses: (1) to build a vast central library of useful content and (2) to train specific LLMs. They also complained that the print-to-digital format change itself was an infringement. After a lengthy in-depth analysis of the doctrine of fair use, the Court ruled:

Copies used to train LLMs were justified as fair use: “The technology was among the most transformative many of us will see in our lifetimes”;
The copies made in converting purchased print library copies into digital library copies constituted fair use; and
Use of the downloaded pirated copies was not justified by fair use.

Plaintiffs argued that Anthropic infringed on their works by using them to train Claude to produce other works that would compete with them. In rejecting this position, the Court explained that the LLMs did not reproduce to the public any given work. Instead, Claude outputs grammar, composition, and style that the LLM distilled from thousands of works.

But if someone were to read all the modern-day classics because of their exceptional expression, memorize them, and then emulate a blend of their best writing, would that violate the Copyright Act? Of course not. Copyright does not extend to methods of operation, concepts, or principles illustrated or embodied in a work.

The Court emphasized that the purpose and character of using copyrighted work to train LLMs to generate new text was “quintessentially transformative – spectacularly so”. It also ruled that for the purchased copies converted from print to digital, the format change constituted fair use. Anthropic destroyed each print copy and replaced it with a digital copy to ease storage and enable searchability. “Storage and searchability are not creative properties of the copyrighted work itself but physical properties of the frame around the work. . .”

Concerning piracy, Anthropic initially downloaded over seven million copies of books, paid nothing, and kept them in its library. “The person who copies the textbook from a pirate site has infringed already, full stop”. Piracy was the point – to build a library without paying for it. A trial will be held on the pirated copies used to create Anthropic’s central library, along with resulting damages, actual or statutory, including for willfulness. Plaintiffs are expected to appeal. Despite this win, Anthropic still faces a potentially huge judgment when the issue of the copyrighted works it “stole” is tried.

This is a significant development for the tech industry, which faces similar lawsuits brought by authors and news outlets, including the New York Times. On June 25, 2025, a different judge ruled that Meta’s use of copyrighted books for AI training constituted fair use. Thirteen authors sued Meta for downloading their books to train its AI model Llama. The Court found Meta’s use highly transformative but entered summary judgment in its favor only because plaintiffs failed to present any evidence on the “potentially winning argument” that Meta will likely flood the market with similar works causing dilution. Kadrey v. Meta Platforms Inc., 2025 WL 1752484 (N.D. Cal. June 25, 2025).

Interestingly, the Kadrey court criticized the Anthropic decision described above because that Court “brushed aside” concerns about harm to the market for the works the models are trained on. It noted that using books to teach children to write is not remotely like using books to create a product that enables a person to generate countless competing works with a minuscule fraction of the time and creativity it would otherwise take. “This inept analogy is not a basis for blowing off the most important factor in the fair use analysis”.

About Jacqueline A. Criswell

Jacqueline Criswell chairs the firm’s Intellectual Property group and has been a commercial litigator for more than 30 years. She practices in all aspects of intellectual property law, including trademarks, trade dress, misappropriation of trade secrets, copyrights, e-commerce, and false advertising matters. Jackie acted as lead trial counsel in cases involving novel issues, including the interplay of permissible parody and the First Amendment, functionality of product designs, copyright protection for useful articles, and Digital Millennium Copyright Act damages. She also litigates claims for misappropriation of trade secrets against former employees and competitors, as well as piracy and economic espionage.
