Copyright’s Big Win in the First Decided US Artificial Intelligence Case

Back in March of 2023, when there were only a handful of cases alleging copyright infringement for training purposes by AI companies, I predicted that we would soon have some guidance from the court in Thomson Reuters Enterprise Center GmbH and West Publishing Corp. v. Ross Intelligence, Inc. Predicting the timing of court decisions is a fool’s errand, and this fool was repeatedly wrong in his predictions on timing. Nonetheless, on February 11, the Ross case did in fact become the first US decision on the merits to directly address copying to train AI. Now we have a clear decision, and it is favorable for rightsholders.

The case arose out of the surreptitious copying of the entire Westlaw database (after having been denied a license) by a company that wanted to create an arguably competing product. Unlike some of the generative AI cases, there was no claim that the AI’s output included the copyrightable content. As the Court noted, this was not about generative AI. This case was purely about training.

A large portion of the decision concerns the question of whether the defendant copied material that is copyrightable, given that West’s copied “headnotes” and “Key Number System” are factual and based on the unarguably public domain caselaw. I will largely skip the discussion of copyrightability as it is particularly related to these facts, although I did enjoy the almost poetic nature of the inquiry:

More than that, each headnote is an individual, copyrightable work. That became clear to me once I analogized the lawyer’s editorial judgment to that of a sculptor. A block of raw marble, like a judicial opinion, is not copyrightable. Yet a sculptor creates a sculpture by choosing what to cut away and what to leave in place. That sculpture is copyrightable…. So too, even a headnote taken verbatim from an opinion is a carefully chosen fraction of the whole. Identifying which words matter and chiseling away the surrounding mass expresses the editor’s idea about what the important point of law from the opinion is.

More than copyrightability, to me the most important question is “was the copying fair use?” The answer is “no.”

Fair use analysis requires a review of four factors, and the same acts can be considered fair if done by one party (such as a student) and not fair use if done by another for the benefit of the first (e.g., a copy shop).

In this case, the Court found that two factors favored the plaintiff, and two the defendant. Fortunately for Thomson Reuters, the two in its favor tend to be viewed as more important.

Factor one is the “nature of the use.” The Court looked “mainly at whether [the use] was commercial and whether it was transformative.” There was no dispute that the use was commercial. The Court also found that the use was not transformative. The reasoning is a bit complicated, but the Court seemed largely swayed by the Andy Warhol case, which, as I mentioned in a previous post, has been interpreted as putting the transformative use “blob back in the bucket.” The Court was also swayed by the fact that ultimately defendant would have competed with plaintiff. Competition will, of course, be a mainstay of the pending AI litigations. Courts will need to evaluate other forms of competition, including without limitation (1) whether the unlicensed use of materials in training competes with licensed use of such materials, (2) whether the unlicensed use of materials in training competes with offers of licenses for use of such materials, and (3) whether the output competes with the input in a generative AI context.

In finding that factor two, the nature of the work, favored Ross, the Court noted “Westlaw’s material has more than the minimal spark of originality required for copyright validity. But the material is not that creative.” Likewise, factor three (substantiality of the amount copied) favored Ross largely because the copied works were not in the output. For what it’s worth, I disagree with the Court on this one. Ross copied entire works for training.

Copyright lawyers know the most important factor is factor four, “market harm.” Here, Ross was destroyed. First, it intended to create a competing product. Second, the Court noted that it “must consider not only current markets but also potential derivative ones ‘that creators of original works would in general develop or license others to develop.’” The burden of proof was on Ross to show that there was no licensing market, and it failed to do so. With new agreements regularly announced between AI firms and publishers, and with AI rights being made available under collective licenses, this burden will become harder for defendants.

What does it mean?

I have been surprised in the last few years by the willingness of people who know better to declare unconditionally that the “making of copies to train AI is categorically fair use.” That is not how copyright works. Fair use requires an analysis of case-specific facts. With the first case now decided, hopefully that line will be silenced.

For those who wish to minimize the impact, the first argument will be “this is just one district court, and there may be an appeal.” True enough, but that does not end the inquiry. Courts are free to consider any cases they wish.

The next line of argument, which I have already heard, is “this case does not involve generative AI, so it does not matter to those cases.” While it is true that this is not a generative AI case, I cannot see how that makes this case less impactful. Frankly, if there were output that competed with the input, I think the Court would have had an easier time simply finding infringement. All AI starts with the type of copying and training done by Ross. Some AI systems then make further copies. The Ross case might not be the same as a generative AI case, but it is more similar than anything else decided to date. This case involves training, as do the other pending AI cases.

Does this mean that the plaintiffs will win all of the generative AI cases filed? Of course not. That is not how copyright works. Fair use requires an analysis of case-specific facts. Still, defendants now have a much higher burden to overcome and will need to find ways to distinguish their cases from this one.

Roy Kaufman

Roy Kaufman is Managing Director of both Business Development and Government Relations for the Copyright Clearance Center (CCC). Prior to CCC, Kaufman served as Legal Director, John Wiley and Sons, Inc. He is a member of, among other things, the Bar of the State of New York, the Authors Guild, and the editorial board of UKSG Insights. Kaufman also advises the US Government on international trade matters through membership in the International Trade Advisory Committee (ITAC) 13 – Intellectual Property and the Library of Congress’s Copyright Public Modernization Committee, in addition to serving on the Board of the United States Intellectual Property Alliance (USIPA).

Discussion

3 Thoughts on "Copyright’s Big Win in the First Decided US Artificial Intelligence Case"

“(2) whether the unlicensed use of materials in training competes with offers of licenses for use of such materials” I know there have been cases in the US that have ruled this way, but it is horrifically wrong. It posits that if every rightsholder participated in a system to provide reasonable paid licenses, then Fair Use would just cease to exist. The mere existence of an option to pay for such a license must NOT be used against a finding of Fair Use itself, since the very point of Fair Use is that a license is not necessary and the rightsholder is NOT entitled to any payment (under the appropriate circumstances dictated by the 4 factors, of which this argument cannot be one).