Creative Commons (CC) licenses have become the lingua franca of sharing in the digital age. From research articles to photographs to slide decks to learning objects, they are the scaffolding of an open culture that invites reuse, remixing, and redistribution. Their icons — those familiar CC circles — signal permission and generosity, the optimism of a democratizing Internet where information flows freely for the benefit of all.

Yet, for all their ubiquity, CC licenses are also misunderstood. The very utility that makes them powerful — their ability to simplify and pre-authorize use — can obscure a key fact: a CC license can only grant rights that users otherwise would not have. It cannot restrict what is already allowed under fair use, right of first sale, or other copyright exceptions or limitations. Increasingly, it seems this distinction is being lost in public and academic discussions of open access, particularly in debates over the use of scholarly materials to train large language models (LLMs).

Porte de la Reine at Aigues-Mortes, Jean-Frédéric Bazille, 1867

To understand the confusion, it helps to recall the historical moment in which Creative Commons emerged. In the early 2000s, the vision of open access was one in which “The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.” But the default legal position of copyright — “All rights reserved” — did not support this ideal. Creative Commons licenses respond to this challenge by offering a standardized legal tool that allows creators to proactively grant permissions to the public.

The success of CC licenses is undeniable in many ways. They are now widely used across many fields and have enabled creativity and knowledge development. Musicians share remixes, educators more easily use materials in courses, and scholars can freely share articles unrestricted by institutional subscriptions and paywalls. But the simplicity of CC licenses (the ease of choosing “BY,” “SA,” or “NC” from a dropdown) also means that users can easily forget the deeper legal terrain beneath them. A CC license does not replace copyright law; it operates within it. It tells users, “here is what you can do without asking,” but it does not take away what they could have done if there were no CC license applied.

Fair Use Still Stands

Copyright, at its core, is about control. It grants creators exclusive rights to reproduce, distribute, and adapt their works. But that exclusivity has always been tempered by limitations and exceptions, such as fair use, which protect the public’s ability to quote, parody, critique, or engage in transformative uses of copyrighted materials without permission.

When a copyright owner applies a CC license, they are not rewriting these boundaries. Instead, they are expanding the set of authorized uses beyond those boundaries. That is why every CC license — from the most permissive (CC BY) to the most restrictive (CC BY-NC-ND) — is a liberalization relative to the default copyright. None of the CC licenses can make a use less legal than it would have been under copyright alone.

Consider the case of accessibility. U.S. law allows the creation of accessible versions of copyrighted works (for instance, converting text to braille or audio) for people with disabilities. A creator cannot use a CC license — or any license — to revoke or limit that right. Likewise, if someone quotes a passage of a copyrighted article in a critical essay, that is covered by fair use regardless of any CC license. A CC license grants additional permissions; it does not add new restrictions.

The LLM and Open Access Debate

Nowhere is this confusion of what a CC license can do more visible than in current debates about the use of scholarly works to train AI systems. Many academics and publishers who release work under Creative Commons licenses — especially the Non-Commercial or No Derivatives variants — are alarmed to discover that their content may still be scraped or analyzed by companies developing large language models. Some have reacted by suggesting that their chosen CC BY-NC license forbids such use by for-profit entities.

But legally, this is not how it works. A CC license can only expand the terms of use — essentially, what you can do without seeking the additional permission that you would need under copyright. But if a company’s data-mining of text for model training qualifies as fair use under copyright law, the existence of a CC license neither grants nor removes that right. The CC framework simply does not operate in the realm of prohibition; it is a tool for pre-authorized permission.

Similarly, one is allowed to read a copyrighted text without citing it when one later discusses or writes about the topic (we all read much more that influences our thinking than we ever cite!). Reproducing such a copyrighted text would require permission, though, and so if a CC-licensed text is reproduced, then the terms of the CC license apply.

As Creative Commons itself explains:

You can use CC licenses to grant permission for reuse in any situation that requires permission under copyright. However, the licenses do not supersede existing limitations and exceptions; in other words, as a licensor, you cannot use the licenses to prohibit a use if it is otherwise permitted by limitations and exceptions to copyright.

In short: a CC BY-NC license can forbid commercial reuse (e.g., republishing an article in a paywalled anthology), but it cannot make a legally permitted fair use by anyone – including a for-profit entity – suddenly illegal. The CC license sits atop the copyright system — it does not rewrite its foundation.

Why the Distinction Matters

Getting this right is more than a technicality. It shapes how educators, researchers, and institutions think about intellectual freedom and collaboration. When creators incorrectly assume their CC license allows them to block fair use, they inadvertently invert the purpose and intention of the licenses. Understanding CC licenses as permissions beyond limitations and exceptions acknowledges both their intended spirit and their actual legal effect. Copyright’s foundation includes both exclusive rights and built-in freedoms. CC licenses expand one side of that balance — they do not erase the other.

As digital platforms, AI models, and remix culture continue to evolve, the role of Creative Commons licenses will remain — but so will the need for legal literacy. Educators, librarians, and researchers need to remember that openness is not merely about generosity; there is a legal architecture. A CC license structures permission in advance, but the underlying scaffolding of fair use and other exceptions is also in force. The promise of CC has always been to build a commons—an ecosystem where knowledge, art, and creativity can circulate without friction. In a world increasingly anxious about who controls information, CC licenses must be correctly understood. They are useful for liberalizing the use of copyrighted materials; they have no role in constraint.

Lisa Janicke Hinchliffe

Lisa Janicke Hinchliffe is Professor/Coordinator for Research Professional Development in the University Library and affiliate faculty in the School of Information Sciences, European Union Center, and Center for Global Studies at the University of Illinois at Urbana-Champaign. lisahinchliffe.com

Discussion

4 Thoughts on "Can a CC License Constrain Fair Use or Other Copyright Limitations or Exemptions?"

In Canada, we have “fair dealing” which is similar to but not identical to the US’s “fair use”.
A really huge debate here, which is basically waiting for a Supreme Court of Canada case to resolve it, is whether fair dealing rights can be waived (lost) when there is a license involved. The obvious situation is a license signed between a university and a publisher for full-text access to its “Big Deal” journals that may prevent the library from sharing those articles via interlibrary loan (“ILL”) or copying PDFs to post in a course management system for reading by a single course.
While I had never thought of CC licenses before in this context, I don’t see any legal reason that they would be any different from commercial publisher/vendor licenses. If the particular license explicitly says that the license does not restrict fair use/fair dealing (FU/FD) rights but just grants additional permissions, then there is no point to this discussion, because there is no conflict between FU/FD and the license. CC’s statement about that may also carry a lot of weight if there were a legal case. But if the specific CC license does not explicitly itself state that it includes all FU/FD rights, the legal situation is not obviously resolved, at least not in Canadian law.

There are a lot of reasons that CC licenses are different from a publisher/library license. Most importantly, the latter lays out conditions for gaining access to materials, whereas CC licenses do not regulate access but rather what additional permissions have been granted for use of copyrighted materials that one has access to. It is indeed the case that the issue of how expansive contract override is relative to copyrighted materials and fair use/fair dealing is not resolved. If it turns out that a publisher/library contract restricts fair use or other copyright limitations or exceptions, it is still the case that any CC license applied to such texts would grant additional permissions and not constrain them.

As an early champion of open access going back to the days when I served on the steering committee of the project known as University Publishing in an Electronic Age under the auspices of the Committee on Institutional Cooperation (renamed the Big Ten Academic Alliance after the University of Chicago dropped out) in the early 1990s, I welcomed the creation of the Creative Commons licensing system, and the more than 100 articles of mine posted at the library website of Penn State University known as ScholarSphere are all available via a BY-NC-ND license. They include my article about the development of this proto-OA CIC project: https://scholarsphere.psu.edu/resources/508e16e9-fb58-4f9d-821d-21195008976d. Lessons from this project led me in 1996 to write an article titled “A Nonmarket Solution for Scholarly Publishing”: https://scholarsphere.psu.edu/resources/76393b74-5b4e-4899-bb76-9822c8424d94. (I later learned that this essay helped inspire Frances Pinter to launch Knowledge Unlatched.) In 2005 we launched the Office of Digital Scholarly Publishing as a joint effort of the library and university press (where I was director) to publish several OA monograph series. Two years later, as president of the Association of American University Presses (AAUP), I wrote its Statement on Open Access: https://scholarsphere.psu.edu/resources/84e1a068-ba42-427b-975b-f4d1392d6ec2. This was intended to foster more OA initiatives in monograph publishing to complement the efforts in journals publishing promoted by the Budapest Open Access Initiative in 2001. These new series, too, benefited from the CC licensing system.

I was not unaware of the relationship between open access and fair use, however. In 1973, as a member of the Copyright Committee of the Association of American Publishers (AAP) and representing the AAUP also, I testified before the Senate Judiciary Subcommittee on Patents, Trademarks, and Copyrights that was drafting what became Section 107 of the Copyright Act of 1976. If Congress had adopted the language proposed by our testimony, which would have made the fourth factor of market impact preeminent with the other three factors subservient to it, there would have been much less litigation over copyright and the rights of creators would have been much better protected than has turned out to be the case.

As a member of the board of directors of the Association for Copyright Enforcement from 1988 to 1994, I helped oversee the management of a landmark case in the publishing industry, a suit (American Geophysical Union et al. v Texaco, Inc.) brought by 84 publishers against Texaco that challenged its claim of fair use pertaining to photocopying. After losing multiple appeals (and being in bankruptcy for several years), Texaco finally reached a settlement with the plaintiff publishers and entered into a licensing agreement with the Copyright Clearance Center, whose board I had joined in 1994. In his ruling against Texaco, Judge Jon Newman stressed that while photocopying may have great “social utility” in facilitating creative acts, it is not itself a creative act and therefore cannot be construed as fair use. This case, for which Pierre Leval was the district court judge in the Second Circuit, prompted him to develop the idea of “transformative use” in his 1990 Harvard Law Review article titled “Toward a Theory of Fair Use,” which then was cited by the US Supreme Court in its decision in Campbell v Acuff-Rose (1994) regarding the 2 Live Crew parody of the song “Pretty Woman.” What may be considered transformative is now at the heart of the many AI cases before the courts, including the New York Times case in the Second Circuit. I traced the evolution of this concept as applied by the courts in this article in 2019: https://scholarsphere.psu.edu/resources/460d0813-f82b-400a-abc8-137bf9d1f647. It points to a potential showdown at the appeals level in the Second Circuit between two competing interpretations of what transformative use can cover.

As the first case to be decided that involves AI in relation to copyright jurisprudence, Thomson Reuters v ROSS Intelligence merits close attention, but at the same time, as Matthew Sag notes in his commentary, this decision is not likely to be too worrisome for OpenAI and Microsoft as defendants in the suit brought against them by the New York Times, nor for most of the defendants in the 30 other cases pending in the courts, because this case bypasses the critical question of what “transformative use” really means. Judge Bibas, in his closer look at the facts, determined that this was really just a case about competing markets and therefore wrote his summary judgment to reflect his view that the fourth factor of fair use, market impact, should be given greatest weight.

Ever since Campbell v Acuff-Rose, however, this doctrine of transformative use has changed the landscape of fair use jurisprudence in profound and sometimes troubling ways, which are now showing up in almost all of the AI cases being litigated, but not this one in Delaware.

The basic issue being argued between plaintiffs and defendants is how “social utility” is being factored in, or not. Plaintiffs argue that an act is “transformative” if and only if it can be construed as itself creative, as an act of parody clearly is. Defendants argue that it is enough that the act being questioned as infringing facilitates truly creative acts later on, i.e., makes possible those creative acts that otherwise might be difficult or impossible to achieve. In the landmark Texaco case in the Second Circuit, Judge Jon Newman had declared that even though photocopying made it easier for the Texaco scientists in their labs to do their creative work and therefore had “social utility” (the very word he used), photocopying itself was simply a mechanical process involving no creativity on its own—thereby upholding the very argument that Leval had made in the case at the district court level.

But then, in the early 2000s, a series of court decisions in the Ninth Circuit, known to be sympathetic to its Silicon Valley tech companies, adopted the alternative view that social utility in facilitating later creativity suffices to render a use transformative, no matter what the effect on the market might be for the copyrighted works whose infringement was being claimed.

This interpretation then filtered back into the Second Circuit by way of the HathiTrust and Google Books cases decided by district judge Denny Chin. Surprisingly, because it would appear that this new interpretation differed from his own original idea, Judge Leval sided with Chin when the cases were appealed as he was then sitting on the appeals court in the Second Circuit where in the meantime Judge Chin has joined him.

I do know that Judge Leval still considers the fourth factor important because when I asked him about the ARL’s interpretation of fair use in its statement issued in 2012, where “repurposing” of works like scientific articles and novels, which are not intended in the first instance for use by undergraduate students, could be copied without permission from the publishers in whole or in part within the scope of transformative use, Leval told me that this would be a step too far and that the fourth factor should prevail in such cases.

The New York Times suit, be it noted, was filed in the Second Circuit, so we now face the eventuality of the case being heard on appeal by both Judge Leval and Judge Chin. Which Judge Leval will show up on that day? And when the US Supreme Court finally hears the case, as it surely will, will deference be paid to the Campbell interpretation, which will favor the New York Times, or will Judge Chin’s interpretation carry the day, favoring the tech companies? Who knows? In other words, which Second Circuit precedent will be given more credence, Texaco or Google Books? A lot of money rides on the outcome.

Whatever the outcome, it remains true what my colleague Leon Seltzer, then director of Stanford University Press, said about the 1976 Copyright Act when it finally passed Congress and came into effect at the beginning of 1978: “[Congress] has failed to articulate a coherent rationale for copyright, it has failed to define fair use, it has introduced confusions between fair use and exempted use, and it has in the end tossed the fair use question, now thoroughly enmeshed in contradictions, back to the courts.” As Georgia Harper has since characterized the way courts have handled this mess, it would appear that judges first make up their minds about what they think is fair and then use the four-factor analysis to justify, retroactively, what they have decided on independent grounds.


A further interesting iteration of this story is what is now happening globally with AI. We have seen a Chinese company enter the stage with a generative AI system that can do what OpenAI’s ChatGPT does, but at much cheaper cost. One wonders how China’s copyright law, where transformative use plays no role so far as I am aware, enters into the equation. For an early review of how China got onto the copyright stage, see my “China’s Copyright Dilemma,” originally published in Learned Publishing in 2008 and accessible now here: https://scholarsphere.psu.edu/resources/23eb4349-700a-4298-853a-70fda26524b7

Thank you very much for this article and for its clarity. It articulates something that I think many people struggle to understand. I would add a corollary to this argument: *If* the fair use doctrine allows text and data mining for use in LLMs, then less restrictive Creative Commons licenses (for example, CC BY) are not what create the legal justification for using those works as training data for LLMs. Put more plainly, CC BY licenses are not to blame for OpenAI’s use of a scholar’s publication as training data.
