On Friday, Judge G. Koeltl released a strongly worded summary judgment ruling in the case about the Internet Archive’s “controlled digital lending” (CDL) initiatives. The legal theory behind this CDL is that a library can digitize a print book and loan out the digitized version so long as it sequesters the print copy. The Internet Archive (IA) did this on a global scale, enabling anyone with an email address to have a virtual library card, and then lifting what controls it had established in the early days of the pandemic. Publishers by and large do not accept this legal theory, at least as practiced by the Internet Archive, and several filed suit. In the case, Judge Koeltl ruled firmly against the Internet Archive’s assertions of fair use, writing that “What fair use does not allow… is the mass reproduction and distribution of complete copyrighted works in a way that does not transform those works and that creates directly competing substitutes for the originals. Because that is what IA has done with respect to the Works in Suit, its defense of fair use fails as a matter of law.” The Internet Archive has said that it will appeal the ruling.
In this post, you will find reflections from several of our Chefs. What are your thoughts?
I’ve always been very leery of library behaviors that we call “loaning” that in fact constitute creating copies. (I have yet to encounter an e-model for interlibrary loan that sounds to me like it’s entirely on the side of the copyright angels.) That said, until the Hachette v IA decision I had only a high-level understanding of the way that IA implemented CDL, and from what I understood, it actually sounded pretty okay to me; although it clearly involved the creation of new (and unauthorized) copies, it also seemed to involve taking the original copy out of circulation while the newly-created one circulated, thus avoiding the negative market effects of copy proliferation and more or less replicating the conditions that underlie the logic of the first sale doctrine. But after reading Judge Koeltl’s ruling, with its detailed explanation of how IA actually does CDL, I was left baffled as to how anyone might have thought that IA’s practices would stand up to serious legal scrutiny.
I was also startled to learn about the degree to which CDL leads to commercial transactions for IA, and thought the following language from Judge Koeltl’s ruling was particularly trenchant:
“IA exploits the Works in Suit without paying the customary price. IA uses its Website to attract new members, solicit donations, and bolster its standing in the library community. Better World Books also pays IA whenever a patron buys a used book from BWB after clicking on the ‘Purchase at Better World Books’ button that appears on the top of webpages for ebooks on the Website. IA receives these benefits as a direct result of offering the Publishers’ books in ebook form without obtaining a license.”
Furthermore, the judge quoted IA’s Director of Finance as testifying that “every single page of the Archive is monetized.” I’m no attorney, but I found the logic of Judge Koeltl’s ruling pretty persuasive. I’ll be very interested to see what happens on appeal.
Lisa Janicke Hinchliffe
Keep Calm and Carry On. Though over-memed, that’s my non-legal advice to librarians who have been carefully developing CDL programs of service these past many years in light of the summary judgment against the Internet Archive last Friday.
There is a real possibility that this judgment could have a chilling effect on library CDL services and it would be a real disappointment if libraries failed to distinguish clearly the nature of their offerings relative to the practices of the Internet Archive. We need to keep front of mind that this case is only what this case is about. The case isn’t about libraries being able to loan print books, to offer ebooks to patrons, to provide access to people with print-disabilities, to preserve cultural heritage, or any of the over-wrought claims one can easily find with a quick search of Twitter. I mean, sure, the Internet Archive has found itself facing a ruling it didn’t want and perhaps didn’t expect. But, no library that I know of has a CDL program of mass digitizing its entire collection and then loaning those digital items to anyone on the planet with an email account.
Exercising fair use — as an individual or an organization — is always a matter of risk assessment. Librarians know this. The Internet Archive presumably knows this too and went all-in on that risk. Librarians also know that Section 108 of the copyright law enables libraries to develop programs of service in ways not available to all other types of organizations. The Internet Archive, for example, only proffered a fair use (Section 107) defense of its CDL program and did not invoke Section 108.
Of course it will be responsible practice for a library to review the analysis underlying whatever CDL program that library has in place or is considering in light of this ruling; however, there’s no reason to panic or immediately halt a library’s CDL program just because the Internet Archive’s CDL program was found to be infringing.
It is sinful how good I feel that the Internet Archive has taken a drubbing in the courts. One could have wished for a broader decision, but the courts do what the courts do. But this is a small victory for the good guys, and we should all bask in that.
I don’t want to get into the legal issues involved, about which I am no expert. What I find so surprising is the failed promise of IA. At the outset it sounded like such a good idea, and the tech it brought to the world of content was truly amazing. But then, for some unknown reason IA determined to become an activist organization, with copyright as its target. This is akin to the medical practice of curing the disease by killing the patient.
What is so disturbing about this is how unnecessary it all was. All IA had to do was ask permission. That’s it; nothing more than that. Of course, had they done that, some publishers, perhaps all, would have declined to participate. And isn’t that exactly the point? IA had done nothing to entice the publishers to participate; it was a simple land grab.
Had IA asked permission, it would have initiated an interesting conversation. What would entice the publishers? Payment? Access to data? Digitizing copies and handing copies back to publishers for them to do with as they pleased? Opening a commercial site or a site that linked to a commercial site to enhance monetization? There are as many ways to structure deals as there are stars in the sky, but IA demonstrated no imagination and chose instead to take someone else’s property. Publications are not public goods.
There is much to unpack in this case for a lawyer, so I will limit my remarks to two concepts as they relate to “controlled digital lending” (CDL): (1) the importance of market harm in the fair use analysis, and (2) how statutory damages may apply to this and subsequent cases.
Fair use and market harm. CDL is a concept created and promoted by lawyers active in library lobbying and policy. In litigation, however, precedent is more important than theory and, despite advocacy efforts to the contrary, Defendants cited no direct precedent for characterizing CDL as fair use.
While fair use is fact dependent, one fact usually defeats a fair use defense: the existence of a market that is cannibalized by the use. This is particularly true in this case because, as Judge Koeltl noted, “it is clear that IA’s distribution of ebook copies of the Works in Suit without a license deprives the Publishers of revenues to which they are entitled as the copyright holders.” Indeed, even cases cited by Internet Archive show the strong correlation between market harm and infringement. For example, and to select only a few, we have Google Books (“Those arrangements allow or would have allowed public users to read substantial portions of the book. Such access would most likely constitute copyright infringement if not licensed by the rights holders.”), Redigi (“When a secondary use competes in the rightsholder’s market as an effective substitute for the original, it impedes the purpose of copyright”), and TvEyes (“The impact on potential licensing revenues is a proper subject for consideration in assessing the fourth factor.”). Judge Koeltl makes it clear that this remains the rule.
Damages. Moving on to damages, this next phase will be really interesting. The Copyright Act provides for statutory damages of $750 to $30,000 per work infringed, which can be increased by the court with a willfulness finding to $150,000 per work. Innocent infringement, which can reduce statutory damages, is not available when the work infringed has a copyright notice, as books typically do. Given the inability of Internet Archive to provide legal support for CDL, and given Judge Koeltl’s strong written opinion (combined with his apparent frustration during oral argument), willfulness can easily be found.
Here, summary judgment was granted for 127 books. At the relatively lower end, $10,000 in damages per work would be $1,270,000. At $30,000, the high end of non-willful infringement, we have $3,810,000. Mid-range willfulness of $75,000 per work would be $9,525,000. These are big numbers, but they only relate to the 127 works and don’t account for the remaining 3.6 million in-copyright works made available by the Internet Archive, 70,000 of which were “loaned” on a daily basis. Because the Internet Archive has had its day in court and lost on summary judgment, the path to a large recovery will be much easier should the owners of the millions of other works included in its CDL offering choose to sue it now. Under certain legal concepts, these future lawsuits might not even require relitigating liability.
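For readers who want to check or vary the arithmetic, here is a minimal sketch that multiplies a per-work award by the number of works in suit. The 127-work count and the statutory figures ($750, $30,000 and $150,000) come from the discussion above. The function and variable names are hypothetical, and the per-work scenarios are illustrations only, not predictions of what a court would award.

```python
# Illustrative only: total statutory damages = per-work award x number of works.
# The 127-work count and the statutory bounds are taken from the post above;
# all names here are hypothetical.

WORKS_IN_SUIT = 127

STATUTORY_MIN = 750        # lowest ordinary per-work award cited above
STATUTORY_MAX = 30_000     # top of the non-willful range cited above
WILLFUL_MAX = 150_000      # ceiling with a willfulness finding cited above


def total_damages(per_work: int, works: int = WORKS_IN_SUIT) -> int:
    """Return the total award for a flat per-work amount across all works."""
    if per_work < 0 or works < 0:
        raise ValueError("per_work and works must be non-negative")
    return per_work * works


# Scenarios discussed in the post, plus the statutory floor and ceiling.
scenarios = {
    "statutory minimum": STATUTORY_MIN,
    "lower end ($10,000)": 10_000,
    "non-willful maximum": STATUTORY_MAX,
    "mid-range willful ($75,000)": 75_000,
    "willful maximum": WILLFUL_MAX,
}

if __name__ == "__main__":
    for label, per_work in scenarios.items():
        print(f"{label}: ${total_damages(per_work):,}")
```

Passing a different `works` value, for instance a hypothetical count of works in a later suit, shows how quickly the totals scale with the number of works at issue.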
Lastly, with big damages looming, we need to ask who else has potential liability. In copyright cases, one can “pierce the corporate veil,” for example to find personal liability against individuals who were responsible for the infringement at a corporation or non-profit entity. The complaint in this case raised claims against “Does 1-5” and expressly disclaimed “any public, university, or academic libraries.” With statutory damages on the horizon, will the next plaintiffs claim against people who “donated” those plaintiffs’ books into the CDL scheme? And while publishers may be loath to sue libraries, there are plenty of record companies and film studios whose works were subject to CDL who might not feel the same way.
With the high level of damages that may be awarded against the Internet Archive and its fellow travelers, of course an appeal will be filed.
Mostly, the Hachette v. Internet Archive case makes me sad. As I wrote here back in 2020, the magnificence of Internet Archive’s contribution in making scarce, out-of-print work available is hard to overstate. It remains confoundingly at odds with what they asserted with CDL and the National Emergency Library, though I understand that its founders and team assert that it is a logical extension of their original vision.
Setting the legalities aside for a moment, it’s actually the community commitment that IA has claimed for itself here that is so illogical and that for me remains the heart of the issue. I admire and respect (and like) many of the folks who have vociferously argued IA’s case, on behalf of libraries and readers and the preservation of knowledge. And I admire and respect any argument on those grounds – libraries are one of humanity’s greatest collective endeavors – but I can’t get my head around how wrong-headed this is.
Should we have free access to any and all human creations, art, text, music, technology and otherwise? Well, sure. But there are so many things that society should be able to share freely – health care and clean water come immediately to mind – but we have not been able to organize ourselves collectively to do that and at the same time reasonably compensate the people on whose labor those products and services depend. We can no more easily make knowledge free than we can easily make water free, because it depends on a complex set of operations, some of them highly expert, to create and deliver it.
Roger C. Schonfeld
The oral arguments and decision in this case last week have had me reflecting on other major copyright cases in our sector during my career, including the Google Books case and the Georgia State case. The internet has enabled us to license materials more efficiently than ever before, for example as consumers through Kindle or as organizations through OverDrive. But digital formats have simultaneously empowered copyright holders to constrain the ability of libraries to perform traditional functions like lending and preservation.
In market segments where libraries have relatively more market power, for example scientific journals, libraries have successfully secured unlimited usage site-license models that enable far more liberal usage rights (as well as interlibrary sharing) than would have been possible from a single print subscription. But in other segments where their market power is minimal, for example trade books, streaming media, or trade publishing, libraries find themselves with far less ability to freely lend, or effectively preserve, the cultural record. State legislative efforts to shift the balance of power have not to date been successful.
Librarians are understandably frustrated by this state of affairs and have legitimate concerns about preservation and broad public access. They see too little support from the copyright industries for these societal goals.
With respect to the Internet Archive case, it is too soon to know how this saga will finally end. Based on the court’s judgment last week, which many even-handed experts had anticipated, we may have seen the foreclosure of many flavors of “controlled digital lending.” While an appeals court could change all this, the judgment is a reminder that, while copyright can perhaps in some circumstances evolve through challenges to the existing order, such an approach does not always yield sustainable innovation.
16 Thoughts on "The Internet Archive Loses on Controlled Digital Lending"
Although I’ve been watching from the sidelines, I haven’t come across a single lawyer who thought IA was going to win this case. They overreached and will pay the consequences for doing so. This might be a good thing, both in defining what companies can do with data (looking at you AI data models) and in coming up with new business models for publishers.
The actual solution to the issues Roger raises is to get back to discussing how to revise Section 108 to modernize it fully for the digital age, as the Copyright Office tried to do with its Section 108 study group in the 00’s…
Thanks, Mark. Is there a good post-mortem about what happened to that effort? I recall it seeming to be so promising at the time.
In 2017, the Copyright Office produced a report elaborating on the work in the 00s and reconsidered the situation. The link to the report is available here: https://www.copyright.gov/policy/section108/
Not entirely a post-mortem, but here is a post from the ARL covering the opposition that a number of library groups had at the time:
As the post reflects, the impact of reforming Section 108 could go either very well or very poorly for libraries. It could safeguard traditional library practices in a digital form or it could gut legal protection for those very practices, depending on which voices shape the reforms. Whether the opposition at the time was short-sighted or a correct evaluation of a lack of process transparency is hard to know.
Carrie Russell and I wrote a fairly pointed post-mortem a few years back, which I think helps illuminate the libraries’ hesitation to go back down the road of Section 108 revision: https://doi.org/10.17161/jcel.v1i2.6972. In a nutshell, Section 108 has always been seen by publishers as a tool for narrowing fair use, expanding copyright prerogatives, and reining in the perceived excesses of libraries, and since libraries have generally been aware of that strategy, they’ve resisted accordingly.
My take on this is that copyright law applies equally to all publications whether published 69 years ago or today. If this is incorrect, I’d like to learn why. If the Internet Archive had prevailed, any library and perhaps anyone could purchase a print copy of the best seller that appeared today and then create a digital copy. The library could then legally lend out this copy and avoid buying the higher-priced ebook version from the publisher. In other words, publishers would lose control of their ebooks.
When I asked this question at the legal session on the last day of an earlier pre-Covid Charleston Conference, I felt that I was told to shut up and sit down. I also did some quick-and-dirty searching in the Internet Archive and easily found books that were published in the last ten years and most likely still in print.
My final question is whether this is actually a new ruling. Quite a few years ago, I was curious about the fact that Yale University had digitized my dissertation to which I hold full copyright without my permission. My research then included asking the experts at a copyright conference who said that digital rights were separate and did not come with a purchase of a print copy.
Bob, I think what I don’t really understand here is what you mean when you say that publishers could “lose control” of their ebooks. In one sense, that was the normal state of affairs in the print world, where publishers sell a book to a library for a fixed, transparent price and the library can then lend it out, or resell it to a third party, as it sees fit. Many libraries don’t understand why this basic framework shouldn’t be made to apply to ebooks as well. (And certainly some librarians and technologists and advocates have wanted far more than what exists in the print world.)
Of course, equally understandably, many publishers don’t want to “lose control” further than this through unlimited lending that eliminates the technical constraints of print, such as a geographical location and the one-loan-at-a-time model. Unfortunately these concerns have resulted in licensing models that are frequently overprotective relative to the print environment, monetizing the lending model and in many fields making preservation all but impossible.
From a policy perspective, it seems like there should be quite a bit of space here for compromise now that we are hopefully past the confusion of the new technology.
I completely agree with the fact above. Publishers quickly saw digital resources as a way to take back control of the publication process by licensing rather than selling ebooks. They couldn’t do the same for print materials because the doctrine of first sale gave libraries options to purchase the print materials elsewhere even if publishers didn’t really want to sell to libraries. In addition, print materials provided the possibility of additional sales that ebooks didn’t. Print wore out and was stolen so that libraries had to often purchase replacement copies. These possibilities for extra sales disappeared with ebooks. Publishers and sellers like Amazon determined that leasing ebooks rather than selling them was a much better strategy for their bottom lines.
We live in corporate America where corporations and their owners/leaders get rewarded for increasing profits. They saw a new legal opportunity and took advantage of it. I’m commenting on why what the Internet Archive was doing was illegal according to copyright law. I’m not saying that I’m in favor of the decision but I can understand why publishers would want to stop any lending that could create a precedent that would lead to a decline in profits. I will conclude my comments on this issue by observing that libraries in the current world probably need publishers as content providers more than publishers need libraries.
“We live in corporate America where corporations and their owners/leaders get rewarded for increasing profits. They saw a new legal opportunity and took advantage of it.”
I appreciate intellectual honesty that states facts as they are and then moves on from there. I would only add shareholders, ETF holders, Index Fund investors, etc. to Bob’s list of those who expect rewards — which widens the net considerably.
I also agree with Bob’s last statement but would add independent authors (those choosing not to use a publisher for their novel or research, as the case may be) to that list.
Todd gave the link I was going to suggest for Roger, a report issued by the Copyright Office in 2017 with a draft Model code— it’s also quite true as the CO report notes that library associations had taken the position that it was not necessary to revise Section 108— this was at least in part because there was a belief that a more “flexible” approach through a Fair Use finding was a better alternative. In my view the IA trial court decision illustrates the flaws in relying on FU— inherently a decision based on the “facts at hand”. Revising Section 108 doesn’t mean giving up Section 107 (FU)– in fact the current section incorporates this directly– and the Model code would do so as well. Making digital copies for users was a key difficulty in the discussions— the Model code would permit it for scholarly use for articles and parts of longer works— and the entirety of works if an e-book wasn’t already available at a “fair” price… not inconsistent with how Section 108 works now for print works.
So the fair use argument did not help Internet Archive. The next question will be whether it will be of help to (Open)AI. For their reasoning is: Quote: “Legal uncertainty on the copyright implications of training AI systems imposes substantial costs on AI developers and so should be authoritatively resolved. (…) For this response, we draw on our experience in developing cutting-edge technical AI systems, including by the use of large, publicly available datasets that include copyrighted works. (…) Modern AI systems require large amounts of data. For certain tasks, that data is derived from existing publicly accessible “corpora” (singular: “corpus”) of data that include copyrighted works. (…) We submit that proper application of fair use factors requires a finding of fair use, especially considering the highly transformative nature of training AI systems. This conclusion is strengthened by reference to existing analogous case law holding that the reproduction of copyrighted works as one step in the process of computational data analysis is a fair use of those works. (…) Training of AI systems is clearly highly transformative. (…) Intermediate copying of works in training AI systems is, by contrast, “non-expressive”: the copying helps computer programs learn the patterns inherent in human-generated media. The aim of this process—creation of a useful generative AI system—is quite different than the original object of human consumption. The output is different too: nobody looking to read a specific webpage contained in the corpus used to train an AI system can do so by studying the AI system or its outputs. The new purpose and expression are thus both highly transformative. (…) Training AI systems should not, by itself, harm the market for or value of copyrighted works in training corpora. Since such corpora are consumed by machines, not humans, the authors should lose no potential audience due to the use of their works in the corpus itself. 
(…) We thus submit that use of copyrighted works in training AI systems is squarely in line with these and other “non-expressive” fair use cases. We therefore expect future courts to straightforwardly deem any challenged training to be non-expressive fair use. (…) Holding That Training AI Systems is Infringement Would Severely Hinder Creative AI Research, Thus Stifling the Very Creativity Copyright is Supposed to Promote. (…) The fair use doctrine “’permits courts to avoid rigid application of the copyright statute when, on occasion, it would stifle the very creativity which that law is designed to foster.’” AI systems hold immense promise for both creative expression and general economic innovation. Copyright barriers to training AI systems would have “disastrous ramifications” and “could jeopardize the technology’s social value, or drive innovation to a foreign jurisdiction with relaxed copyright constraints.” We thus submit that such barriers would “stifle the very creativity which [copyright] law is designed to foster” and retard “the Progress of Science and useful Arts.“” https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-58141.pdf
Emanuel- Thanks for your comment. I debated discussing this issue in my contribution above but thought it might unduly complicate my text. This SK post is relevant to the question: https://scholarlykitchen.sspnet.org/2023/03/07/some-thoughts-on-five-pending-ai-litigations-avoiding-squirrels-and-other-ai-distractions/. In a case involving generative AI, US courts will look to market harm and license availability (in addition to the other fair use factors), EU courts will look to whether the rightsholder reserved its rights to TDM for commercial use, and UK courts will probably have to conclude it is infringing given UK law explicitly has no commercial exception– although I am not a UK lawyer so perhaps I am missing something.
I’d like to ask these distinguished chefs how they view the effect of actual or potential market harm (factor 4 of Sec. 107) as a limit on the extension of fair use to justify library practices. Roy has addressed this question directly, and I can assume what Joe thinks about this issue, so I am particularly interested in responses from the librarians. To start, I’d like to remind everyone that when I testified before Congress in 1973 in a Senate subcommittee hearing about fair use, I represented the view of university presses that the fourth factor should be the preeminent determinant of fair use, with the other three factors assigned a subsidiary and less important role in determining when a use is fair. That recommendation, had it been adopted, would have made the application of fair use much more straightforward and clear.
Early important cases, like the publishers’ suit against Texaco for photocopying (overseen by the Association for Copyright Enforcement on whose board I served along with the presidents of the CCC and AAP) put market impact front and center and gave the plaintiffs a straightforward victory. Judge Pierre Leval, who apparently got tired of being reversed in the Second Circuit, came up with the idea of “transformative use” in his now classic Harvard Law Review article “Toward a Theory of Fair Use” in 1990, which began to become prominent in copyright jurisprudence with the Supreme Court decision of 1994 about the 2 Live Crew parody of the famous song “Pretty Woman” in Campbell v Acuff Rose. This concept then grew in importance relative to market harm through a series of cases especially in the Ninth Circuit until it claimed a preeminent position in the judicial interpretation of copyright. A subtle change occurred along the way, however, in what “transformative” meant. In the 1994 case, reflecting Leval’s original concept, a use was transformative only when the user engaged in some additional act of creativity that, in effect, produced a new work itself protectable by copyright. But the cases in the Ninth Circuit and others following those precedents, like the HathiTrust case, began to substitute the action of algorithms, not real human creativity, as sufficient to “transform” a work (like producing an index of artworks using thumbnail images), and the idea of “repurposing” copyrighted works gained ground. Emanuel’s post here perfectly illustrates how “transformative” changed its meaning from referring to creativity in the act of transformative use to the application of computer technology to facilitating LATER creative uses. That transition is where the concept ran off the rails.
Eventually, this evolution of the concept reached what I consider its reductio ad absurdum in the ARL’s Code of Best Practices in Fair Use for Academic and Research Libraries (2012), which accepted Jonathan Band’s idea that works like novels, poems, monographs, scholarly journal articles, and the like, not being aimed by their authors at students in college classrooms, were when used in that context being “repurposed” in such a way as to be able to be reproduced by libraries for classroom use without needing permission from their authors or publishers or requiring payment for such use. The harm to the markets for these works is patently obvious.
So, I ask our librarians if they think this is a step too far, a step even beyond what the Internet Archive has been found guilty of infringing. Presumably, the libraries for which they work have accepted this Code as a credible interpretation of copyright law.
P.S. For a fuller explication of this history of fair use, see my article “Reflections on Copyright Law and Scholarly Publishing over Fifty Years” (2018) freely available at this library site: https://scholarsphere.psu.edu/resources/460d0813-f82b-400a-abc8-137bf9d1f647.