The Shrinking Orphan Works Problem


I have written about the orphan works problem before and don’t intend to repeat myself. “‘Orphan works’ is a term used to describe the situation in which the owner of a copyrighted work cannot be identified and located by someone who wishes to make use of the work in a manner that requires permission of the copyright owner.” This leaves people who would like to reuse such works in copyright limbo. In the US, most books published before 1923 are in the public domain; by definition, none of these are orphans because you don’t need permission to use them in any way you want. After 1923, though, things get murky.

There are thousands or tens of thousands or hundreds of thousands or millions of orphan works; no one really knows the number. I heard someone say that there are “tens of millions” of orphans, an impossible number, since the number of orphans cannot exceed the number of books published (which, by the way, is a much smaller number than you would think). Whatever the total number of orphans, it is getting smaller all the time. As a mnemonic, sing that line to the tune of the Beatles’ “Getting Better”:

It’s getting smaller all the time.

The reason to commit this fact to memory is that orphan works play too large a role in people’s thinking about books and publishing. The orphan works problem is an artifact of some particular 20th-century practices, which are already more than a decade behind us. The number of orphan works is shrinking because books that could have been orphans are being researched and their copyright owners identified — or they are being found to be in the public domain. And the number of new books being sent to the orphanage is not growing and won’t grow, despite the fact that more books are being published than ever. Yes, more books are being published — the publishing industry continues to defy all expectations and is proving to be powerfully creative, competition from alternative media and the disruptive nature of digital technology notwithstanding.

The apparent goal of any orphan works program is to shrink the size of the pool by determining each book’s precise copyright status. That makes sense: Does anyone benefit from not knowing the owner of the rights to a book? In practice, though, the motivations are more complicated. Many people investigating orphans do so with the hope that they will discover large numbers of books whose copyrights have never been renewed; hence the research could lead to an enlargement of the public domain. And it will. Many copyrights were never renewed, a fact perceived decades ago by my former boss, the late Hayward Cirker, founder of Dover Publications, who built a handsome business by researching copyrights and reprinting books that had fallen out of copyright. Others in the orphan works game are looking to expand usage rights for books whose orphan status cannot be resolved. This gets you into the world of setting policy, and if you are not careful, you could wake up in a room full of lawyers. Still others (mostly publishers) are interested in finding the copyright holders and bringing these books back into the marketplace in any of the new low-cost methods (digital downloads, POD, etc.). With so many interests nibbling at the carcass of orphan works, at some point there will be little left but bones.

There are four main reasons that the pool of orphans will continue to shrink:

First, as a matter of law, at some point the cut-off date of 1923 for U.S. copyrights will move forward, depending on what kind of lobbying the copyright industries do in Washington. None of this is straightforward, as a glance at Peter Hirtle’s table summarizing copyright status shows. But the public domain will continue to grow, and the number of orphans from the second quarter of the 20th century will decline. It’s hard to know, though, what could happen with legislation on this head. As a practical matter (meaning matters that concern money), the date of 1927 is key to lobbying, as that is the year when sound was added to movies. Movies with sound still earn large sums of money, so you can expect that Hollywood will try to draw the line as early as possible. (And what, you ask, does Al Jolson have to do with scholarly communications? Everything, as all of scholarly communications operates on the infrastructure of consumer media, the most stubborn strategic fact of this industry. Think about this when you type a question about Erasmus into Google or browse through Medline on an iPhone.)

A second reason that the number of orphans will dwindle is the work being done in the library world, with the HathiTrust leading the way. Library collections are now being sifted to determine which books are under copyright and which are not. Although press attention to this research tends to take the view that the number of orphans is very large, for my argument it does not matter how many orphans there are; all that matters is that the number is shrinking. The more work librarians put into this, the more copyright owners will be identified.

More interesting to me, however, are the efforts by publishers to reclaim their “lost” works. They may do this as a public service (to identify what reuses can be performed with each title), but mostly they are doing this in order to prevent activities like those of HathiTrust from intruding on what the publishers feel is the field of their own prerogatives. One university press director told me that she had invested in the digitization of virtually everything her press had ever published, skipping over only those titles without a chance of ever finding a customer, and her view was that she had digitized far too many. Especially interesting is the recent announcement by Springer that the company was planning to digitize thousands of backlist titles, including many that are clearly in the public domain. Publishers are beginning to take control of some aspects of the digitization process and the ongoing curation of even books with little or no economic value. Of course, they would not have done this without the spur provided by HathiTrust or the jolt that Google’s mass digitization project brought to the entire industry. These publishers will be whittling away at the number of orphan works over the next few years. Ironically, some books that are being digitized by libraries as orphans are likely to slip back to the control of publishers once the publishers complete their own copyright searches.

So the historical pool of orphans will in time be defined and slowly diminished. It won’t disappear, but it will get smaller.

One would tend to think that with more books being published than ever before, the number of orphans would soar. But if you think about the practices of a publisher today, it’s hard to see how new books could lose their parents or guardians. Today’s publisher has some kind of digital workflow in place, whether the final output is a print book or an ebook. That workflow varies in sophistication, from “XML first” implementations to processes where digital masters are tacked on after the print workflow is completed, but the common denominator is that all books published today have digital copies, which can easily be placed on a server for ebook distribution or linked to a POD service for customers who (still) prefer print. Once you have a digital copy in hand, the cost of keeping that copy on a server and fulfilling orders, whether print or digital, is trivial. This does not mean that the digital version can be displayed on all devices or that it is easily searchable (though we are getting there), but that is irrelevant to the orphan works problem. If a publisher is offering a book for sale in some form, then that book almost assuredly is not an orphan.

Another thing that militates against the growth of the orphan works pool is the large claims being made for the value of these works. Let’s say a university library has a book in its collection that exists only in print. It has not circulated in years; in fact, it may never have circulated at that particular institution. Although the book’s metadata is online, when the book is digitized, full-text search becomes possible. This means that more people may discover the book, potentially driving up the usage in digital form of books that were dormant in print. Some librarians believe that this will indeed happen: full-text search will make orphans into vibrant elements of a digital collection. I am less sure of this. Full-text search will surely increase the number of people who discover a book, but it’s the conversion rate that matters: how many more people will read it?

This is one of those “be careful what you wish for” situations. Books are orphans almost always because there is no market demand for them. While these titles may have value for scholars pursuing specialized studies, the economics of the situation typically provides no motivation for a publisher to research the copyright and make the book available once again. Full-text search, on the other hand, may do a publisher’s market research for it: the title that was ignored in print as having no economic value suddenly emerges as a potentially profitable niche publication in digital form. In fact, the better the discovery technology, the more likely that publishers will make claims about the ownership of these works. In effect, some of the effort now going into digitization and copyright research in libraries will come to fatten publishers’ bottom lines.

The fact that the orphan works problem is shrinking does not mean that advocates of digitization will switch their attention elsewhere. Paraphrasing Chekhov, a scanner sitting before a wall in the first act will be put to use scanning orphan works in the third. We have these tools, so we will use them; if we chew bubble gum, we will blow bubbles. It would be unfortunate, however, if the utility and convenience of digital tools (not to mention the astonishing number of funding sources) distract us from tackling matters of greater import.

The orphan works problem will shrink for another reason as well, not only in terms of the number of books designated as orphans but also because our collective attention will rightly focus on new books. The “lost” books of the past are a fraction of the books that we will be creating in the years ahead. The orphan problem will thus shrink as a matter of proportion and priorities. It’s a short-term problem — and that’s okay because, as Keynes said, in the long run we are all dead. But on this point the Beatles had it wrong:

You and I have memories
Longer than the road that stretches out ahead

Speaking collectively, this is not so. The road ahead is much longer than the one we have traversed. Looking backward is, well, looking backward.

Joseph Esposito

Joe Esposito is a management consultant for the publishing and digital services industries. Joe focuses on organizational strategy and new business development. He is active in both the for-profit and not-for-profit areas.

Discussion

4 Thoughts on "The Shrinking Orphan Works Problem"

Hmmm. Perhaps you’re conflating orphan works with books? Even with books, there are many more possible candidates for orphans than you are giving credit for (e.g., authors alive but unable to consent, multiple authors not all of whom can be contacted, etc.). And I don’t think you should underestimate future authors’ and copyright holders’ continuing ability to mislay their “paperwork”!

By Chris Rusbridge
Oct 18, 2011, 12:17 PM

I was in fact thinking only of books. I would imagine for non-book materials, the orphan works problem is more complicated.

By Joseph Esposito
Oct 18, 2011, 1:05 PM

Great post, Joe. I want to respond to one brief but significant point you make about digital access to low-demand books in academic libraries:

Full-text search will surely increase the number of people who discover a book, but it’s the conversion rate that matters: how many more people will read it?

I think you may be setting too high a bar for “conversion” here. In research libraries, relatively few books get “read” in the usual sense (i.e., in a linear, sustained manner). What happens much more frequently is an activity I call “interrogation” — books are searched for relevant and useful chunks of information. When you see a student sitting at a library table surrounded by piles of books (granted, no longer a very common sight in most libraries), it’s not because he’s reading all those books from cover to cover; it’s because he’s interrogating them. This kind of use occupies a sort of middle ground between simple discovery and extended reading, and is both enormously important in research libraries and greatly facilitated by mass digitization. I think that’s one of the major reasons librarians get so excited about mass digitization. It’s not that we necessarily believe it will lead to more books being read in their entirety, but rather that we think it will lead to more useful information being discovered and used by our students and researchers, and even relatively low-interest books can offer lots of value when they’re fully interrogable.

By Rick Anderson
Oct 19, 2011, 10:49 AM

I think Joe is only partly right: the pool of already published orphan works in print will diminish over time. However, since self-publishing is the fastest growing part of the e-book industry, we will likely find more, not fewer, orphan works in the future because individual authors are notoriously difficult to track down unless they have some kind of institutional affiliation (to a publishing house or university, for example). One major finding of the recently concluded ARROW project on orphan works was this: “The type of publisher had a large impact on whether works were orphaned, with self-published works accounting for 51% of all orphan works in the study.” I would also argue that it is unlikely orphan works will ever disappear entirely because, among other reasons, publishers whose contracts did not give them digital rights sometimes find it difficult to get those rights assigned for books they published many years ago that they’d like to reissue as e-books. Finally, on digitization, our experience at Penn State Press with publishing public-domain books from Penn State Library’s collection of books about Pennsylvania (called Metalmark Books) shows that the market for some of these as POD editions can be fairly significant, sometimes breaking into three figures, even though these books are also available free for reading online. (It takes a sale of about ten copies to recover the costs of digitizing.)

By Sandy Thatcher
Oct 23, 2011, 3:38 PM

The Scholarly Kitchen
