The Google book project is the neverending story. In the past week we have had two new chapters: the Authors Guild filed a new lawsuit (significantly, without the publishers), whose aim appears to be to stop the HathiTrust from digitizing orphan works, and the publishers and Google told the judge at a hearing that they are continuing to negotiate a settlement but need more time. It’s now conceivable that by the time these conflicts are resolved, all of the orphans under debate will have fallen into the public domain. Echoes of Jarndyce v. Jarndyce!
I am not going to rehearse the ins and outs of these legal steps, not all of which I pretend to understand, beyond recommending that people interested in this topic keep a close watch on James Grimmelmann’s Laboratorium blog, which somehow manages to present legal intricacies in plain English. I am myself a skeptic about the value of mass digitization, which puts me in a minority. From my point of view, this is a battle like the one described in Hamlet:
We go to gain a little patch of ground
That hath in it no profit but the name.
What I keep wondering is how all this came to be. It is not, as many believe, that the unbounded greed of publishers (and now authors) has driven everyone to the lawyers — it’s not greed because the orphans have no commercial value. This is a battle over principles, which are always the bloodiest.
How I long for a strategy built on foresight, preemption, and cooptation! There is an alternative universe where publishers saw that mass digitization projects were bound to come. They did indeed have decades of warning. The late Brett Butler, cofounder (1976) of Information Access Company, which lives on today as a service within the Gale/Cengage empire, told me about making the rounds to magazine publishers to acquire digital rights and of talking, from the outset, about the vision of eventually adding all other content types. That was over 30 years ago. Or there was Dialog and, later, ProQuest, JSTOR, and EBSCO, all of which moved analogue materials into searchable digital databases. For books there were NetLibrary, eBrary, Questia, and undoubtedly other companies whose names I cannot recall. And if these operations didn’t tip off publishers that, sooner or later, there was going to be a call to digitize everything, attendance at any large library meeting would have exposed them to the librarians’ digital dreams.
It took Google to get this going, and it shouldn’t have. Publishers could have taken the lead with tightly focused projects; they could have marked themselves as innovators instead of litigators; they could have probed the technology and economics of digitization at a time when all this was under their control. They would not be fighting a rearguard action today, hoping to stuff the genie back into the bottle, praying for the retention of copyright. Incidentally, there was in fact an online service called GEnie (General Electric Network for Information Exchange), which launched in 1985. Litigation is what happens in the absence of foresight.
I am not making these remarks because I take the side of opponents of copyright. The point here is that copyright arguments are what come about when a business has not staked out ground early. A series of digitization projects, with publishers working hand in hand with libraries, would have encouraged Google to direct its voracious appetite elsewhere. (Think of how differently Google handled the journals in Google Scholar and the books in the mass digitization project.) A little bit of R&D money could have warded off a fearsome rival and might have yielded new revenue opportunities. A show of hands, please: How many publishers have R&D budgets?
One wonders as well whether publishers could have taken steps early in the game to preempt the Open Access movement. If a digital preprint service had already been in place, would Paul Ginsparg have started arXiv? What kind of attention would open access have gotten from governmental and foundation officials if something along the lines of PLoS ONE were up and operating, under the control of the commercial publishers that have the most skin in the game? Failure to innovate leaves a space open for others, and when that space gets filled, it is not often with the publishers’ interests at heart.
There are several areas now where pressure is building for innovation. To pick just three:
- Ebooks sold on a subscription basis directly to consumers. I have been writing about this for some time, but I was sad to see that Amazon, and not the publishers, was taking the lead here. This particular battle is not over.
- Information services for research material marketed to independent knowledge workers. I am one such knowledge worker, who, without a university affiliation, struggles to get access to publications that I need. Open access advocates are trying to solve my problem, but this is ironic in that I would happily pay for such a service. Why won’t anybody take my money?
- “Overlay” services that organize information in a clear and accessible way. I will avoid providing a plug for any such service in the scholarly communications area, but you can get the idea by looking at Mint.com, a personal finance manager that aggregates all your bank and credit card accounts. One can imagine such a service, with its splendid navigation tools, sitting atop the information resources of a library collection.
If I were a publisher, I would not want Amazon to set up a “Netflix for books”; I would want to control such a service myself. I would not want a third party to control consumer access to my publications (unless I owned a piece of the joint venture), nor would I want the primary interface used by researchers, the window through which all content is organized and consumed, to be in the hands of a separate organization. But all these things can and will happen if publishers don’t plant a flag in the ground first.
Which brings me back to HathiTrust. I just attended a conference at which orphan works were the topic of discussion. What problem is HT solving? Some estimates suggest that 40% of books in academic library collections do not circulate. Even if that number is high, and even if that number would shrink further if full-text search were made possible, the fact is that HT is hardly looking to make commercially viable works openly available, because books become orphans for a reason, and that reason is that they don’t sell. For HT to make obscure academic titles from 50 years ago available to the specialists for whom such works are essential, a lot of eggs may get broken, which could include significant reinterpretations of copyright law, a dramatic expansion in the scope of fair use, the extension of the “first sale” doctrine to digital products, and perhaps mandatory licensing (as in the music business). This did not have to be. By not having an early vision of life-cycle publishing, publishers now face assaults on multiple fronts.
Publishers become their own worst enemies when they ask what the market is for a particular product. You can’t ask that question for new ventures; it only has a meaningful answer when a market has already been established. There is another step, an earlier step, in the process, and this is where publishers have been negligent. The question to ask at that step is: What kind of new capability can I create? If the capability has inherent interest, I will figure out a business model and create a new market for it later. This was ably summed up by Tim O’Reilly when he proposed that the goal is to create more value than you capture.