I can remember exactly what I was doing on December 14, 2004. That was the day that Google announced its Library Project (soon to be known as Google Book Search), and the information community was buzzing about the significance. I spent the day synthesizing news accounts of what the project entailed and stayed later than usual in the office, waiting for a final clarifying detail to come from a librarian at Stanford before hitting the “send” button on a post to the NFAIS listserv.
In 2004, there was still some uncertainty as to which of the various internet giants was going to be dominant. Amazon had announced earlier in the year that its A9 search tool had emerged from its beta phase. The integration of the A9 search tool with the Amazon website allowed students to rapidly uncover content from digitized versions of book titles, stored by Amazon for the purposes of encouraging sales, while also retrieving relevant web content. For example, if an undergraduate used the system to research the notorious Civil War prisons Andersonville and Elmira, the results could include a snippet from a published title (as part of Amazon’s Search Inside the Book feature) alongside relevant web-based content supplied by the National Park Service. Engineered by Udi Manber, A9 was viewed as an exciting innovation in surfacing and selling print products.
Meanwhile, Google had only just gone public with an IPO in 2004. That year, at the Frankfurt Book Fair, Google announced its Publisher Program, which promised to support the same type of search functionality. Publishers willingly signed up, unaware that the Library Project would be announced two months later. The Library Project was ambitious, digitizing titles acquired for collections held at Harvard, Stanford, the University of Michigan, the Bodleian Library at Oxford University, and the New York Public Library. This was a breathtaking step farther than Amazon, and the information community was thunderstruck as it tried to process the implications of what such an expansion could mean.
This is the story that is told in Along Came Google: A History of Library Digitization by Deanna Marcum and Roger Schonfeld (full disclosure, Roger is a regular contributor to this blog). Note the subtitle. This book documents from a library perspective the implications and long-term impact of Google’s move to make a significant corpus of “offline content searchable online” through optimized means of scanning and digitization. The outcome of Google’s ambitious project would ultimately be diminished, due to constraints resulting from extended legal battles, but key library leadership has managed to create the infrastructure needed to sustain and carry on the massive digitization required. There were significant barriers to that work, as the authors note, despite the fact that “in this story, there are many actors, all of good intentions. Inevitably, it is also a story of limitations and failures to collaborate.”
The book asks key questions in its conclusions regarding whether it is possible to create a truly national digital collection for the United States as a whole. Can the information community in this country successfully collaborate in developing “a single coordinated program to provide digital access for the entire historical and cultural record that is easy to use and ubiquitously accessible”? If not, are we at least able to sustain one that is “the accumulation of many efforts, all of them incomplete, controlled by an array of different actors”?
That we are more likely defaulting to the second (less satisfactory) result becomes clear as one reads of the foundational efforts behind HathiTrust, the Internet Archive, JSTOR, and a host of other partial aggregations of digital texts. Reading Along Came Google will make clear the tangle of competing interests with which libraries had to wrestle and the priorities that sometimes helped and sometimes stymied the creation of digital resources now taken for granted by students, scholars, and writers. The HathiTrust cooperative is a case in point. The continued expansion of access to materials held in that collection represents a striking value, and its response to the need for access during the pandemic was well thought out. (Recent work on the reading interface has improved that user experience, as well.)
Ordinary mainstream readers, those not familiar with the information industry, are unlikely to ever recognize the names and efforts of Paul Courant, Wendy Lougee, Brewster Kahle, and others. Through interviews with these key figures, Marcum and Schonfeld convey the past work of these individuals, as well as their sense of where we are now.
One such quote appears on page 157:
Paul Courant of Michigan is not ready to give up, even though he says “mass digitization is dead.” He adds, “We are back in that place where we do deals with various entities that want to digitize things, where we allow them to give us copies in a few years. We get the miscellaneous grant from Mellon to digitize. We have done well with hidden collections. But all the holes didn’t get filled. We still don’t have good solutions for preservation of current stuff. Now we are in Zeno’s paradox. Nobody wants to pay for it.”
That comment does not seem to promise great things for a national digital collection. It does sum up, however, the biggest ongoing issue faced by libraries. Melding together the wealth of print and digital collections for the public good is an expensive process, and libraries at all levels operate under fluctuating funding commitments.
After reading Along Came Google, I was reminded of Deanna Marcum’s 2016 Miles Conrad Lecture, in which she spoke of the need for digital leadership in libraries:
“Digital leaders are distinguished from non-leaders by their different combinations of skills, attitudes, knowledge, and their professional and personal experiences. Leadership must be driven by unique attitudes appropriate for the distributed, digital age. Digital leaders must be flexible and adaptable and possess wide intellectual curiosity and a hunger for new knowledge. They must be willing to see value in sharply different perspectives and be comfortable with uncertainty, and like all leaders of all times, must possess true passion for what they do. They look globally for solutions and challenges and also hunger for constant learning. They maintain a more egalitarian and results-oriented approach than the leaders who come before them.”
“The reason we need to concern ourselves with defining digital leadership is that libraries are in a pivotal moment, and a digital mindset is needed at every level of the organization. The utilization of digital technology in making research and teaching and learning easier and more efficient for those they serve is critical. Libraries’ very survival depends upon making the transition from a local institution to a node in a national and international information ecosystem. The skills needed to build a local collection are not sufficient for seeing the challenges and opportunities in a global environment.”
I also like a brief reference included in Marcum’s lecture, a comment made by a librarian then associated with the University of Ghent: “I am a humble librarian; I became humble when I saw what Google could do. And very simply, what Google has done is make information easily accessible. The local library is no longer a collection, but a set of services that connects the user to all information everywhere.”
Along Came Google: A History of Library Digitization does not pretend to answer the question of whether we will ever manage to bring together a cohesive national collection. The authors realize that the impact of the events following Google’s foray may not be properly understood for years to come. They simply document the library community’s story of disruption and adaptation through to the present. Throughout the pandemic, libraries of all sorts have been challenged to maintain service levels in delivering needed materials to students and scholars. We should be cheering those disruptive digital leaders who have reason to expect more and better from providers in assisting them to connect a global community of users to all information everywhere.
1 Thought on "Book Review — Along Came Google: A History of Library Digitization"
I’m gobsmacked. It’s not clear from this post what the position of the book (Along Came Google) and its authors is, but the position of the author of the post is clear: “We should be cheering those disruptive digital leaders who have reason to expect more and better from providers in assisting them to connect a global community of users to all information everywhere.”
In other words, the book pirate’s Utopia: one digital copy of every book, accessible by anyone anywhere on the planet, free of charge. Am I the only one who sees a problem here?
Where to begin? Perhaps a somewhat unconventional place: let’s look at some of the people named here as “disruptive digital leaders.” I confess I am not familiar with the work of Paul Courant and Wendy Lougee, but they are or were institutional librarians working, presumably, for better digital access of a specific collection for a specific group of library patrons.
The third person named, Brewster Kahle, is another kettle of fish entirely. As far as I can see, this individual has no respect for intellectual property and dreams, through the Internet Archive, of a vast global library serving billions of people based on only one copy of each book in it — because, of course, knowledge should be free, the mantra of the hour.
And what has Brewster Kahle done? Well, one thing he has done is made a fortune selling his . . . intellectual property. Wikipedia tells us:
In 1992, he co-founded, with Bruce Gilliat, WAIS, Inc. (sold to AOL in 1995 for $15 million), and, in 1996, Alexa Internet (sold to Amazon.com in 1999). At the same time as he started Alexa, he founded the Internet Archive, which he continues to direct.
I guess there are two kinds of intellectual property in this world: that which one can sell for a fortune, and that which should be liberated and made free to all, a task he has gladly taken on:
“For the cost of 60 miles of highway, we can have a 10 million-book digital library available to a generation that is growing up reading on-screen. Our job is to put the best works of humankind within reach of that generation. Through a simple Web search, a student researching the life of John F. Kennedy should be able to find books from many libraries, and many booksellers—and not be limited to one private library whose titles are available for a fee, controlled by a corporation that can dictate what we are allowed to read.”
Sounds great, for everyone except those who write and publish those books. He’s right: technology has now made this possible. The Internet Archive and all kinds of book-pirating sites are busy making it happen. But just because technology makes something possible, should that thing, and that thing only (without, for example, compensation for authors and publishers), actually happen? I don’t know if the book discussed here addresses that. The author of this post, in her gushing comments about intellectual property hypocrites like Brewster Kahle (watch me get rich while I digitize your book), appears to be all in favour. And no one in this community reading this has made a peep.