I recently attended the Midwinter ALA conference in Dallas. The longest line was at Starbucks.
What brought me to Dallas was an invitation to participate in a panel sponsored by Sage on discovery. Sage had commissioned a white paper on the topic, which you can find here. It’s a good paper, which I recommend to one and all. The authors of the paper were on the panel as well and spoke authoritatively to the question of how libraries are implementing new means of discovery. I had not realized how complicated the situation was. What was clear to me, though, was that librarians and publishers alike have an interest in improving finding mechanisms: librarians because they want their collections to be used, publishers because such usage translates into a stronger brand, which can help in making the case to purchase the next product.
I should note that discovery is one of two words in scholarly communications (the other is curation) that leave me not entirely satisfied. This has to do with where you imagine yourself sitting in the value chain. My bias is on the publishing side, where capital is injected into the system. For a publisher, the better terms are marketing and editing. My definition of discovery is what happens after the marketing is successful. Discovery is about finding things, but marketing is about coaxing people into finding the things that you want them to find. You discover things in many ways; for example, you might type some keywords into Google. Marketing is what put those words into your head in the first place. I searched Google for the schedule of the latest episode of Downton Abbey, but I went looking for that schedule because of the immense and brilliant media blitz PBS brought to that property. Discovery is indeed a librarian’s concern; it’s what you do after the product has already been purchased.
My beef with curation is of the same order. Editing is a creative act. Editors work with authors, identifying what is good, breakthrough, and promising, and assisting in the shaping of the work. Curation is, like discovery, an after-the-fact activity: once you have something in your collection, how do you care for it? There is nothing wrong, and everything right (I am trying to forestall all the nasty comments), with discovery and curation, but they represent a library-centric way of looking at things. We need libraries, we need bigger libraries, and we need more libraries. And, by the way, we also need more librarians. A publisher-centric way of viewing the world, in contrast, would lead us to declare that we need editorial selection, that we must reject many more works than we publish (hurrah!), that we will stick our necks out with a capital investment to back the works we select, and that in order to earn that investment back and then some, we need to bring the works to everyone’s attention. All eyes turn to the trembling head of marketing.
There is an elephant in the room — or perhaps just a multi-volume encyclopedia — when we talk about discovery, and that is that more and more libraries are reporting that over half of all searches on their collections are coming from Web-scale search engines, primarily Google and Google Scholar. So the issue of discovery inside the library is also a matter of discovery outside the library. This is where things get complicated. Publishers, seeking the attentions of Google, drop a handkerchief in the form of metadata and sample text on the open Web, but if the purchased content resides in a scattered manner among the world’s libraries, how best to direct a user to the place where he or she has unfettered access? There are mechanisms for this, but cooperation among publishers and librarians is essential to ensure that users leave the table satisfied that rich content has been served.
All this has put new pressure on creating stronger discovery engines within the library context, not to compete with Google on the open Web, but to point users precisely at the content to which they have authorized access. It may be that Google will still be the predominant agent of discovery, but local search tools have the advantage of showing users only what is theirs to view. A representative from Summon, a service of Serials Solutions, participated on the ALA panel, and there was also mention of EDS (EBSCO) and Primo. These services and others are now competing to be the primary means by which users find things in library collections.
Search, whether in-library or out, is a winner-takes-all game. If one search engine gets ahead of the others in terms of market share, it can use the greater amount of information gained from that larger share to improve the search service. An improved search service is likely to win over new libraries and users, which will in turn improve market share further. This is known as the law of increasing returns. While it is unlikely that anyone will achieve a 100% market share, we should be prepared for the day when the bulk of library collections are mediated by the search service of a single commercial vendor, a vendor (such as Serials Solutions/ProQuest or EBSCO) that may be tempted to steer search to properties under its own control. Local discovery tools, in other words, could evolve into marketing tools. Am I finding the best document to answer my question or am I finding the document in which the search engine’s management has an interest? Already some people are whispering about a lack of impartiality on the part of some discovery tools (I have no evidence of such bias myself), and one can be sure that all the discovery services are going to be closely scrutinized, especially by the publishers who feel that they are not getting their due.
Library discovery, in other words, is not simply a means of satisfying users; it is also a competitive battlefield.
We should look for discovery services with a big market share to begin to think about the value of their aggregate data. This could become a new business in itself. When you provide search services for one institution, you collect usage data for that one institution. But when you provide such services for one thousand institutions or more, you can compare the research activities across institutions. Let’s take a leap here and imagine a service for information usage that is analogous to the stock market indexes assembled by Standard & Poor’s. My institution prides itself on competing favorably with that other institution with the grander name, but the new information index provides real data on what is used, how it is used, and, very importantly, how much it is used. This cross-institutional data (which is owned by commercial entities) thus becomes a proxy for the research activity of the university itself. So we may see a day when we routinely rank universities by how they score in the Summon or Primo Information Index, cousin to the S&P 500. Provosts may want to take a look at this and may instruct their libraries to subscribe to the new service. Thus collection management spawns new services, which put an incremental tax on the library’s budget.
Like Columbus, we discover nothing that has not been discovered before. But in our rediscovery, we add new layers of meaning to the process, out of which come new products and services.
A footnote to this discussion: the Elsevier ALA reception was in the unfortunate situation of competing with the PBS broadcast of Downton Abbey, which aired at 8:00 P.M. Central Time. As librarians left the Elsevier dessert hall (adorned with two live bulls — this was Dallas, after all) to get back to the TVs in their hotel rooms, we saw yet another instance of the consumer market’s intrusion into academic publishing.