Patron-driven acquisitions (PDA) poses new marketing challenges for book publishers.
In the traditional model, most of a publisher’s marketing effort is expended around the time of publication. Just prior to publication, interest in the book is generated by catalogues, social media, the distribution of review copies, and so forth; in the period immediately following publication, there is a strong push to gain media coverage of the title. But over time the marketing effort tapers off. Titles continue to be listed in catalogues; sometimes new catalogues, online and off, are developed for specific audiences (e.g., a list of all works in anthropology for the anthropology mailing list), but a complete re-energizing of the marketing requires a new event: the publication of a paperback or electronic version of the book, perhaps, or the creation of a new edition, replete with a new introduction and updates to the text. As the book ages in the marketplace, the marketing staff turn away, directing their attention to the many new books coming through the pipeline. Historically, publishers have not been overly concerned with the overall life cycle of the book. Post-publication, a book becomes the concern of libraries and the individuals who purchase it.
With PDA, publishers face a situation where the actual purchase of a book could take place many years after publication. The largest share of purchases will continue to take place in the first year or two after publication, but some portion of a title’s overall sales will be pushed back, taking place only when a library patron makes a specific request. Publishers thus have the new marketing challenge of stimulating demand for many years. If they don’t, the book’s metadata will sit in library catalogues, but no one will seek to access the title. Without such a request for access, there will be no purchase. This is the essence of PDA.
How then to drive sales of PDA titles? The obvious answer is to continue to market the books, but this is easier said than done. Marketing costs money and only makes sense when the incremental cost of an ongoing marketing effort is more than offset by increased sales (specifically, the increase in gross margin must exceed the direct cost of the new marketing expenditures). For many academic titles, which may have a small commercial reach, the challenge is to find cost-effective ways to bring books to readers’ attention. That means (obviously) no television ads or full-page spreads in the New York Times Book Review, but it may also mean not being able to afford small advertisements in specialized scholarly journals, whether in online or offline editions. The unfortunate fact is that old books compete with new books for resources, and new books almost always win.
Publishers are thus likely to view marketing in the wake of PDA pragmatically. Since sales through PDA are triggered when a patron, viewing the library’s holdings in a catalogue, requests access to a title, publishers will seek (a) to have as many books represented in that catalogue as possible and (b) to have the descriptions of those books be as complete as possible so as to prompt a user to want to read more. Practical book discovery for PDA, in other words, is largely a matter of search-engine marketing, where the search engine or engines consist of a variety of tools that lead a user to view a specific book record.
Before digging deeper into searching the catalogue, it’s important to recognize that there are useful means of bringing users to the catalogue in the first place, all of which fall under the general rubric of search-engine marketing. For example, a publisher may have a regular blog to which authors contribute guest posts. While the direct or organic traffic to the publisher’s own Web site may be small, those blog posts will be picked up by people who begin their searches on Google or other Internet media. Let’s imagine, for example, an undergraduate who has to write a paper on the economic causes of the American Civil War. The student perhaps begins with Google Web Search and finds a link to a blog post by John Doe on that very subject. The link brings the student to the publisher’s blog and Doe’s post. The post cites Doe’s book on the Civil War, and the student then discovers that the book is listed in his or her institution’s catalogue. That listing may be part of the PDA program. The student requests access to the book and a transaction is consummated silently, without the student even knowing that the book was not owned by the library until that very minute. This marketing is not free, of course, as the publisher must allocate resources to creating and maintaining the blog, but it is a method for keeping books in front of prospective readers.
Some searches will begin not with Google or Twitter or any other Web service but with someone searching directly on the library’s catalogue. And here we have an interesting question: Just what is the library’s catalogue and how can a publisher influence what information gets placed within it?
It’s tempting to think not of the catalogue, singular, but of catalogues, plural, as libraries have multiple access points to their collections; it’s probably more accurate, however, to think of the catalogue with a series of overlays that make its contents more apparent or useful. To some extent, it all depends on the kind of questions you are asking. A librarian, for example, may use the catalogue to determine whether a particular print book is located in the main library or in an off-campus storage facility; or a librarian may simply be interested in assessing the inventory of the library’s holdings. A patron, on the other hand, is interested in what he or she can get access to. For inventory purposes, a title in a PDA program may not be viewed as part of the library’s inventory (because it has not yet been purchased), whereas for discovery purposes, the title is very much a part of the catalogue because, if it is available as an ebook, it can be purchased in an instant.
Thus, for publishers, it is the discovery service, not the underlying catalogue itself, that is of greatest interest. Most libraries use one of five options for patron discovery: a proprietary search interface (only the largest libraries would invest in this); the Summon service, offered by Serials Solutions/ProQuest; EDS (EBSCO Discovery Service); WorldCat Local, a service of OCLC; and Primo, a service of Ex Libris. This is where publishers need to concentrate their efforts, as strong representation of a title in these services could lead to more patrons discovering and requesting it.
Let’s look at this by choosing a single title as an example: 1491 by Charles Mann. This book was published by Random House, the largest trade house, which has no problem getting its books distributed just about everywhere. If you do a search for this title on WorldCat, you will see that the book is available in thousands of libraries. If you go to the publisher’s site, you will find a succinct description of the book. The presentation at Amazon is much more extensive than that on the Random House site. The representation at Amazon’s direct competitor, Powells.com, is good, but not nearly as fleshed out as on Amazon. The information about the book is far more limited in the catalogue of the Princeton University Library, so Amazon is making a greater effort to make the book discoverable than the publisher, at least one competitor, and one major research library. (Oddly, when you search the Princeton catalogue for 1491, you are brought initially to the record for 1493, the author’s follow-up book.) Interestingly, the University of Chicago Library has a much longer entry for this title, and at the bottom of the page there is a reference to Syndetics, a unit of R.R. Bowker (and linked corporately to Serials Solutions and ProQuest). Syndetics is credited with having provided enriched metadata for this entry. Apparently the Chicago library staff has a strategy of increasing discoverability by adding metadata to its catalogue. This is precisely what publishers would want.
If you surf through dozens of libraries, working with a group of titles for which you compare entries, you will find that some libraries have very extensive descriptions of the books in their catalogues, some have brief listings, some include outside reviews, some have links to the GoodReads Web service, and some include links back to Amazon, where a patron can make a purchase. (It’s controversial in library circles, but I would like to see libraries get a commission for referring customers to Amazon.) It’s all over the place. For publishers, this means that one important marketing goal should be to raise the level of metadata in all library catalogues. The technical staff at publishing companies may wish to review the work of Ken Chad, who studied the role of metadata in PDA in a JISC-sponsored survey.
Ironically, every year publishers provide their metadata for free to Bowker, which proceeds to package it and sell it to retailers, wholesalers, and libraries. It would be in the publishers’ interest if all that metadata were freely distributed to libraries. In this regard, it’s interesting to contemplate the precedent of Oxford University Press, which makes the metadata for all its books available as a free download from the OUP site. If OUP’s books have better metadata associated with them across the Internet, it is not an accident.
Let’s be clear about what’s at issue here. PDA potentially erodes the sale of books, but publishers have means to offset this. Good metadata enables better discovery, which in turn leads to more patron interest in a title–and that leads to a request for access and a sale or rental from a PDA program. Publishers have been inconsistent about creating and distributing this metadata, which in turn leads to lower sales. An effective marketing program for publishers operating in the PDA environment is to create high-quality and extensive metadata and see that it gets distributed to discovery services. Publishers should not stop providing information to Bowker, but they may wish to begin to provide it directly to libraries free of charge.
This is a moving target, and it is moving in the right direction. New services are now springing up that provide apps for libraries’ Facebook pages. An interesting aspect of these apps is that they permit users to enhance the descriptions of books. Thus we now have the authoritative information about a book provided by the publisher mingling with user-generated content, not unlike what we find on the Amazon site. The tools of discovery continue to grow.
Discussion
16 Thoughts on "The Problem of Discovery for Patron-driven Acquisitions (PDA)"
In the article world, full text search is vastly superior to metadata for most content discovery. This should be even more true for books, which have much more text per title. Is no one offering full text search on books? No publisher or catalog service? Sounds like an opportunity.
David – surely it depends on the quality of your metadata? I would argue that a search of a well structured and indexed A&I database will give better results than a search of unstructured full-text content. So the message is that book publishers need to be ensuring that their e-books (preferably at the chapter level) are made available to A&I database publishers for indexing; I wonder how many book publishers do this routinely?
Andrea, quality is not the issue. A ten sentence abstract of a thousand sentence document misses almost all of the information in the latter. Discovery means looking for all sorts of things, not just the central topic. Nor is full text unstructured. Books and articles are well structured. Moreover, full text search can utilize advanced semantic algorithms.
Hi, David. I guess it depends what you’re looking for, and in these days of searching for additional insights and “knowledge nuggets”, a full-text search is always going to trump metadata. If what you’re after is a relevant list of references, then the specialist search engine wins (in my opinion). Horses for courses, I guess!
Indeed Andrea, and abstracts are central to discovery. I use them all the time, especially when the document is centered on my topic of interest. But there are many different kinds of discovery, which is something I have been studying for a long time. Mostly in the context of research reports and journal articles, which is why I asked about searching books, about which I know next to nothing.
I am particularly interested in the crosscutting use of methods in science, which I discuss here: http://scholarlykitchen.sspnet.org/2011/10/11/my-utopian-vision-for-communication-of-scientific-methods/. Methods are often not mentioned in abstracts, or only briefly, because the focus is the findings.
When the devil is in the details, full text search is my church.
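David’s argument can be made concrete with a toy sketch. The documents and query below are hypothetical; the point is simply that a method named only in the body of a work is invisible to abstract-only keyword search but found by full-text search.

```python
# Toy illustration: abstracts summarize findings, so a method named only in
# the body of a document cannot be matched by abstract-only search.

abstract = ("We find that tariff disputes were a significant economic "
            "cause of the American Civil War.")
full_text = abstract + (
    " Our analysis uses county-level census records and a "
    "difference-in-differences design to isolate the effect of "
    "tariff policy on sectional trade."
)

def matches(query, text):
    """Naive keyword search: every query term must appear in the text."""
    text = text.lower()
    return all(term in text for term in query.lower().split())

query = "difference-in-differences census"
print(matches(query, abstract))   # False: the method is never named in the abstract
print(matches(query, full_text))  # True: full-text search finds it
```

The same query fails against the abstract and succeeds against the full text, which is the asymmetry at issue in this thread.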
In a perfect world you would have both excellent metadata and full-text search capability, since they perform different functions entirely. Keyword searching doesn’t adequately express abstract concepts or overarching themes of a book or article. Broader subject terms used in library catalogs and A&I databases group similar topics, allowing for browsing. Metadata represents human intervention that adds meaning; simple keyword matching doesn’t come close.
Patron-driven acquisitions relies on searchers finding, and finding requires the right tool for the job.
Rebecca, I agree that human classification is useful, as long as the particular classification in question is the one the user is seeking. The problem here is that there are many ways to classify the same content. Suppose there are ten ways, and the interests of the users are equally divided among them. Then any given classification satisfies only 10% of the users. Even worse, it misclassifies the content from the perspective of the other 90% of users.
Or as I said, a ten sentence abstract of a thousand sentence document misses almost all of the content. Abstraction always loses content, hence the name. Abstraction is useful in its place, but it is no substitute for search.
Moreover, we have come a long way from “keyword searching.” We now do complex semantic analysis with relevance ranking. I myself recently developed an algorithm that finds most of the journal articles that are related to a given topic, and ranks them by closeness. I use term vector similarity, in which every word used in every article counts. There are no keywords. All that matters is that people are talking about roughly the same thing. How roughly is what we measure.
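A minimal sketch of that idea, using raw word counts and cosine similarity over a toy corpus. This illustrates term-vector ranking in general, not the specific production algorithm described above; the documents are invented for the example.

```python
# Term-vector similarity: each document becomes a vector of word counts,
# and candidate documents are ranked by cosine similarity to a topic
# document. Every word contributes; there are no designated keywords.
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

topic = term_vector("economic causes of the civil war tariffs and trade")
articles = {
    "tariffs": term_vector("tariffs trade and the economic roots of the civil war"),
    "botany":  term_vector("leaf morphology in alpine flowering plants"),
}

# Rank articles by closeness to the topic, most similar first.
ranked = sorted(articles, key=lambda k: cosine(topic, articles[k]), reverse=True)
print(ranked)  # → ['tariffs', 'botany']
```

Real systems add weighting (e.g., down-weighting very common words) and far larger vocabularies, but the ranking mechanism is the same: documents about roughly the same thing end up with similar vectors, and "how roughly" is what the cosine measures.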
Of course it all depends on what the user is looking for. My interest is in people who are trying to understand what is going on in a particular area of research. Where science is and where it is going, in detail. But if what you are looking for is a broad categorization, then metadata is fine.
Metadata properly formed is superior to full-text searching if the aim is marketing, which was the point of the post. Metadata may also be superior for information discovery that is not intended to lead to sales, but in fact we don’t have any evidence of this.
Joe, I do not see how marketing is a special case where metadata is superior. Perhaps you can elaborate on your claim. I conjecture that people buy books for just as many different reasons as they read articles. If so then being able to find books based on what they actually say, as opposed to what someone says they say, will be superior in many cases.
David, distribution of metadata is something the publisher has, or can have, control of, because they can distribute this concise, searchable data when and where they see fit. Don’t semantic searches of large bodies of texts require real-time or near-real-time access to the sources one wishes to search? That doesn’t sound very portable or flexible from the publisher’s perspective. So I do believe that they are separate use cases. In other words, marketing requires nimbleness, while research requires depth and traction.
Greg, I actually raised this point as a question, not a plan. But I think a publisher should want to make everything they publish full text searchable, so people can find it to buy, especially under the long time frame Joe describes for PDA. And aggregators should want to aggregate these full text indexes and libraries should include them in their PDA catalogs. Why not?
I’m joining this conversation a little late, but I do agree with David regarding the impact of full-text searching. I don’t think the choice here is either/or; both rich metadata and searching of the full text are needed to maximize discoverability.
I believe that full text searching using the Summon discovery layer is possible if the publisher provides the full text to ProQuest to ingest into the Summon index. I’m not sure if any book publishers are providing this content, but it would certainly boost their PDA sales if they did.
Joe is overlooking one very important source of later marketing: book reviews that appear in professional journals over an extended period of time. Publishers have often complained about how long it takes for book reviews to appear, sometimes so long after publication that in the print era the book was no longer available to purchase! But what had been an annoyance then can become a virtue now, in the era of PDA, as a stream of reviews appearing over several years can serve to keep a scholarly book at the forefront of people’s minds. This isn’t to say that publishers cannot do more to take advantage of such reviews, e.g., by linking to them from the web page for a book on their own site and otherwise integrating them into book metadata.
This is an articulate summary of the issues around PDA. I think an extremely critical aspect of this new purchasing paradigm is getting quality overviews of new academic works into the library catalogs, where they become subject to discovery, with options for the library to acquire the material immediately (ebook download) or through more conventional acquisition in print or ebook, even interlibrary loan. Maintaining the database of reviews over extended time periods is important for “the long tail” purchase in the social sciences and humanities. Sources other than publisher blurbs are important for their more independent, economically detached reviews; for example (and not mentioned above), Book News’ “Reference and Research Book News” database, which covers some 18,000 titles yearly and, over thirty years, some 400,000 titles.
Reblogged this on Books, Libs, Scripts and commented:
Transforming library online public access catalogs into a more Amazon-like experience, a transformation that carries substantial value for library users as well as for publishers dealing with the advent of PDA, calls for, I believe, not only more data/metadata but also superior search algorithms and the inclusion of new functionalities like an “explore similar items” feature.