Today, journal content distribution is largely a publisher activity. Researchers are able to access journal articles through each of hundreds of publisher-specific content distribution websites. But, as the extensive use of Sci-Hub, repository versions, and other workarounds by entitled users makes clear, publishers are losing online traffic on their own platforms. What does this mean for the future of the publisher site and the hosted platform companies?
In their existing platform environment, publishers have struggled to manage the intersections of content discovery and access, publisher and publication brand and context, and business issues. All three of these aspects are vital individually. But publishers seem to have become accustomed to thinking that their content distribution strategy needs to address all three of these key elements in the same way. In recent years, some publishers and workflow providers have attempted to move towards a model that separates content distribution from the business issues of pricing and sales. In light of this, I want to explore the individual publisher content distribution strategy — and question the long-term role of publisher-specific content distribution platforms.
My piece today is focused on the issues facing journal publishers, particularly in the sciences. For a number of reasons, including the slower format transition to digital, content distribution models for books — especially for humanities and social sciences monographs — have developed differently. I will consider those separately in the future.
Three Content Distribution Models
We think of publishers as having platforms, but in fact publishers use many types of discrete systems. These vary from manuscript submission and management systems, which are increasingly important to publishers of all kinds, to user access tools. In this piece, I am focusing on the access platforms through which content is made available to readers, whether on a licensed or open basis. As we look across the industry, we see that three fundamental models for these types of platforms have emerged in different sectors or to serve different needs, recognizing of course that some publishers use more than one of these models for different categories of their content.
Some smaller publishers, for example a variety of scholarly societies, publish through a larger publisher. Wiley, Elsevier, Springer Nature, Oxford, and Cambridge are among those that have major businesses in publishing on behalf of smaller publishers. In these cases, pricing, marketing, and sales are generally handled by the publishing service provider. The smaller publisher essentially outsources much of the business side of publishing while retaining the editorial component and generating revenue from the deal.
Seeing the opportunity to help smaller scientific publishers maintain their independence while achieving scale, early leaders created alternatives, including Ingenta and Highwire Press. Thus began the era of each publisher having a branded site on common infrastructure. Today, many publishers contract with vendors, a group that has also come to include Wiley’s Atypon and the privately held SilverChair, to provide the technology for their distribution websites. Libero is one example of an open source effort to provide similar types of content hosting and distribution, albeit with a smaller current installed base. These common infrastructures allow the costs of developing these distribution platforms to be spread widely. Regardless of whether these platforms are cloud-based or locally hosted, they enable the publisher to retain control of its business — pricing, branding, and sales — as well as editorial control.
In the third route, we have seen a handful of the largest publishers build their own platforms. Elsevier and Springer Nature are examples of the class of publishers that develop platform infrastructure for many categories of their publications in house, and their platforms are intended to serve as a value differentiator. In Elsevier’s case, this is clearly connected with its intentions to integrate across a variety of platforms it owns — its journal distribution platform (ScienceDirect), with article preprint platforms (SSRN and bepress Digital Commons), with manuscript submission and management (Aries), with research information management (Pure), etc., as Lisa Janicke Hinchliffe and I have each detailed in a series of posts here at The Kitchen.
Not all of the largest publishers that have sought to build their own platform infrastructure have found that the resources required are justified by the value added. After efforts to build and maintain its own proprietary platform, Wiley eventually purchased Atypon, allowing it greater control than if it were just one of the many customers of this shared platform.
It is the dilemma of the smaller publishers, which could not possibly each develop their own platform technology, that I wish to consider, in contrast with the largest publishers, which have fairly clear strategic rationales for maintaining their own platforms. These smaller publishers face a choice: in Model 1, to outsource their business and thereby integrate with their business partner’s hosting platform, or in Model 2, to adopt a shared technology platform while retaining branding, pricing, marketing, and sales (or in some cases, to use a combination of the two). In neither case are they positioned to integrate with the elements of a broader reader workflow process (beyond reliance on third party discovery services for traffic flow), let alone a larger researcher workflow process.
The current situation has a number of fairly serious drawbacks for readers and publishers alike. Each publisher-specific platform represents a meaningful impediment to seamless access to content and use of it in an integrated fashion. The near-term drawback is in legitimate reader access to scholarly content. The broader drawback of publisher-specific access silos is the inability to work with and across content without substantial impediments. In a piracy-laden environment, one that is transitioning fitfully towards open, these are not only user impediments but also, as a result, business impediments.
Previously, I have examined the challenges in legitimate reader access to scholarly content, framing it as a problem in user experience. It is one of the issues leading users to gray market and pirate sites, and it is absolutely a key factor driving users, even those who have otherwise legitimate access to materials, to Sci-Hub.
Open access advocates are quick to pin blame for these challenges on authentication mechanisms. And, in a way, the STM community seems to have agreed that this is a key problem by fostering the RA21 initiative that is attempting to improve authentication paths to licensed resources.
But the challenges that are driving readers to Sci-Hub are not just about authentication. Even if authentication were successfully addressed, or no authentication were required at all, users would still be frustrated by publisher-specific platform silos. Today, they simply try to grab PDFs and pull them into a local context, because they cannot even imagine the kind of system and interface that would allow them actually to work with and across individual articles seamlessly. To whatever extent there is a longer-term transition towards working across content silos, whether through analysis of underlying datasets, machine analysis of articles, or a variety of other use cases, the drawbacks of having content siloed on an unnecessary number of websites will grow in significance. As one publishing technology executive explained to me a few months ago, RA21 without addressing these more fundamental challenges would be “a disembodied spirit,” lacking the key ingredients for solving the full user experience challenges, as I explain below, and not just the more limited authentication issues.
Cross-platform issues have been addressed to some degree, but largely in ways that chip away at the value of publisher brand. The “discovery service to access platform” workflow has been optimized to enable “grab and go” content seeking behaviors, whether from Google Scholar, Web of Science, Summon, Meta, Scopus, EDS, or the many others that compete to serve the discovery starting point role. Many of these services are moving beyond search as their primary use case, to include various kinds of increasingly advanced alerts that are steadily chipping away at the importance of the journal-specific table of contents alerts for current awareness. CrossRef’s efforts have enabled cross-platform content access. Each of these dramatic improvements in user experience addresses, in its own way, the shortcomings of publisher-specific content hosting while commensurately diluting the potential brand value to the individual publisher of maintaining its own hosting.
Aggregation vs. Syndication
There are two major alternatives to these efforts to work around the shortcomings of publisher-specific content distribution, both of which are about creating cross-publisher content hubs that ideally would contain “all the content”: aggregation, a widely known approach and a well established business model, and what I call syndication, which could emerge.
First, in aggregation, services like those from ProQuest and EBSCO license rights to redistribute selected parts of many publishers’ portfolios to a broader array of institutions than typically would license extensive sets of scholarly journals from the publishers directly. Aggregation models return revenue to the publishers, but typically not enough to serve as a replacement to direct subscriptions (although EBSCO in particular has some exclusive arrangements that push further in this direction). Aggregations represent a meaningful secondary source of income for publishers. But, given the competitive pressures between publishers and aggregators that Joe Esposito has discussed, only the most carefully balanced aggregation models provide a long-term stable relationship. Rarely, if ever, in scholarly publishing have they alleviated the need for the publisher to maintain a primary publishing platform and business.
It is possible that syndication can offer a more viable alternative. Unlike aggregation, syndication involves a clear separation between hosting and access provision, on the one hand, and business considerations such as pricing and sales, on the other. Through syndication, the primary publisher can continue to set its own prices at a title or bundled level. It transmits information about resulting entitlements to cross-publisher content hubs (which I have called “supercontinents”). Those cross-publisher hubs would provide content access to anyone with an appropriate entitlement. Through distributed usage logging, information about usage would be transmitted from the cross-publisher hubs to the primary publishers, to help them with author relationships and marketing as well as library marketing, pricing, and sales. Syndication is a strategy for publishers to retain at least some degree of control of their content, and of usage information about it, in an open access environment, by providing legitimate or preferred avenues for its distribution.
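To make the flow of information concrete, here is a minimal sketch in Python of the three steps described above: the publisher sends entitlements to a hub, the hub grants access against them, and usage is reported back per publisher. No such protocol has been agreed upon, so every name here (Entitlement, Hub, receive_entitlements, request_access, usage_report) is a hypothetical illustration, not a description of any existing system.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entitlement:
    """Hypothetical record: a publisher states that an institution may read a title."""
    publisher: str
    title: str
    institution: str


@dataclass
class Hub:
    """Hypothetical cross-publisher hub: checks entitlements and logs usage."""
    entitlements: set = field(default_factory=set)
    usage_log: list = field(default_factory=list)

    def receive_entitlements(self, batch):
        # Pricing and sales stay with the publisher; the hub only receives the results.
        self.entitlements.update(batch)

    def request_access(self, institution, publisher, title):
        granted = Entitlement(publisher, title, institution) in self.entitlements
        # Every request is logged, whether granted or denied.
        self.usage_log.append((publisher, title, institution, granted))
        return granted

    def usage_report(self, publisher):
        # Count granted accesses per (title, institution) for one publisher only.
        return Counter(
            (title, inst)
            for pub, title, inst, granted in self.usage_log
            if pub == publisher and granted
        )


# Illustrative usage with made-up publisher, journal, and institution names.
hub = Hub()
hub.receive_entitlements([Entitlement("Society Press", "Journal A", "University X")])
hub.request_access("University X", "Society Press", "Journal A")  # entitled
hub.request_access("University Y", "Society Press", "Journal A")  # not entitled
report = hub.usage_report("Society Press")
```

The point of the sketch is the separation of concerns: the hub never sees a price, and the publisher never needs to host the content, yet each publisher still receives usage information about its own titles.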
If syndication develops, it appears that these cross-publisher hubs will develop from some of the discovery services. Contenders include citation databases like Web of Science, Scopus, Dimensions, and their open alternatives, as well as other discovery services. Many of them are working to address the barriers from discovery to access, including various forms of direct linking that would culminate in the logical conclusion of enabling access directly on the discovery platform. Digital Science is perhaps furthest along in this transition, through Anywhere Access on Dimensions, which enables same-platform discovery and then access to an article, an early version of the supercontinent model. Depending on how Web of Science chooses to integrate Kopernio, and how Elsevier’s strategy proceeds, we may see strong supercontinents from these players as well. On the other hand, we could see emerge a series of hubs each with tools optimized to the workflows and methodologies of an individual discipline or disciplinary group.
Syndication might have different meanings across publishing houses. Some might wish to try to preserve their existing platforms and restrict syndication only to other types of content distribution. For example, some publishers or providers might urge that syndication be focused only on “scholarly collaboration networks” like Mendeley. But there is also every reason to wonder if ResearchGate could emerge as a cross-publisher content hub; the strategic tussle about ResearchGate is in my analysis one of the key factors impeding the move towards syndication. It is easy to see how ResearchGate blurs the lines between scholarly collaboration network and discovery service. Syndication can only proceed to the extent that a critical mass of major publishing houses can agree on a common set of protocols.
In recent years, some publishers have begun to face the prospect not of content “leakage” (which is inevitable in a digital environment due to piracy) but of actually losing control, as one executive described it to me, of the library channel. Syndication may not have been anyone’s ideal. But if it allows publishers to maintain the licensed content environment and provides a reasonable pathway towards an open access environment with a stable platform order, it may be adopted pragmatically. And if, as a result, it becomes possible for a publisher to control the terms of its publishing business while simply distributing its content through third party platforms, then syndication raises all sorts of questions for publisher technology investments.
Taken to the extreme, we might wonder for what purposes, if any, publishers would need to continue to have their own distribution platforms. Can syndication provide an alternative distribution model allowing publishers to reduce or eliminate their investments in branded silos? If publishers agree to pursue content syndication models, and a growing share of content access takes place through cross-publisher hubs, might we expect to see a meaningful shift in publisher investment away from publisher-specific hosting platforms?
Reflecting on brand, one of the most significant needs for publisher-specific access platforms may not be for readers but rather for author marketing purposes. Author marketing could perhaps take place through a slice of a supercontinent that is effectively branded by publisher and title for author marketing purposes. It might also become a separate interface altogether.
The most acute implications of syndication may be not for publishers themselves, most of which might be perfectly happy to redirect their technology investments, but rather for those that provide publishing platforms. What role will platform providers, such as Atypon, HighWire, Ingenta, and SilverChair, have for themselves if syndication takes hold? Is there a scenario where their publisher-specific platform offerings are no longer needed or in which the value that they afford declines substantially? Are some of them already in the process of transitioning to alternative models? Or might some or all of them (presumably, those with the least technical debt and the most strategic agility) see an opportunity to compete as one of the “supercontinents” or at least a disciplinary continent providing syndicated access to all content?
I thank David Crotty, Joe Esposito, Lisa Janicke Hinchliffe, Kimberly Lutz, and Todd Toler for helpful comments on drafts of this piece.