Three years ago, I felt called to the unhappy task of pointing out the many points of failure in what Lettie Conrad calls the “researcher experience.” I observed that “Instead of the rich and seamless digital library for scholarship that they need, researchers today encounter archipelagos of content bridged by infrastructure that is insufficient and often outdated.” Perhaps researchers need a supercontinent.
Since then, Sci-Hub has come on the scene, and publishers are outraged, scared, or both. It may be that these businesses are too late. The formula for stabilizing a sector facing rampant piracy is the combination of legal action and seamless central access to content that allowed the music industry to find a future after Napster. Thus far, for scholarly publishers, legal action is not working, with cross-border enforcement challenging in this geopolitical moment. But what about the seamless centralized access to content? How is this sector going to accept the tectonic shift necessary to establish the supercontinent?
One of the underlying problems, as Toby Green eloquently described in a recent presentation, is that we have actually taken a step backwards. In the print era, the library pulled in content from many publishers and through cataloging, classification, and shelving, merged them together to provide something like a single “platform” for user access (at least in the idealized case of the comprehensive research library). In the digital era, each publisher has established its own content platform, to the detriment of the researcher experience. Discovery is fragmented, leading to substantial library investment in order to provide single-index whole-collection search. Access is difficult, since authorization must take place separately for each platform. The personalization that researchers crave to improve their productivity is challenging, since user accounts are siloed on a platform basis. Band-aid solutions like RA21 will do nothing more than attempt to build bridges across archipelagos, rather than solving the underlying fragmentation problem.
So, let us consider another model. I have previously examined whether a single user account could carry with it both credentials to permit access and data to enable personalization. Another approach would be to establish a truly central platform for discovery, access, and use. Let’s begin by imagining what a truly seamless platform would entail:
- “Everything” can be discovered, regardless of publisher, and
- “Everything” can be accessed directly, through institutional licenses where those exist (and possibly through content sales otherwise), without leaving the platform.
Such a platform would help publishers monetize their existing library licenses by counting article accesses that go beyond the publisher platform, although potentially at the cost of publisher branding, advertising, and other non-subscription revenue. And, it would still be valuable were today’s hybrid environment to yield to full open access, enabling discovery, personalization, and use.
In the three sections that follow, I group some key emerging solutions in the ways that other publishers might see them, distinguishing between Publisher-Related Models, Other Version of Record Models, and Disruptive Models.
Publisher-Related Models
Several scientific publishers are repositioning themselves to focus on all-content discovery and access. In addition to the overall research workflow suites that each has developed to ensure deeper access to the scientific research enterprise and the data and analytics businesses they believe they can foster, there is very interesting activity taking place to reformulate the distribution channels. Elsevier’s suite of discovery services includes Mendeley and Scopus. Digital Science recently launched Dimensions, and it is included in this section because it is owned by Holtzbrinck, which is also the majority owner of Springer Nature. Whether these majors will find a way to shift the archipelago into a supercontinent is a key strategic question.
Elsevier’s Mendeley has proposed that it manage compliance with institutional licensing while providing usage data to publishers. It is not interested in facilitating content sales for other publishers. If the underlying model can be made to work in Mendeley, there is no reason to think it should not be extended to other Elsevier properties, such as Scopus. Whether other publishers will choose to enable Elsevier to become a more active distributor of their content remains to be seen.
Digital Science released its Dimensions service, in many ways a competitor of citation databases like Scopus and Clarivate’s Web of Science. One of the key differentiators is that Dimensions includes ReadCube Discover integration, enabling on-platform reading of content. ReadCube has a longstanding business in helping publishers monetize their content through article sales to individuals, which can be seen as an added feature. Although Dimensions is by no means fully built out, and ReadCube Discover is missing several key content partners, this model is more realized as a full discovery and access platform than any other.
Transforming from being a publisher to being a platform for all content is a tricky matter, since the complete platform must be at least plausibly neutral in driving access across all publishers. Interestingly, these services’ greatest value will come through current awareness, where plugging into the research workflow offerings that both Elsevier and Digital Science are building out is vital.
Other “Version of Record” Models
We’ve seen efforts to build the supercontinent in other parts of the content provider community as well. For example, both EBSCO and ProQuest elected to build services through which all content not directly available through their own aggregations can be discovered. These services included not only the publisher “version of record” but also alternative sources such as these providers’ own licensed aggregations. This led to a period of questioning the neutrality of the “master switch” that each was trying to develop, as to whether the discovery service each provided would favor its own content platform or other preferred sources. But, whatever skepticism there has been, it did not stop numerous academic libraries from adopting these discovery systems as the default search service for their communities. The problem has been that, notwithstanding efforts to improve the linking experience, too small a share of content is directly accessible, and the experience associated with linking to hundreds of other platforms constitutes a stumbling block in all too many cases. Can these services provide a CASA-like model that reduces the barriers to access substantially, as has been proposed for Primo?
Those without any apparent content interests of their own are nevertheless taking an interest in the supercontinent role as well. David Worlock recently wrote with remarkable foresight about the importance of the Kopernio model. Regardless of green, gold, or licensed status, Kopernio wants to help the user gain access to the content while helping the publisher get recognized for the content access (and as a result at least thus far does not work with non-publisher platforms such as EBSCO). The recent news that Clarivate acquired Kopernio raises interesting questions about whether its Web of Science, EndNote, and other Clarivate properties could over time grow into a supercontinent for discovery and access. Will Annette Thomas’s assertion that “Clarivate is the one company that is truly neutral in the researcher ecosystem” serve as an important competitive differentiator in the supercontinent role?
To make this work either through publishers or through community third parties requires several components, only some of which currently exist:
- A database of “all the content,” in a form suitable not just for indexing/discovery but also for providing access.
- A list of user entitlements based principally on library licenses (apparently existing knowledge bases are considered to be inadequate for this purpose). Kopernio declined to explain to me how they have assembled this feature, calling it their “secret sauce.” Ideally, this approach should recognize that users have multiple affiliations and therefore many sources of “entitlement.”
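The entitlement component described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not a description of how Kopernio or any other service actually works (that remains undisclosed). Every name here (`License`, `User`, `entitling_institutions`, `can_access`) is hypothetical, and licenses are modeled at the publisher level, whereas real library licenses are typically scoped by title and date range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    """Hypothetical record: an institution licenses a publisher's content."""
    institution: str
    publisher: str


@dataclass(frozen=True)
class User:
    """Hypothetical user who may hold several affiliations at once."""
    name: str
    affiliations: frozenset = frozenset()


def entitling_institutions(user, publisher, licenses):
    """Return, in sorted order, the user's affiliations that license this publisher."""
    return sorted(
        lic.institution
        for lic in licenses
        if lic.publisher == publisher and lic.institution in user.affiliations
    )


def can_access(user, publisher, licenses, open_access=False):
    """Open access content is always available; otherwise any one entitlement suffices."""
    if open_access:
        return True
    return bool(entitling_institutions(user, publisher, licenses))


# Illustrative data only: the institutions and publishers are invented.
licenses = [
    License("University A", "Publisher X"),
    License("Hospital B", "Publisher Y"),
]
researcher = User("researcher", frozenset({"University A", "Hospital B"}))

print(entitling_institutions(researcher, "Publisher Y", licenses))
print(can_access(researcher, "Publisher Z", licenses))
```

The point of the sketch is that entitlement is checked across all of a user's affiliations rather than against a single "home" institution, and that recording which institution granted access is what would let a platform report usage back to the right license.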
Disruptive Models
Libraries and publishers alike should be examining the possibility of disruptive models as well. Models that disintermediate the publisher’s version of record may be on the horizon.
In monitoring the Chan Zuckerberg Initiative, which purchased Meta and has invested in bioRxiv, there is reason to wonder about its plans. Joe Esposito has written that “Meta did not set out to disrupt publishers, nor is it interested in competing with what publishers do,” but what approach its new owners may take remains to be seen. Is there a road to build a community platform that avoids publisher engagement altogether?
In acquiring SSRN and bepress, Elsevier is also developing the opportunity to make an end run around publishers with which it competes. Through these services, it now has the ability to substitute the preprint for other publishers’ version of record in providing seamless access to an increasing share of the published literature. It could over time deploy this ability in some very interesting ways relative to other publishers.
ResearchGate has tried to serve as a disrupter in this space, providing one-stop discovery and access, but its model has not been compatible with publisher interests. As publishers look to put an end to ResearchGate’s violations, the split between Elsevier and Springer Nature in terms of tactics has been fascinating. Just as Mendeley’s model became more publisher-friendly after its acquisition by Elsevier, is it possible for ResearchGate to join the community somehow as well? Or will it continue to act as a disrupter?
Google Scholar has actually moved in the other direction. Recognizing what a very high share of discovery it accounts for, Scholar has created Campus-Activated Subscriber Access to help reduce barriers to off-site access for its users that have licensed access to content. This will help to further secure Scholar’s dominance as a discovery starting point, potentially reifying the archipelago dynamic.
The shift from archipelagos to supercontinent is not without potential downsides. Any service in this space can be expected to dramatically reduce the usage of publisher platforms (even though it might increase the aggregate use of publisher content). This raises real questions for Atypon, HighWire, and Silverchair, among others. For publishers, though, this could work well, allowing them to reduce over time the investment in their own platforms.
But at what cost? There have been substantial behind-the-scenes tussles associated with the business models for discovery services. Should publishers pay to have their content included? Will they be paid for contributing their content? With respect to the library discovery services, a truce appears to have emerged in which there typically is no direct exchange of money in return for content inclusion. But, much depends on the comparative importance of the discovery service to the publisher and the content to the discovery service, and this model can shift over time. Will libraries pay to provide these platforms for their affiliates, as they do now for discovery services? Or will these services be monetized through advertising or analytics? The business models for these emerging platforms vary widely.
The key for publishers is to ensure that the community platforms support the version of record and, while it persists, the library licensing model. Risks will come about from models that do not support a version of record. Models that link to preprints should be seen as worrisome to the publishers, since they reduce usage of the version of record, and in turn are beneficial to libraries that would like to disrupt the current system.
In that sense, publishers of all stripes may wish to think about how to shape this emergent platform environment. Several continents are less worrisome than one supercontinent, it can be argued, since a single supercontinent would produce unmanageable dependence for publishers. So, if Google or another disrupter were to take a more aggressive posture, it might motivate other publishers to take shelter with Elsevier or others in the community, but then it may well be too late to shape this environment.
Libraries should also be pushing hard to shape this environment in ways that are right for their institutions and their users. Where is the seamless library-community-controlled platform for discovery and access? Does SHARE have this in its ambit? Does OCLC? The time to enable not only seamless but also strategically sound discovery and access is yesterday.
It is also notable that several of the version of record models are being offered by companies with a publishing interest of their own. By taking a more active role in discovery and access for their publishing competitors, Elsevier and Holtzbrinck (through Digital Science) are placing themselves in a highly influential position at key chokepoints. While their products in this space may be entirely publisher-neutral today, it will be important for other publishers to ensure that their agreements with them contain no hidden switches that can be thrown down the road to enable preferential treatment, as well as to understand how current awareness fits in with other aspects of the workflow empires that both of these companies are cobbling together. And, although in terms of current ownership there are some key differences between Digital Science and Clarivate, it is important to recognize that the long-term ownership of both is uncertain.
Will a supercontinent emerge for discovery and access? Time will tell. Meanwhile, users have long since grown impatient with the wait.
I thank Todd Toler of Wiley for organizing a panel at the recent STM US Annual Conference, which he invited me to chair, that motivated me to explore these issues anew. The panel included fantastic contributions from Gaby Appleton, Managing Director of Mendeley (and former Director of Strategy for its parent Elsevier), Yann Mahé, Director Innovation & Business Development, MyScienceWork, Roy Kaufman, Managing Director, Business Development, Copyright Clearance Center, and Rob McGrath, CEO of Holtzbrinck/Digital Science ReadCube. I also thank Angela Cochrane, Danielle Cooper, Lisa Janicke Hinchliffe, and Kimberly Lutz for reviewing a draft.