Three years ago, I felt called to the unhappy task of pointing out the many points of failure in what Lettie Conrad calls the “researcher experience.” I observed that “Instead of the rich and seamless digital library for scholarship that they need, researchers today encounter archipelagos of content bridged by infrastructure that is insufficient and often outdated.” Perhaps researchers need a supercontinent.
Since then, Sci-Hub has come on the scene, and publishers are some combination of outraged and scared. It may be that these businesses are too late. The formula for stabilizing a sector facing rampant piracy is the combination of legal action and seamless central access to content that allowed the music industry to find a future after Napster. Thus far, for scholarly publishers, legal action is not working, with cross-border enforcement challenging in this geopolitical moment. But what about the seamless centralized access to content? How is this sector going to accept the tectonic shift necessary to establish the supercontinent?
One of the underlying problems, as Toby Green eloquently described in a recent presentation, is that we have actually taken a step backwards. In the print era, the library pulled in content from many publishers and through cataloging, classification, and shelving, merged them together to provide something like a single “platform” for user access (at least in the idealized case of the comprehensive research library). In the digital era, each publisher has established its own content platform, to the detriment of the researcher experience. Discovery is fragmented, leading to substantial library investment in order to provide single-index whole-collection search. Access is difficult, since authorization must take place separately for each platform. The personalization that researchers crave to improve their productivity is challenging, since user accounts are siloed on a platform basis. Band-aid solutions like RA21 will do nothing more than attempt to build bridges across archipelagos, rather than solving the underlying fragmentation problem.
So, let us consider another model. I have previously examined whether a single user account could carry with it both credentials to permit access and data to enable personalization. Another approach would be to establish a truly central platform for discovery, access, and use. Let’s begin by imagining what a truly seamless platform would entail:
- “Everything” can be discovered, regardless of publisher, and
- “Everything” can be accessed directly, through institutional licenses where those exist (and possibly through content sales otherwise), without leaving the platform.
Such a platform would help publishers monetize their existing library licenses by counting article accesses that go beyond the publisher platform, although potentially at the cost of publisher branding, advertising, and other non-subscription revenue. And, it would still be valuable were today’s hybrid environment to yield to full open access, enabling discovery, personalization, and use.
In the three sections that follow, I group some key emerging solutions in the ways that other publishers might see them, distinguishing between Publisher-Related Models, Other Version of Record Models, and Disruptive Models.
Publisher-Related Models
Several scientific publishers are repositioning themselves to focus on all-content discovery and access. In addition to the overall research workflow suites that each has developed to ensure deeper access to the scientific research enterprise and the data and analytics businesses they believe they can foster, there is very interesting activity taking place to reformulate the distribution channels. Elsevier’s suite of discovery services includes Mendeley and Scopus. Digital Science recently launched Dimensions, and it is included in this section because it is owned by Holtzbrinck, which is also the majority owner of Springer Nature. Whether these majors will find a way to shift the archipelago into a supercontinent is a key strategic question.
Elsevier’s Mendeley has proposed that it manage compliance with institutional licensing while providing usage data to publishers. They are not interested in facilitating content sales for other publishers. If the underlying model can be made to work in Mendeley, there is no reason to think it should not be extended to other Elsevier properties, such as Scopus. Whether other publishers will choose to enable Elsevier to become a more active distributor of their content remains to be seen.
Digital Science released its Dimensions service, in many ways a competitor of citation databases like Scopus and Clarivate’s Web of Science. One of the key differentiators is that Dimensions includes ReadCube Discover integration, enabling on-platform reading of content. ReadCube has a longstanding business in helping publishers monetize their content through article sales to individuals, which can be seen as an added feature. Although Dimensions is by no means fully built out, and ReadCube Discover is missing several key content partners, this model is more realized as a full discovery and access platform than any other.
Transforming from being a publisher to being a platform for all content is a tricky matter, since the complete platform must be at least plausibly neutral in driving access across all publishers. Interestingly, these services’ greatest value will come through current awareness, where plugging into the research workflow offerings that both Elsevier and Digital Science are building out is vital.
Other “Version of Record” Models
We’ve seen efforts to build the supercontinent in other parts of the content provider community as well. For example, both EBSCO and ProQuest elected to build services through which all content not directly available through their own aggregations can be discovered. These services included not only the publisher “version of record” but also alternative sources such as these providers’ own licensed aggregations. This led to a period of questioning the neutrality of the “master switch” that each was trying to develop, as to whether the discovery service each provided would favor its own content platform or other preferred sources. But, whatever skepticism there has been, it did not stop numerous academic libraries from adopting these discovery systems as the default search service for their communities. The problem has been that, notwithstanding efforts to improve the linking experience, too small a share of content is directly accessible, and the experience associated with linking to hundreds of other platforms constitutes a stumbling block in all too many cases. Can these services provide a CASA-like model that reduces the barriers to access substantially, as has been proposed for Primo?
Those without any apparent content interests of their own are nevertheless taking an interest in the supercontinent role as well. David Worlock recently wrote with remarkable foresight about the importance of the Kopernio model. Regardless of green, gold, or licensed status, Kopernio wants to help the user gain access to the content while helping the publisher get recognized for the content access (and as a result at least thus far does not work with non-publisher platforms such as EBSCO). The recent news that Clarivate acquired Kopernio raises interesting questions about whether its Web of Science, EndNote, and other Clarivate properties could over time grow into a supercontinent for discovery and access. Will Annette Thomas’s assertion that “Clarivate is the one company that is truly neutral in the researcher ecosystem” serve as an important competitive differentiator in the supercontinent role?
To make this work either through publishers or through community third parties requires several components, only some of which currently exist:
- A database of “all the content,” in a form suitable not just for indexing/discovery but also for providing access.
- A list of user entitlements based principally on library licenses (apparently existing knowledge bases are considered to be inadequate for this purpose). Kopernio declined to explain to me how they have assembled this feature, calling it their “secret sauce.” Ideally, this approach should recognize that users have multiple affiliations and therefore many sources of “entitlement.”
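To make the second component concrete, here is a minimal sketch, assuming a very simplified world in which entitlement is granted at the publisher level. Every name in it (the `LICENSES` table, `resolve_access`, the institutions, publishers, and DOIs) is hypothetical and invented for illustration; it does not describe how Kopernio or any other service actually works. It shows only the two ideas mentioned above: a user may hold several affiliations, any one of which can supply the entitlement, and each successful access is counted for the publisher.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Article:
    doi: str                  # hypothetical identifiers, for illustration only
    publisher: str
    open_access: bool = False


@dataclass
class User:
    name: str
    affiliations: list = field(default_factory=list)


# Hypothetical entitlement table: institution -> publishers it licenses.
LICENSES = {
    "University A": {"Publisher X"},
    "Laboratory B": {"Publisher Y"},
}

# Per-publisher access counts, so usage off the publisher's own
# platform can still be reported back to the publisher.
usage = Counter()


def resolve_access(user, article):
    """Return the source of the user's entitlement, or None if there is none."""
    if article.open_access:
        source = "open access"
    else:
        # Check every affiliation in order; the first one whose
        # institution licenses this publisher supplies the entitlement.
        source = next(
            (inst for inst in user.affiliations
             if article.publisher in LICENSES.get(inst, set())),
            None,
        )
    if source is not None:
        usage[article.publisher] += 1
    return source


researcher = User("example researcher", ["University A", "Laboratory B"])
print(resolve_access(researcher, Article("10.0000/example-1", "Publisher Y")))
```

A real entitlement list would need to work at the level of titles, date ranges, and license terms rather than whole publishers, which is presumably part of why existing knowledge bases are seen as inadequate.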
Disruptive Models
Libraries and publishers alike should be examining the possibility of disruptive models as well. Models that disintermediate the publisher’s version of record may be on the horizon.
In monitoring the Chan Zuckerberg Initiative, which purchased Meta and has invested in bioRxiv, there is reason to wonder about its plans. Joe Esposito has written that “Meta did not set out to disrupt publishers, nor is it interested in competing with what publishers do,” but what approach its new owners may take remains to be seen. Is there a road to build a community platform that avoids publisher engagement altogether?
In acquiring SSRN and bepress, Elsevier is also developing the opportunity to make an end run around publishers with which it competes. Through these services, it now has the ability to substitute the preprint for other publishers’ version of record in providing seamless access to an increasing share of the published literature. It could over time deploy this ability in some very interesting ways relative to other publishers.
ResearchGate has tried to serve as a disrupter in this space, providing one-stop discovery and access, but its model has not been compatible with publisher interests. As publishers look to put an end to ResearchGate’s violations, the split between Elsevier and Springer Nature in terms of tactics has been fascinating. Just as Mendeley’s model became more publisher-friendly after its acquisition by Elsevier, is it possible for ResearchGate to join the community somehow as well? Or will it continue to act as a disrupter?
Google Scholar has actually moved in the other direction. Recognizing what a very high share of discovery it accounts for, Scholar has created Campus-Activated Subscriber Access to help reduce barriers to off-site access for its users that have licensed access to content. This will help to further secure Scholar’s dominance as a discovery starting point, potentially reifying the archipelago dynamic.
The shift from archipelagos to supercontinent is not without potential downsides. Any service in this space can be expected to dramatically reduce the usage of publisher platforms (even though it might increase the aggregate use of publisher content). This raises real questions for Atypon, HighWire, and Silverchair, among others. For publishers, though, this could work well, allowing them to reduce over time the investment in their own platforms.
But at what cost? There have been substantial behind-the-scenes tussles associated with the business models for discovery services. Should publishers pay to have their content included? Will they be paid for contributing their content? With respect to the library discovery services, a truce appears to have emerged in which there typically is no direct exchange of money in return for content inclusion. But, much depends on the comparative importance of the discovery service to the publisher and the content to the discovery service, and this model can shift over time. Will libraries pay to provide these platforms for their affiliates, as they do now for discovery services? Or will these services be monetized through advertising or analytics? The business models for these emerging platforms vary widely.
The key for publishers is to ensure that the community platforms support the version of record and, while it persists, the library licensing model. Risks will come about from models that do not support a version of record. Models that link to preprints should be seen as worrisome to the publishers, since they reduce usage of the version of record, and in turn are beneficial to libraries that would like to disrupt the current system.
In that sense, publishers of all stripes may wish to think about how to shape this emergent platform environment. Having several continents is less worrisome than having one supercontinent, it can be argued, since a single supercontinent would produce unmanageable dependence for publishers. So, if Google or another disrupter were to take a more aggressive posture, it might motivate other publishers to take shelter with Elsevier or others in the community, but then it may well be too late to shape this environment.
Libraries should also be pushing hard to shape this environment in ways that are right for their institutions and their users. Where is the seamless library-community-controlled platform for discovery and access? Does SHARE have this in its ambit? Does OCLC? The time to enable not only seamless but also strategically sound discovery and access is yesterday.
It is also notable that several of the version of record models are being offered by companies with a publishing interest of their own. By taking a more active role in discovery and access for their publishing competitors, Elsevier and Holtzbrinck (through Digital Science) are placing themselves in a highly influential position at key chokepoints. While their products in this space may be entirely publisher-neutral today, it will be important for other publishers to ensure that their agreements with them contain no hidden switches that can be thrown down the road to enable preferential treatment, as well as to understand how current awareness fits in with other aspects of the workflow empires that both of these companies are cobbling together. And, although in terms of current ownership there are some key differences between Digital Science and Clarivate, it is important to recognize that the long-term ownership of both is uncertain.
Will a supercontinent emerge for discovery and access? Time will tell. Meanwhile, users have long since grown impatient with the wait.
I thank Todd Toler of Wiley for organizing a panel at the recent STM US Annual Conference, which he invited me to chair, that motivated me to explore these issues anew. The panel included fantastic contributions from Gaby Appleton, Managing Director of Mendeley (and former Director of Strategy for its parent Elsevier), Yann Mahé, Director Innovation & Business Development, MyScienceWork, Roy Kaufman, Managing Director, Business Development, Copyright Clearance Center, and Rob McGrath, CEO of Holtzbrinck/Digital Science Readcube. I also thank Angela Cochrane, Danielle Cooper, Lisa Janicke Hinchliffe, and Kimberly Lutz, for reviewing a draft.
14 Thoughts on "The Supercontinent of Scholarly Publishing?"
Excellent presentation around the key issue I raised in a recent Scholarly Kitchen, and I write “around the key issue” because it never specifically raises the consideration of competition or antitrust.
Fragmentation is problematic, but less so as a matter of public policy than is monopoly, especially around ideas and their expression. In music, to use the example oft cited and with which I have extensive experience, the competition questions run deep and have led to ongoing consent decrees with the major aggregators of rights. Every aggregator must come to grips with this issue; CCC waited years for DOJ clearance before acting.
Hoping for one player to emerge is problematic under this paradigm. Special care must be taken to ensure competition problems do not sink the boat.
Three basic ways to address any problem: Compulsion, persuasion and chance.
The first is likely government-based, compelling (as in statutory licensing) a solution, but let’s be clear: Such a compelled solution would lead to a sovereign solution in over 150 countries, and perhaps still more fragmented than before. Besides, replacing markets with government carries its own still more difficult weight.
The best approach is persuasion, a hope that the parties come together as a competitive coalition, preserving — even making — a market for the expression of scholarly works. This strikes me as the best hope and it is likely a “Spotify,” for want of a better word, that both competes and aggregates.
The last approach, leaving it to chance or hope, is where we are at and it is not working. Fitzgerald had this one pegged: Boats against the current, borne back ceaselessly into the past.
Roger – I would like to correct an error above. The Chan Zuckerberg Initiative (CZI) has not “invested in bioRxiv”. bioRxiv is a non-profit initiative of Cold Spring Harbor Laboratory and receives grant funding from CZI. This is an important difference.
Richard, Thank you for the clarification. I know that CZI is not a traditional foundation, so I tried to leave the language open-ended, but I certainly didn’t mean to leave the impression that CZI has taken an ownership stake. Is the grant a traditional grant as one might receive from a more traditional philanthropy or funding agency? -Roger
Yes – much the same as the many other grants that a big lab like CSHL naturally receives.
The quote, in full, from The Great Gatsby is:
“So WE beat on, boats against the current, borne back ceaselessly into the past.”*
which fits well with Wordsworth:
We will grieve not, rather find strength in what remains behind.*
As you so well note, there is a striking parallel between academic publishers and the music industry, as Maxwell’s model (creating more journals, bundles, etc.) starts to unravel and articles, like music, get “ripped” from journals.
If one follows the literature, there is a sense that the first business to create a new platform will dominate (think Amazon, Google, Facebook), and that is the fight that Roger so well articulates.
*a little liberal studies never hurts even in the STEM area
Given that Amazon was not the first online store, Google was not the first search engine, and Facebook not the first social network, there’s more to it than being first (although that helps).
Thanks for the correction; a problem with a little literary latitude taken in a community focused on STEM materials.
The issue at hand, here, was well articulated in a previous posting on the question of whether there is author slicing and dicing to increase journal articles and the proliferation of these journals in encouragement of, or benefiting from this proclivity. Add the interest of users focusing on finding increasingly scattered materials and the increasing potential of AI and one starts to see increasing similarities to the music industry so well articulated in Roger’s well crafted analysis and Jim’s note.
This will, IMHO, flow to the social sciences. As for how this will affect the smaller publishers still tied to the journal model, I refer to the Gatsby quote or perhaps a trenchant quote from T. S. Eliot’s “Little Gidding”.
Roger – In this posting your analysis of the researcher experience is framed in terms of “discovery and access”. But, as you know, the entire research life cycle is much more extensive. How much of the life cycle will the platform/continent eventually cover?
I absolutely think that’s a key question. As you know, I’ve gotten into broader workflow strategy in other pieces, but the component around current awareness (rather than “search”) is a really interesting chokepoint. This makes the steady integration of discovery and access into broader end-to-end workflow offerings especially important to track. Right now, I think we can see this most clearly developed by Elsevier, which is turning Mendeley into something of a dashboard for many components of the research workflow.
There are some good ideas in this article, but one that is conspicuously absent is a movement toward standardized, open metadata. If all publishers adhered to common metadata standards, and allowed metadata to be indexed and harvested by search engines, discovery would be greatly enhanced and the publishers’ platforms would be largely irrelevant to researchers. Authentication would still be an issue, but it’s one that can be addressed, again preferably by common and open standards. In sum, for publishers: Put more resources toward getting information to your potential readers and fewer toward locking people out of your systems. Librarians can help with this – we’re experts in metadata and we’ve been trying to get information in our users’ hands for centuries. It’s no wonder that researchers are looking to Sci-Hub when the legal alternative looks a lot like a closed-stack, subscription library from times before I was born.
Spot on, Ross.
This is problematic when publishers have put pressure on the managers of journals or categories of journals which might see some more interdisciplinary journals stray into those managed by others. Again, the music industry has understood this. Users want the article or the “track” and not necessarily the journal or the album. Metadata, including abstracts and keywords, may be a short-term solution or part of a key when entire caches of materials can be scanned down to the footnotes in an article.
Care must be taken to determine what is coming down the pike. Right now most of the solutions, well discussed by Roger, are trying to deal with what is/was and how we, as biocomputers, have had to deal with this.
There needs to be a gateway for research to enter into the distribution matrix, first for vetting and then for access by others. The journal has been that entry point. Hence we have seen a proliferation of “tunnels”, mostly controlled by publishers, through the wall, like the tunnels under borders, today. And we have seen that, once through the borders, distributors, both legitimate and the SciHubs are making this accessible.
If there is “metadata” attached to research materials, we now have the equivalent of common borders, and third parties become options for movement of these materials into circulation. There are organizations today that carry equal or greater credibility which could validate research. It does create some problems for those who rely on journal validation rather than the research itself. But, that should be easily defeased.
Excellent post, Roger. In this post and your past writings you’ve done a great job of illuminating many of the challenges facing our industry.
However, as one of the Co-Chairs of the RA21 initiative I would like to better understand your characterization of RA21 as a “band-aid solution”. I looked back at your own article, “Meeting Researchers Where They Start,” which you referenced in this post. In that article you discussed the authentication and authorization challenges facing researchers as they try to navigate our ecosystem. In particular, in that article (from March 2015) you made several statements that now seem almost prescient when compared with RA21:
“Finally, it is time for a major commitment from the scholarly information ecosystem of libraries, publishers, university IT, and intermediaries…”
RA21 has active participation from more than 60 organizations, including libraries, publishers, university IT, intermediaries, identity federation operators, software providers, etc.
“…perhaps under the auspices of NFAIS, NISO, the Shibboleth Consortium, or another not-for-profit organization…”
RA21 exists under the auspices of NISO and STM.
“…provide authentication via a researcher’s institutional credentials…”
RA21 leverages a researcher’s institutional credentials.
Where RA21 has stopped short of your call from 2015 is to “develop a single user account for all scholarly e-resources.” Creating a single identity provider for all researchers throughout the globe would be a very substantial undertaking, especially given the emergence of privacy regulations like GDPR. Expanding RA21’s scope to include providing identities to researchers would not be incremental, it would be exponential. The RA21 team believes that we can address most, and perhaps all, of the authentication/authorization improvements you’ve sought without taking that step. But if we’re wrong, nothing that we’re pursuing with RA21 would preclude that expansion in the future.
Finally, I do not wish to oversell RA21 as the solution for all of the challenges you describe in this post. It is not “the” solution, but I firmly believe it is an important component.
Thanks Ralph. I have the deepest respect for the labor that is being put into trying to find a solution through RA21, and you may be right that one day it will grow into the solution that I believe we need, but today it is not that and has made no commitment to become that. In the piece you excerpted from, the full quote is this:
“Finally, it is time for a major commitment from the scholarly information ecosystem of libraries, publishers, university IT, and intermediaries, perhaps under the auspices of NFAIS, NISO, the Shibboleth Consortium, or another not-for-profit organization, to develop a single user account for all scholarly e-resources. This account would not only provide authentication via a researcher’s institutional credentials but also would be the vehicle through which a variety of additional data-driven services could be provided on an opt-in basis. The account itself as well as the data it contains would be under the control of the researcher, and it would therefore travel with the researcher when changing institutional affiliations.”
And in a later piece specifically on RA21, https://scholarlykitchen.sspnet.org/2018/01/22/identity-everything/ , I made clear my concerns that RA21 will instantiate winners and losers in the data and analytics game, rather than creating the level playing field that I hope to see develop. RA21 is a great model for the “majors” that I have profiled in today’s piece. It is not a great model for the publishing community overall, let alone for the research sector broadly.
I am sure that RA21 will go a great way to providing for increased security and anti-piracy efforts, and I am confident that it will make at least important incremental improvements in the user experience as well. But it is my view that it is impossible to hermetically seal off the larger strategic context here and focus only on anti-piracy and user experience.
Ralph’s post and Ross’ with my response seem to point out that moving towards alternative paths for vetting materials and accessing them could offer a potential threat to publishers in their current embodiment, but more than likely to journals as they now exist. Thus, you are correct, the larger publishers pushing for their platforms may prove, like Amazon et al., to dominate. It does present the possibility of a different business model where even select third parties may offer both wider gateways for research flow in and more relevant pathways for knowledge extraction.