The scholarly publishing sector has struggled to address the problems that users face in their discovery-to-access workflow and thereby stave off skyrocketing piracy. The top-line impact of these struggles is becoming clearer, starting with Elsevier’s absence from Germany. This makes the efforts to establish seamless single-platform access to all scholarly publications (equal in extent to Sci-Hub but legitimate, and which I term a Supercontinent of Scholarly Publishing) all the more urgent. The technical solutions are challenging, and at the STM meeting in Frankfurt last week it became clear that, although progress is being made, policy, governance, and competition issues may complicate the drive to consensus. As the major publishers pursue their respective strategic objectives, do their interests align sufficiently to make a radical break on content distribution and syndicate their content to all manner of access platforms?
Distributed Usage Logging
Publishers already distribute content to other platforms, such as aggregators, but these arrangements are fairly labor-intensive and are generally undertaken in exchange for a fee. An alternative, composed of two critical components, is in development today. First, it requires an ability to authorize appropriate access in a decentralized distribution environment. A Shared Entitlements System, as it is sometimes called, would be a kind of common authorization service for all publishers. As I will discuss below, there are at least two options for how Entitlements can be addressed. Second, it requires Distributed Usage Logging, which is to say the ability for all usage, wherever it takes place, to be “counted” in measuring the value of articles on behalf of authors and licenses on behalf of publishers.
Crossref and a group of early adopters last week announced that they are launching a Distributed Usage Logging (DUL) system. Elsevier is participating as both a platform (through Mendeley) and a publisher, Digital Science as a platform on behalf of several of its services, and Atypon on behalf of its publisher customers. This is a critical step in enabling the content syndication necessary to create the Supercontinent and other alternatives to traditional publisher content access platforms.
As I considered these issues, I spoke with Chris Shillum, Vice President Identity and Platform Strategy at Elsevier, and Rob McGrath, CEO of Digital Science’s ReadCube, to learn about some of the approaches that the group is taking to DUL. DUL will mostly rely on direct platform-to-publisher communication, with an intentionally lightweight approach to central infrastructure: content is matched to publishers using Crossref’s lookup service.
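To make the platform-to-publisher pattern concrete, here is a minimal sketch in Python of how a platform might group usage events by publisher before sending them on. It is an illustration under stated assumptions, not the DUL specification: the `UsageEvent` fields, the `PREFIX_TO_ENDPOINT` table, and the endpoint URLs are all hypothetical, and the DOI-prefix table merely stands in for the lookup service that Crossref would provide.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical table standing in for a central lookup service.
# It maps a DOI prefix to the endpoint where that publisher receives usage.
PREFIX_TO_ENDPOINT = {
    "10.1000": "https://publisher-a.example/usage",
    "10.2000": "https://publisher-b.example/usage",
}


@dataclass(frozen=True)
class UsageEvent:
    """One full-text use recorded on a third-party platform (assumed fields)."""
    doi: str
    platform: str
    timestamp: str  # ISO 8601 string


def doi_prefix(doi: str) -> str:
    """Return the registrant prefix, i.e. the part of the DOI before the first slash."""
    return doi.split("/", 1)[0]


def route_events(events):
    """Group events by publisher endpoint; collect events with no known publisher."""
    outbox = defaultdict(list)
    unmatched = []
    for event in events:
        endpoint = PREFIX_TO_ENDPOINT.get(doi_prefix(event.doi))
        if endpoint is None:
            unmatched.append(event)
        else:
            outbox[endpoint].append(event)
    return dict(outbox), unmatched
```

Calling `route_events` on a batch of events returns one list per publisher endpoint plus a list of events whose DOI prefix is not in the table. The only shared component in this sketch is the lookup table, which is the sense in which the central infrastructure stays lightweight.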
There are several matters yet to be worked out. While the COUNTER code of practice will serve as the baseline standard, some reporting metric definitions appear to be somewhat ambiguous or subject to multiple interpretations, so some definitions may require revision. The Crossref board has not yet approved DUL as a production service, so issues of fee structure and participation requirements are not yet settled. When I asked Shillum whether Sci-Hub would be permitted to send usage data to publishers through DUL, he countered my provocation by saying that, while “rules of participation” haven’t been established, DUL “is intended to encourage good behavior.”
There are two primary beneficiaries of DUL, authors and publishers. Authors of articles whose publishers are participating should expect to see usage-based alt-metrics scores rise, which will put competitive pressure on all publishers to participate. And participating publishers will see their usage numbers rise. Elsevier estimated that its ScienceDirect usage could rise 5-6% as a result of DUL capturing usage from its Mendeley and SSRN platforms alone — let alone other platforms (slide 10 here). This in turn may, depending on how COUNTER reports are interpreted, drive down the “cost per use” metric that is simplistically but regularly used to estimate the value of library license agreements. Libraries and publishers alike should prepare themselves to negotiate bearing in mind this change.
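The effect on “cost per use” is simple arithmetic, sketched below in Python. The license cost and usage figures in the usage note are hypothetical, chosen only for illustration, and the sketch assumes that the license fee stays fixed and that all DUL-reported usage is counted in the denominator, which, as noted above, depends on how COUNTER reports are interpreted.

```python
def cost_per_use(annual_cost: float, uses: int) -> float:
    """Cost per use: the annual license cost divided by the number of counted uses."""
    if uses <= 0:
        raise ValueError("uses must be positive")
    return annual_cost / uses


def cost_per_use_change(usage_increase: float) -> float:
    """Fractional change in cost per use when usage grows and cost is unchanged.

    usage_increase is a fraction, e.g. 0.05 for a 5% rise in counted usage.
    """
    return 1.0 / (1.0 + usage_increase) - 1.0
```

With a hypothetical 100,000 license and 50,000 counted uses, `cost_per_use(100_000, 50_000)` is 2.00. A 5% rise in counted usage lowers the metric by about 4.8% (`cost_per_use_change(0.05)`), and a 6% rise lowers it by about 5.7%, even though nothing about the license or the underlying reading behavior has changed.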
One use case that DUL does not address is the authors’ need to have a dashboard of all their articles, with impact and usage metrics, conveniently accessible from a single interface. A reverse DUL, bringing article-level data back from publishers to scholarly collaboration networks, would be an interesting next step to contemplate.
Broadly, DUL as it is being implemented seems to be in every publisher’s interests and should become the norm rapidly. From discussions with leaders at Springer Nature and Wiley, I am confident that the underlying DUL approach has broad support and will be widely adopted by other large publishers without delay.
Shared Entitlements System
But DUL alone does not improve content access. The other component of content syndication, the Shared Entitlements System (SES), is taking longer to develop. While everyone with whom I have spoken from the publisher community is optimistic that they can make progress on SES, there are some underlying strategic dilemmas here that leave me somewhat less confident that full content syndication to access platforms — the Supercontinent — can be achieved soon.
The vision for SES is to enable an access platform to determine whether any given user is authorized to access the content item they are requesting. This requires a service to determine what, if any, institutional affiliations a given user may have and then a mechanism to determine what licensed “entitlements” each institution should afford the user. This is made more complicated by the fact that many users have multiple institutional affiliations, which is often overlooked to the user’s disadvantage.
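The two-step check described above can be sketched in Python as follows. This is a minimal illustration, not a proposed design: the `AFFILIATIONS` and `ENTITLEMENTS` tables, the identifiers in them, and the function names are all hypothetical stand-ins for whatever affiliation and licensing services an actual SES would consult.

```python
# Hypothetical data: which institutions each user belongs to.
AFFILIATIONS = {
    "user-1": {"univ-a", "hospital-b"},
    "user-2": {"univ-a"},
}

# Hypothetical data: which content items each institution has licensed.
ENTITLEMENTS = {
    "univ-a": {"10.1000/journal.1"},
    "hospital-b": {"10.2000/journal.9"},
}


def entitling_institutions(user_id: str, content_id: str) -> list:
    """Return every affiliation of the user that licenses the content, sorted by name."""
    institutions = AFFILIATIONS.get(user_id, set())
    return sorted(
        inst for inst in institutions
        if content_id in ENTITLEMENTS.get(inst, set())
    )


def is_entitled(user_id: str, content_id: str) -> bool:
    """A user is entitled if at least one of their affiliations licenses the content."""
    return bool(entitling_institutions(user_id, content_id))
```

In this sketch, `user-1` can reach `10.2000/journal.9` only through the hospital affiliation. A system that checked a single "primary" institution would wrongly deny access, which is the disadvantage to multiply affiliated users noted above.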
There are two basic models for how SES could be implemented.
- Upon being “approved” as entitled, a user could be routed to the publisher site for seamless article access. In this scenario, SES is little more than an improved linking experience. From my perspective, improved linking is the less ambitious option, and likely to meet a far smaller share of user needs.
- Or, instead, the platform could provide access directly on its site and through DUL ensure that appropriate “credit” is provided. This option would require some kind of mechanism for content distribution to platform partners (content syndication), which in turn requires publishers to place a much higher degree of trust in the platform provider. Even more so, content syndication requires that publishers abandon the idea that they can capture all the value on their own sites and instead radically improve the distribution system for their publications. For users, content syndication is the more valuable approach, because it can ultimately provide for them most of the benefits of having a single user account for all scholarship.
A year ago at STM, Elsevier’s Wouter Haak presented a roadmap for addressing access issues that included an Entitlement API that was to be developed in 2018 (slide 8 here). Behind the scenes, an STM working group was able to make a good deal of progress, following the model that article access would be fulfilled from the publisher site. Since then, progress appears to have slowed, as the publishers and platforms focused on DUL. DUL is essential to the more valuable, more ambitious, Supercontinent approach.
At the STM meeting last week, it appeared that at least some SES conversations were renewing. In his keynote at the STM meeting in Frankfurt, Springer Nature CEO Daniel Ropers made clear that he thought developing a shared entitlement system is well overdue. Wiley CEO Brian Napack said that Wiley is working on a shared entitlements system. The extent to which these initiatives are coordinated across publishers is not clear.
In response to my query about Elsevier’s role in entitlements, Gaby Appleton, Managing Director, Elsevier Research Products, emphasized the major investments that have been made in the past year on DUL, RA21, and Version of Record tagging. “The STM reference group on sharing started the discussion on entitlement standardization last year. We’re very much involved in this and support its direction…Now that DUL is pretty much finished from an engineering point of view, we can turn our attention back to the reference group discussion on entitlement and work out what needs to be done. We absolutely do want to support this.”
And Todd Toler, VP Digital Product Management at Wiley, told me, “The houses need to work towards a standardized approach to an entitlements API, because it’s long overdue and the only route to extend the seamless access promise of RA21 to a wide range of discovery options and not just our own platforms.”
At a fundamental vision level, there seems to be broad alignment among the publishers that it is time to address entitlements. But it is far less clear to me whether there is alignment on whether to pursue improved linking alone or enable content syndication as well. No one would speak with me on the record about the specific entitlement approaches that their companies support.
Content syndication represents not only an important opportunity but also potentially a major risk for any given publisher. And the ways to control those risks are not just at the technical working group level, but in terms of policy and governance. For example, publishers have a major strategic interest in being able to control which platforms can become the distribution agents for their syndicated content and under what terms. Will payments be exchanged in either direction (for example, in those cases where the publisher can be expected to lose out on advertising revenue)? Will there be limits on what services the access platforms can offer? Or requirements on data sharing that go deeper than COUNTER-compliant DUL reporting? If bilateral agreements are needed between each publisher and each platform, that will minimize the benefits of SES, but some publishers might nevertheless favor bilateral agreements that allow them to maximize the value they can generate from the access platforms. If not through bilateral agreements, how will distribution rights be awarded and through what kind of common governance process? In the environment that could emerge, not only for subscription content through entitlements but also for the growing share of open access content, will any legitimate business be able to serve as a content access platform, allowing for true platform competition? If publishers are to move to content syndication, there will be important policy and governance matters to address.
But content syndication has never been the inevitable approach to entitlements. When the Springer Nature CEO spoke last week, he also intriguingly invoked the major split about how to engage or prosecute ResearchGate, explaining that it was a consequence of the fact that in the past major publishers “weren’t clear on the end-game for entitlements.” Are they clear now?
The End Game
Scholarly publishing leaders are clear that one benefit of this new infrastructure is to enable the sector to “build new services on a competitive basis,” as Elsevier’s Shillum explained to me. Depending on how SES is implemented — if it enables true content syndication onto access platforms with DUL back to publishers, and not just improved links to publisher sites — it will have the effect of dramatically reducing the barriers to entry of creating a legitimate multi-publisher content access platform.
In this arena, the key strategic question is: will we see syndicated SES or just improved linking? Under syndicated SES, we may expect the competition around creating access platforms to grow and the business models that can be associated with them to change dramatically. This could be very good indeed for publishers.
Nonetheless, a major publisher would see that there are dilemmas associated with enabling the creation of these new businesses. For one thing, there could be seen to be an opportunity to turn every publisher site into a Supercontinent, so I asked Henning Schoenenberger, Springer Nature’s Director of Product Data and Metadata, whether publishers will eventually distribute competitors’ content through their own sites, i.e., turning SpringerLink into a multi-publisher content access platform. He said, “Springer Nature can’t speak for other publishers but fully intends to continue to invest in its own platforms (ie SpringerLink and Nature.com) and will continue to develop services and tools that will add value to the specific content that we publish.”
I wondered about some additional risks. I asked Schoenenberger if he was concerned about giving an analytics advantage to platform providers who will have more granular usage data than publishers. He said, “You can say that there is some risk that we open up some advantage to a competitor but this is an industry initiative to facilitate access and transparency in research. That balance is well done.” From conversations with other publishers as well, I am convinced that there is broad agreement that an appropriate balance can be struck.
At the same time, the split on ResearchGate that Ropers invoked, and which continues today, has me puzzling over what would constitute a level playing field for content syndication. After all, the platforms with the greatest comparative advantage from the outset are probably those with the greatest starting point traffic, which almost certainly positions Google Scholar and ResearchGate quite favorably. While much has certainly changed since 2014, at the time it was clear that Google Scholar and ResearchGate were the most powerful platforms in the scientific publishing sector. When STM looked into the matter, it found that by 2017 at least by one measure ResearchGate was well ahead of other scholarly collaboration networks (slide 5 here). ResearchGate itself has been estimated to have greater traffic than any publisher site. Might we expect to see a scenario in which Google Scholar or ResearchGate capitalizes on its position by turning into a syndicated access platform?
Thus, we come to the question of which companies benefit from enabling what forms of SES. This in turn causes me to raise questions about Elsevier, the one company that is both a publisher and potentially an access platform for others, assuming Springer Nature continues to operate independently from its half sibling Digital Science, even after its failed IPO.
Elsevier as a publisher surely benefits from having its content distributed more broadly and therefore access and usage expanded. But Elsevier increasingly sees itself as “a global information analytics business.” To fulfill this role, Elsevier as a platform company will surely hope to gain access to far more usage data through Mendeley, Scopus, SSRN, and bepress, than it loses to third party platforms. Put another way: if ResearchGate originates as much traffic as it appears to do, would Elsevier lose access to substantially more usage data from SES syndication than it stands to gain? And if so, does this threaten in any way its interests in becoming an analytics provider? The company’s strategists are surely grappling with these tradeoffs.
Ultimately, leading scholarly publishers appear to be acknowledging the reality they face: the content they publish has begun to be decoupled from the distribution systems they control. In order to stop users from moving to pirate sites, publishers may need to abandon the idea that they can capture all the value on their own sites and instead radically improve the distribution system for their publications. Through SES syndication, an array of access platforms can enable content usage that is limited to licensed user communities, and through DUL that activity can redound to the benefit of the authors and publishers responsible for producing and distributing it. SES syndication and DUL are the necessary ingredients to creating the true “supercontinents of scholarly publishing,” as I have called them, one-stop locations where all scholarship can be discovered and accessed seamlessly. Is content syndication the end game?
I am grateful for the discussions that took place at STM/Frankfurt and a series of subsequent interviews and email exchanges that I was able to have on these topics. I thank Gaby Appleton, Lisa Janicke Hinchliffe, Rob McGrath, Tom Reller, Henning Schoenenberger, Chris Shillum, Todd Toler, David Tucker, Susie Winter, David Worlock, and others, for their help.
36 Thoughts on "Will Publishers Syndicate Their Content?"
Before worrying about supercontinents and so forth, I do wish Elsevier would provide publishers with usage data from Scopus.
Elsevier receives access to all our content free of charge, for indexing in Scopus: we are happy to provide that service because we know that many students and researchers worldwide rely on Scopus as the starting point for their research. Unfortunately Elsevier refuse to provide us with usage information to show the value of our investment in their service. While granular statistics at the institution level may not be feasible, even a top-line indication of the usage our content is receiving on such a major platform would be valuable.
You might say we should subscribe to receive that information: in March 2018 I was quoted more than £9,200 per year for a Scopus subscription. I appreciate that Scopus is a commercial product but this amount is excessive and unaffordable for a small not-for-profit publisher.
Don’t even get me started on the lack of accuracy in their indexing…
It’s the same argument we get from researchers – we provide content for free and then have to pay to see the results. Roger notes: “Publishers already distribute content to other platforms, such as aggregators, but these are fairly labor-intensive and are generally undertaken in exchange for a fee.” In my experience, it’s the aggregators who charge the fee and the publishers get nothing. The aggregators then sell the product to the institutions and make a profit in that way. EBSCO and ProQuest have been doing this for years. Given the ubiquity of Google Scholar, who needs any of these services?
Are you suggesting that EBSCO and ProQuest don’t pay publishers for the inclusion of journals in their packages? My experience has been that those aggregators are valued sources of revenue for the publishers with whom they work.
Both EBSCO and ProQuest maintain large teams of publisher relations staff that have negotiated with publishers for content for years. Anyone who thinks that publishers give their content for free is very mistaken. Both companies provide significant revenue to publishers. I am always amazed at how much misinformation about our industry is commonplace.
I’ve worked for three society publishers and none of those has managed to extract any compensation from either EBSCO or ProQuest. Would you care to share your secret?
Tasha, I hear what you are saying about wanting the first system to work well for you before considering whether to participate in the vision for the future! If I understand what you are looking for correctly, the basic usage data sharing is exactly the use case for Distributed Usage Logging. I am not certain whether or when Scopus will participate, given that the current COUNTER 5 statistics are focused on article sharing and usage, which I don’t believe are currently possible through Scopus. That said, if and when syndication begins, that will be a different story altogether. Given that the standards being developed here are being driven by the largest publishers, scholarly societies and other small and medium size publishers may want to dig in to the specifics to see if they are in their interests as well. – Roger
In the interests of transparency: I was on the DUL working group for a short period, before realising that there was no feasible way for us to properly engage with them short of hiring a developer. I love the idea of DUL, but am struggling to see why a lightweight alternative could not have been put in place by aggregators years ago. PubMed Central manage it, as do platforms like ScienceOpen. – Tasha
Tasha – I am not sure what “usage” means in the context of Scopus. Users typically use search and discovery tools like Scopus to find articles of interest, and then link to the publisher site via the included full text links to “use” the content. This usage in turn can be measured on the publisher’s platform itself, and classified by source by looking at Referring URLs. Maybe I am missing a use case?
Simple abstract views would be a great start. Number of times content appears in search results would also be useful. At present, we can’t tell if our content actually shows up / gets used on Scopus at all – but we have to assume that it does, or why would you want it? – Tasha
>> At present, we can’t tell if our content actually shows up / gets used on Scopus at all
Are you looking at the inbound traffic in your web analytics?
At the moment DUL is focused on interchanging data about full text usage, but from a technical point of view, there is no reason why the same mechanism couldn’t also be extended to look at abstract views.
We do track click throughs from Scopus, of course, but are under no illusions that every user who views a record of our content in the Scopus database will click through to read the full text. We want to know the number of record views so that we can then extrapolate the click through rate. This does not need to be through DUL. – Tasha
On a much smaller scale, and in the humanities instead of the sciences, university presses like ours at Penn State quickly realized that it made no sense to supply our journal content through our own web site, and that we had to work with an aggregator like Project Muse to survive at all in the new e-publishing space. Could not Muse become the supercontinent for the humanities? (I won’t even begin to ask how monographs fit into this vision.)
This already sounds too complicated to be implemented, but are there not some missing pieces to the puzzle?
Could publishers deliver a ‘supercontinent’ without also agreeing on terms of service, and the definition of networks (e.g., whether a license can be granted to a department or has to be to the whole institution)? Not to mention off-site access and consortia deals and all the other arcana of current systems.
And would not this supercontinent also throw into relief some very awkward questions about differential pricing from the same publisher to different institutions?
SciHub and Plan S are creating very messy problems, but this cure might be as bad as the disease.
Adam, The idea behind the Supercontinent is that the publishers continue to sell to organizational customers, according to what terms they can mutually agree upon. The Supercontinent provider, through user-level information about entitlement, makes the content available. You may be correct that developing a standard to make this possible would be impossible. We will see!
How is this suggestion different from a centralized research search engine that’s connected with RA21, which gives the entitlement information (e.g. Kopernio)? Or are you instead suggesting one central hub that hosts all the content? What is the advantage of the latter over the former? Wouldn’t the latter just drive monopoly lock-in, and essentially put publishers in the same place that Michael Clarke recently suggested that research societies are in (https://scholarlykitchen.sspnet.org/2018/10/04/navigating-the-big-deal/)?
David, the “syndication” model can yield what you call “one central hub that hosts all the content.” Of course it can also be used to create other “slices” of the landscape, such as those focused on research for specific fields, disciplines, world regions, etc. The advantage of putting all the content on one platform — which a given user can then have as their default starting point and research platform — include true seamlessness from a user perspective, as well as the development of a central set of usage data that can be built into services valuable to that user. I agree that there can be risks and tradeoffs as well.
Which seems like more consolidation of both power and profits. It’s fascinating how the digital world seems to lock things down into one property — one Google, one Facebook, etc. I can’t see scholarly communications being all that different from every other field, so I would doubt there would be lots of slices rather than one go-to resource. Personally, I don’t think this is a good thing for anyone other than that one winner.
Essentially you’re asking for publishers to turn their business over to a competitor, to give up all advertising revenue and any connection to or data about their users, and morph into the sales wing of a bigger publisher/platform owner. You’d lose all your branding, and as noted above, you’d essentially be locking yourself into being a part of someone else’s product, with the continual reduction in value of your offerings that creates.
“I would doubt there would be lots of slices rather than one go-to resource. Personally, I don’t think this is a good thing for anyone other than that one winner.”
Perhaps so. As with the consumer internet, it turns out that what is in the interests of the users is often not in the interests of the information businesses that were heretofore most important.
I would rephrase that as “it turns out that what is in the interests of the new information businesses is often not in the interests of the users, nor of the information businesses that were heretofore most important.”
Just to make sure that it is perfectly clear, I view the “Supercontinent” approach as decidedly second best to the “Single User Account” that I proposed several years ago (https://scholarlykitchen.sspnet.org/2015/07/29/a-single-user-account/). The Single User Account would have enabled seamless login across all publisher sites and allowed the user to control their own data. But, several of the major publishers were at the time focused on efforts to control detailed usage data for their own strategic purposes, so instead they turned to RA21, looking for a model to retain their platform dominance, such as it was.
Now, they are faced with an even more difficult dilemma. Yes, RA21 may improve off-site access, but that is only a piece of the puzzle. And some publishers now understand that “legitimate” avenues of access are far more broken than RA21 alone can address. If content providers cannot come together to create a neutrally provided “Single User Account,” they will be forced to come up with other means for addressing user needs if they want to forestall pirate and other supposedly illegitimate platforms. And hence instead of a user-controlled account, which would have levelled the playing field, we may instead end up with one or a small number of Supercontinent sites which, like iTunes at first and then Spotify, serve at least for a time as a “master switch” for an entire sector.
It would also be highly revealing to each company which institutions subscribe to the competing content it seems? Which isn’t really a secret right now but this would centralize the information rather than having to visit library websites. Personally, it would be really amazing to see this as a librarian/library but I fear this info would be available to the publishers (sellers) and not the buyers (libraries).
Lisa, I’ve been puzzling over why an entitlements system would have to reveal which institutions subscribe to a competitor’s content. I suppose it would reveal this to platform providers, but presumably limits could be imposed on how this API could be used and to what purposes?
“True seamlessness from a user perspective” does not appear possible unless the content that is searched is artificially limited to what the user is entitled to (which I would note is what many libraries attempt to do with their discovery layers) or unless users have entitlements to all content (open access). It looks to me that in this syndicated model, the user may get (I say may because I have yet to see that RA21 gives the promised seamless access and this is far more complicated and complex) seamless access to what they are entitled to but s/he will hit an absolute roadblock (or, more likely, an offer to purchase?) when s/he is not entitled. That is worse than the library discovery service, where we can offer up open access (Green OA) options or interlibrary loan/document delivery. This may streamline subscriptions but I am skeptical that it battles down the threats from SciHub, ResearchGate, etc.
Lisa, I’m sure the syndicated Supercontinent does not fix every problem for every user. No doubt about it. I also suppose one could run Kopernio etc. on top of the Supercontinent, leaving users no worse off than they currently find themselves in terms of access, but with tremendous benefits in terms of discovery, personalization, recommendations, etc. As long as enough researchers are kept inside the “legitimate” ecosystem, publishers can be content. By the way, I cannot recall seeing Green OA well integrated into a library discovery environment. Can you point to a good implementation?
The alternative of course would be to go to a fully gold open system and dispense with the technical complexity and all the cost it adds to the system. With Plan S something like that is going to have to happen sooner rather than later.
David, Even more so in gold open access, the question about what platform will serve as the starting point, and therefore have a chance to monetize usage activities as Lisa suggested the other day (https://scholarlykitchen.sspnet.org/2018/10/11/from-paywall-to-datawall/), seems critical.
Perhaps this could be handled at the API level. And, perhaps it is mostly an issue when a publisher is also a platform provider. Which of course means Elsevier most obviously. But, if Springer Nature/Digital Science comes closer together … that’s another.
CalTech is the one that comes to mind – https://awayofhappening.wordpress.com/2018/04/26/the-future-of-library-access-open-access-linking-and-hybrid-interlibrary-loan/#more-4226 – I recall seeing that some have also included unpaywall in the link resolver but I can’t remember which…
Thanks for your post, Roger. Somewhat along the lines of the DUL but going further, for the past year, the Canadian Association of Learned Journals has been engaged in developing a Readership Analytics Project designed to serve SSH journals and researchers. Our aim was to bring together usage data from the journal’s own website and from various agencies including Erudit, Project Muse, JSTOR, ProQuest and EBSCO and convert the resulting spreadsheet data into useful tables and figures. By facilitating the examination of usage data we reasoned that journals and researchers would benefit from knowing, for instance, which articles attract what level of traffic, from which user locations, via which data source, with what delay following publication, and over what period of time. We were delighted to receive cooperation from the various agencies named. A secondary aim was to provide journals with the compiled (and separable) data rather than create a dependency on larger entities who, of course, have their own interests. We have achieved a certain level of sophistication and would be very open to collaborating with other organizations serving small journals to take the tools we have developed forward.
By the way, re secondary aggregators, many small journals receive no revenue or next to none from substantial usage, say 100,000 hits. No doubt that is not the case for the larger publishers and journals.
Roger, What about https://www.dimensions.ai/? Do you consider it as Supercontinent with 97,717,837 Publications?
I wrote about Dimensions in my Supercontinent piece (https://scholarlykitchen.sspnet.org/2018/05/03/supercontinent-scholarly-publishing/) and, because of the ReadCube integration that makes content available, sure, yes, I do see Dimensions as a key early player in this space.