The largest scholarly publishers today are driven by one major near-term strategic concern — to reduce leakage and thereby bolster the value of the subscription bundle. But while they work belatedly to address this priority through Seamless Access (RA21), GetFTR, and even partnerships with ResearchGate, the savviest of them are keeping their eyes on the true structural transformation that the internet has wrought. We are witnessing the transformation away from a journal-centric model of scholarly publishing towards a researcher-centric model of scholarly communication. Success in this new environment requires engagement with researcher identity, which is a struggle even for most of the largest publishing houses. Who is competing to own researcher identity, and how can other publishers engage with this vital function?
The researcher-centric model of scholarly communication has been emerging gradually and growing steadily. Researcher-centric scholarly communication enables collaboration, supports workflow, and provides personalization.
The evidence of the shift to a researcher-centric approach is everywhere. Discovery of the scholarly literature has moved away from browsing individual journal titles and towards searches, feeds, and alerts, among other mechanisms that are increasingly personalized and that look across the literature as a whole. Open access is driving publishers to focus on author relations, where in the past library relations took greater primacy. Scholars are adopting an array of tools for managing the research enterprise, seeking tools that enable laboratory-based team science and cross-institutional collaborations, including lab notebooks and other types of sharing and collaboration spaces. Universities, competing with one another for research funding, are looking to showcase their scholars and to assess them for their potential to contribute to this competition.
Each of these dynamics, and others like them, may seem to the publisher as requiring a slight adaptation, but collectively they suggest the emergence of a new type of model. This new model provides services based on extensive information about the researcher and the connections among individual researchers.
To be sure, the impact factor and the journal title remain important, more important than some would like. But, let’s acknowledge the reality that the actual version of record has declined in comparative importance. In its place we are seeing a growth in the importance of other kinds of research artifacts, not only preprints but also everything from datasets to protocols. It is possible to imagine these artifacts grouped tidily by existing publishers, titles, and articles — but the reality of the scientific process is that these artifacts are more likely project-based, laboratory-based, or researcher-based, not publisher-based.
At this point, we have not seen a complete transition away from journal-centric publishing. Instead, this is a hybrid period, perhaps one that will ultimately have been transitional, that includes traditional journal-centric publishing alongside these newer approaches that increasingly center on the researcher.
To compete in this emerging arena of researcher-centric tools, platforms, and analytics requires the ability to embrace researchers’ natural workflows and collaborations. And to do that requires that publishers and other providers find ways to center their work, in a technical sense, on the researchers themselves.
The key enabling factor in this transformation is researcher identity. Researcher identity, as I use the term, has two common elements.
First, there is a mechanism for individual researchers to express and ideally control aspects of their identity. This includes possessing some kind of user account, enabling data about the researcher’s interests and practices to be associated with them. The user account can ultimately be associated with a variety of tools and dashboards that are customized to them.
Second, there is the potential to link individual identities together in ways that express a network of one’s professional and scientific connections. This can include co-authors, laboratory members, collaborators elsewhere, others interested in the same research topics, and so forth. This social graph will typically be both inter-institutional and intra-institutional — put another way, it is largely non-institutional and certainly it is not publisher-specific.
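To make the idea concrete, such a social graph can be sketched as a simple adjacency structure keyed by researcher identifiers rather than by institution or publisher. This is an illustrative sketch only; the identifiers below are hypothetical placeholders (a real system would likely key on persistent identifiers such as ORCID iDs).

```python
from collections import defaultdict

class ResearcherGraph:
    """Minimal co-authorship/collaboration graph keyed by researcher
    identifiers, not by institution or publisher."""

    def __init__(self):
        # Map researcher id -> set of (peer id, tie type) pairs.
        self.edges = defaultdict(set)

    def add_collaboration(self, a, b, kind="co-author"):
        # Store the tie symmetrically; 'kind' might also be
        # "lab-member", "shared-topic", and so forth.
        self.edges[a].add((b, kind))
        self.edges[b].add((a, kind))

    def collaborators(self, researcher):
        return {peer for peer, _ in self.edges[researcher]}

# Hypothetical identifiers for illustration only.
graph = ResearcherGraph()
graph.add_collaboration("0000-0001-0000-0001", "0000-0002-0000-0002")
graph.add_collaboration("0000-0001-0000-0001", "0000-0003-0000-0003",
                        kind="lab-member")
print(graph.collaborators("0000-0001-0000-0001"))
```

Note that nothing in this structure is institution- or publisher-specific: ties cross those boundaries by construction, which is exactly the point of the social graph described above.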
Together, the user accounts and social graphs can follow a variety of formats and be subject to a variety of different controls, and many distinct identity instances exist, each with its own format and controls. For example, Google controls an enormous identity instance for all of its consumer Google accounts (enabling the use of Gmail, Google Docs, etc). At the same time, it also allows companies and schools to utilize their own identity instances for their employees and students through GSuite (enabling the use of Gmail, Docs, etc, in an institutionally controlled environment).
Identity instances enable the kind of researcher-centricity discussed above, and they can contain and manage an enormous amount of user data. As a result, the control of identity instances, and how, if at all, they interoperate is a strategic issue. It is possible for identity instances to follow open models and for multiple identity instances to be interoperable. Yet it is often the case that the organization that controls one identity instance feels responsibility for — or value in — maintaining a stronger degree of control.
As we now turn to reviewing some of the identity instances for researcher identity, it will become clear that what might be best for researchers is not the same as what is emerging in the marketplace.
Decentralization Serves Researchers
A decentralized approach to managing researcher identity would be in the best interest of researchers themselves. At its essence, a decentralized approach would provide user accounts controlled by the researchers themselves and made available to a platform provider on an opt-in and as-needed basis. I wrote in 2015 about the benefits of a single user account, providing a researcher identity instance that would be cross-university and cross-publisher, combining authorization as well as personalization. The technical architecture exists to develop this kind of research identity instance, as scholarly communication visionary Herbert Van de Sompel has recognized, but decentralization adds a challenging sociopolitical impediment.
BYU has led some efforts to bring a version of this approach into being for its students through what it has called a personal API. Unfortunately, we have yet to establish the standards and capabilities needed to enable decentralized researcher identity. The parties that would have been most likely to have seen this approach as aligned with their own values and interests — universities and their libraries — do not seem to see a strategic importance in taking leadership for researcher identity.
The Seamless Access (RA21) initiative builds on the institutional nature of identity management, through existing university-controlled platforms. Even though identity is everything, it ensures that a truly researcher-centric alternative will not be developed as part of the current efforts to address piracy.
Instead, identity instances for researchers are being developed in the marketplace. While no single provider covers 100% of scientists and other scholars, and while control of these identity instances remains unsettled, it seems increasingly likely that control will rest with a profit-seeking corporation such as ResearchGate, Elsevier, or Clarivate.
To date, ResearchGate appears to be winning the battle to build a sector-wide identity instance for researchers, regardless of university or publisher, with Academia.edu as its primary competition. Though many have predicted the demise of these academic social networks, their continued growth cannot be dismissed.
ResearchGate reports having 15 million members worldwide. Some portion of these “members” are presumably inactive, or active only in limited ways. Even so, there is reason to believe that ResearchGate represents a substantial share of the global scientific community. While it is difficult to know exactly what its members are doing on the platform — anything from reading articles to engaging with collaborators to searching for jobs — the amount of traffic they generate is enormous. According to data from SimilarWeb, in a recent three-month period, ResearchGate’s traffic was nearly equal to that of ScienceDirect, SpringerLink, and Nature.com combined. Or, to offer another comparison, ResearchGate’s usage was almost equivalent to that of a basket of major Elsevier properties, including ScienceDirect and its other major STM properties, such as Mendeley, bepress, SSRN, and Pure.
There is power to this scale. ResearchGate has been able to associate much of the scientific literature with its authors, enabling a variety of analytics that it is able to turn into services and in some cases to monetize. Even though ResearchGate is one of the largest sources of leakage and is therefore being sued by an array of the major publishing houses, the power of ResearchGate’s data has been sufficient to enable it to develop a partnership with Springer Nature, at least on a pilot basis, in which Springer Nature content is freely distributed on ResearchGate.
Of the primary publishers, Elsevier is the only one that has adopted a strategy that requires it to take on the role of managing researcher identity. Elsevier has acquired and developed an array of collaboration, discovery, analytics, showcasing, and assessment tools, including Pure, Mendeley, Hivebench, and bepress, among others. All of these are, to a greater or lesser degree, centered around the researchers themselves. As they are woven together into a workflow, with data and dashboards building connections across them, they require a single researcher identity instance. And for this reason, Elsevier has been steadily integrating these properties by combining user accounts at the identity and data layer even while maintaining distinctive brands.
There are some reasons to suspect that implementation is lagging on the integration side — specifically, Mendeley as the individual dashboard should be more interoperable with Pure as the institutional dashboard. But researcher identity is at the heart of Elsevier’s pivot beyond primary publishing and towards its future as what it calls an information analytics business. And, in that respect, no primary publisher has made greater inroads in researcher identity than Elsevier, and none has developed an identity instance as robust as Elsevier’s. This instance is a major asset that Elsevier is working to develop and prepared to defend. If other publishers could see fit to trust Elsevier to act neutrally with respect to its primary publishing business, it is possible that Elsevier’s identity instance could be offered as the basis for cross-publisher services offered by Elsevier (for example, publication services) or ones that could be offered by others.
Clarivate has taken a very different approach to researcher identity. On the one hand, Clarivate does not have the legacy of a primary publishing business. On the other hand, its legacy is equally tied to the journal-centricity of those publishing businesses, through properties like its flagship Journal Impact Factor as well as ScholarOne. But the strategy that Clarivate has pursued in recent years — though it is unclear how it will evolve following Annette Thomas’s departure — has also relied notably on researcher identity.
When Web of Science was still owned by Thomson Reuters, the company created ResearcherID, a researcher identifier. After Web of Science became a component of Clarivate, Clarivate purchased Publons, a service that provides credit to peer reviewers. More recently, it has merged the two, creating a single dashboard for tracking one’s work as an author and reviewer across many publishers. Today, the Clarivate Web of Science group maintains a single identity instance enabling the use of its Publons, EndNote, and Web of Science properties. Over time, we may expect to see it combine Kopernio and other properties into this identity instance, enabling increased seamlessness on the user side. Given Clarivate’s positioning as “publisher neutral,” its identity instance could serve as the underpinning for a variety of cross-publisher initiatives that could over time challenge Elsevier’s efforts at analytics dominance.
While the corporate players continue to parry, ORCID represents a community based alternative that could grow from what it is now — principally a researcher identifier — into more of an identity instance for researchers. Indeed, it is already showing some evidence of this, providing social login support for other services.
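ORCID's social-login support rests on standard OAuth 2.0. As a rough sketch of the first leg of that flow, a relying service redirects the user to ORCID's authorization endpoint with the `/authenticate` scope; the `client_id` and redirect URI below are hypothetical placeholders, and the exact parameters a given integration needs may differ.

```python
from urllib.parse import urlencode

# ORCID's public OAuth authorization endpoint.
ORCID_AUTHORIZE = "https://orcid.org/oauth/authorize"

def orcid_login_url(client_id, redirect_uri):
    """Build the URL a relying service redirects users to for
    'Sign in with ORCID' (OAuth 2.0 authorization-code flow)."""
    params = {
        "client_id": client_id,          # hypothetical placeholder
        "response_type": "code",
        "scope": "/authenticate",        # scope for authenticated-iD retrieval
        "redirect_uri": redirect_uri,    # hypothetical placeholder
    }
    return ORCID_AUTHORIZE + "?" + urlencode(params)

url = orcid_login_url("APP-XXXXXXXXXXXXXXXX", "https://example.org/callback")
print(url)
```

After the user approves, the service exchanges the returned code at ORCID's token endpoint for the user's authenticated ORCID iD, which it can then treat as the account key — which is precisely what makes an identifier start to function as an identity instance.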
It is possible to imagine the ORCID functionality being enhanced to become a more robust identity instance, covering not only authors and contributors but potentially a wider array of researchers and users and developing the elements of a social graph. Such an ORCID identity instance might offer centralized versions of certain core features. But, it might also adopt some of the core information ownership/control principles of the decentralized model discussed above. In such a scenario, it would allow its users to port their identity, on an opt-in basis, into a variety of services across the community — and remove their information from those services with equal ease. But, bearing in mind the current debates about CrossRef and its future directions, it is very difficult to imagine community members like Elsevier and Clarivate supporting ORCID expanding its role to become a full-fledged research identity instance.
While ResearchGate may have developed a strong position as an identity instance for researchers, from a consumer perspective ResearchGate is a “niche” social network. That is to say, if one of the major consumer identity instances were to decide to develop its position in academic research, that could really change everything.
Google is an obvious candidate. An enormous number of researchers have accounts with Google’s consumer identity instance, and Google offers an array of services through its GSuite to many universities. Google Scholar is a very important discovery service for scientific research, while its Classroom has suggested a more recent willingness to develop educational tools and platforms. Its offerings may be too fractured, ultimately, to enable it to compete in what must be, from its perspective, a very small market. If it were to develop links between its GSuite for Education services and Google Scholar, that might be a sign of something afoot.
Facebook, like its philanthropic sibling CZI, is directly controlled by the Zuckerberg family. For this reason, it may be important to consider Facebook’s strengths as a social graph in combination with CZI’s acquisition of the scientific literature discovery service Meta and support of bioRxiv, each of which takes scholarly communication in the direction of researcher centricity. If Facebook’s enterprise product Facebook for Work, or something similar to it, begins to develop towards higher education and scientific research, that might be a sign of something afoot.
Other consumer players, including LinkedIn and Microsoft, have had less interest or less success in the scholarly communication space but could develop in these directions.
The control of researcher identity, and the management of the identity instances, should properly be seen as a major strategic dilemma for publishers, universities, and others. It is clear that the development of researcher-centric services has been hamstrung by too many publishers and other providers offering their own user accounts. Because these have not scaled, the nature of the services that can be offered remains limited. Network effects suggest that, over time, identity management will consolidate to a smaller number of stronger offerings. But whose interests will win out?
Perhaps the most important point of competition is between Elsevier and ResearchGate. Many publishers are in a battle royale against ResearchGate, angered by the leakage they see ResearchGate fostering. But Elsevier, which has a competing identity instance and researcher-centric set of services, has a unique rationale for leading the battle against ResearchGate — to promote its investment in analytics and defend against its most significant competition. In contrast, Springer Nature publicly, and others more privately, have examined opportunities to collaborate with ResearchGate. While no major publishing house would likely wish to rely on ResearchGate as the exclusive intermediary for its interactions with researchers, Elsevier has had a particular competitive rationale for pursuing the copyright litigation.
In this competition, however, it may be that the interests of the other major publishers, let alone the longer tail, are being ignored. None of them individually has the scale to create an alternative.
If they were to choose to do so, other publishers might be able to negotiate terms to use Elsevier’s identity instance. There are recurring rumors about ways that Elsevier has offered competing publishers opportunities to “plug into” its platform and analytics businesses. Is there a set of terms that could meet the business needs both of Elsevier and its publisher competitors?
On the other hand, it is not clear exactly what Clarivate intends in building a “publisher neutral” identity instance. In one way of thinking, Clarivate is building a portfolio of platforms, workflow, and analytics services that compete directly with Elsevier’s; i.e., Web of Science vs. Scopus; EndNote and Publons vs. Mendeley; Converis vs. Pure; ScholarOne vs. Aries. Is there a model in which Clarivate in essence becomes allied with all the major publishing houses other than Elsevier and its identity instance is shared with them?
ORCID faces the dilemmas of a poorly capitalized membership organization. As with CrossRef, many of the most exciting features that these community entities can build next might compete with one or more of their members. Can ORCID develop beyond a valuable identifier and towards more of an identity instance in ways that do not lead to unsustainable clashes with its members?
At the same time, the higher education sector remains absent from this landscape. The most prominent engagements from librarians about identity management have focused on opposing publisher efforts out of understandable concern for the protection of privacy and data security. But, we have seen no groundswell of effort to develop decentralized and/or community-controlled infrastructures to enable researcher-centric solutions.
If in the long run there is to be only one researcher identity instance, which will it be? And whose interests will it advance? Researchers strangely seem to have the least voice in the matter.
21 Thoughts on "Who Is Competing to Own Researcher Identity?"
Interesting blog post!
Frontiers “Loop” is yet another example of what they themselves call an “Open Science Research Network”.
This column has just given me a thought, probably not original, but I haven’t seen it much. The idea of researcher-centric rather than journal-centric may be the solution to the problem of predatory journals. I’m thinking of the growing problem of good scholarship being published in bad journals because the authors (often early-career) were tricked. If we can develop a reliable researcher-centric scholcomm system that provides to readers information about the author’s reputation and credibility, then the quality of the journal their work is published in matters much less, especially if we can combine that with a post-publication peer review system that a lot of people are already talking about. At first it may sound like reputation/credibility of individuals is a concept fraught with difficulties in implementation, but the same might have been said for journals before we got used to impact factors and the like.
Curious, Melissa, what you think would signal the credibility and reliability if not where the author is publishing?
What do you think about this as a signal?
There are other trees for other disciplines, but basically these trace academic lineage (anyone for the six degrees of Linus Pauling game?). It may currently be used as a signal of quality outside of the journal system, and it is certainly one that we used in the last century.
Another signal of quality may be something that we described in a previous scholarly kitchen blog (this one more closely related to the articles published): https://scholarlykitchen.sspnet.org/2019/12/18/guest-post-interesting-versus-true-measuring-transparency-and-reproducibility-of-biomedical-articles/#comments
Your proposal relies on the assumption that individuals always produce work of consistent quality. This is false. Sometimes, intelligent, well-informed people make mistakes. It’s the nature of working on the edge of existing knowledge to advance it further. That’s why we have expert editors and peer reviewers to vet new findings. The problem with predatory journals is that there is no expert review.
Another problem with the proposal is that it further disadvantages early career researchers. An early career author doesn’t really have a reputation or credibility yet. They can’t count on their name drawing readers. However, many early career researchers do excellent research and become respected authorities in their fields. This is why blind peer review is so important.
Impact factors are not a measure of credibility, just popularity.
Impact factors are a measure of the quality of the editorial enterprise, not of popularity. They say *nothing* about individual articles. They rate journals’ editors. This is what they set out to do, and this is what they accomplish. Criticisms of JIF (as here) typically charge them with something that they never set out to do. They are a highly specialized metric for a highly specialized area.
You’re absolutely right, but at the same time the publisher is disingenuous at best about this. Would Clarivate really sell as many subscriptions to JIF if it wasn’t being (improperly) used almost universally to evaluate individual researchers in hiring, tenure, and promotion? I seriously doubt it, and am saying that as a librarian who has served on our P&T committee for many years, and whose institution does NOT subscribe because our institution does not require JIF data in our faculty portfolios.
I agree that impact factors say nothing about individual articles, but I strongly disagree that they try to or accomplish rating journals’ editors. They measure how many times a journal gets cited in a given time period. That may be marginally influenced by the editor’s decisions, but the biggest influence is from the discipline. If you publish in a discipline where articles come out quickly, your journal will accumulate citations very quickly. If you publish in a discipline where articles come out slowly, your journal will accumulate citations very slowly. Within disciplines, what impact factor measures is whether people with strong research choose to submit there. That is popularity.
Think about how often editors change, but the hierarchy of journals in the field remains nearly identical!
The evidence is not on your side. You overlook two facts, and define “editors” too literally. In some disciplines there are publications with a high impact factor and pubs with a low one, so the discipline alone explains nothing. The consistency of JIF year by year points to a coherent editorial program. As for “editors,” the issue is not John or Sue but the history of the journal, its practices (and ownership), the network of readers and reviewers–everything that we mean when we reference “the brand.” The brand of the journal is the most important metric we have. Someone who wants to disrupt scholarly communications should understand this basic fact.
It’s pretty ridiculous to criticize someone for using the literal definition of a word when you didn’t provide an alternate definition in your original assertion.
Of course there are disciplines with both high and low impact factors. However, some disciplines, where knowledge advances by the accumulation of multiple viewpoints instead of by building on prior findings (the humanities and social sciences as opposed to the sciences), have lower ceilings than others.
I agree that a coherent editorial program is a result of reputation with readers and reviewers. When highly competent scholars are willing to serve as an editor, that is evidence that the journal is trusted in the field.
Re: Joe’s “The brand of the journal is the most important metric we have.” I don’t disagree, but many of us consider this a “bug, not a feature” of the current scholcomm ecosystem, and one that perpetuates a wide range of inequalities, from rapacious pricing to career-harming gender discrimination. I don’t think that anyone seriously involved in the OA or related “disruption” efforts fails to understand the iron grip that journal-title-level prestige has over most of academe.
We will be debating whether Journal brand is an important indicator of Article quality at the upcoming Researcher to Reader Conference.
That’s the most critical question, right? It might have different answers for different fields. Certainly I would expect institutional affiliation and rank to be a big part of it, as so many related professional qualities would have gone into hiring, tenure, and promotion evaluation. In some STEM fields, being the PI for a major multi-year research lab may be too. I could see an important role for the scholarly societies to recommend their own guidelines. I can’t imagine it coming down to some kind of single linear scale (e.g., a one-number rating). Rather than reinvent the wheel though, let me turn the question around and ask the community: how is it that people are finding pre-print archives useful since those articles haven’t been published yet? What criteria are researchers using when deciding which pre-prints they’re going to read and cite on arXiv etc.? I would expect the answer to that would inform the answer to your question.
I guess that’s the thing … all those factors already are also signaling. So, why would where a person is publishing not continue to do so in a researcher centric world?
It might be helpful to consider what is in the researcher’s interest as, in these contexts, author or reader. What might drive their adoption of the various alternatives?
Thank you for an interesting article with useful links. Indeed, researchers can only really be “owned” to the extent that organizations successfully deliver services that they value.
I disagree that ORCID is insufficient. ORCID already provides everything (a persistent ID and a reliable public validation API) that is needed for a federated solution.
We were able to launch Rescognito relying entirely on ORCID for access control and profile information. Rescognito has no “Register” button, yet more than seven million researchers can verifiably award/claim CRediT, complete checklists (such as Data Availability Statements), and recognize colleagues for their contributions (https://www.youtube.com/channel/UCQ5_BM5vrJjfoMZRWSBgCCw).
The technology is already here. Question is: do traditional organizations have enough imagination to use it?
When I think about identity providers, I think of things like:
LinkedIn: Professional identity
Yelp: Identities for businesses
Facebook: Personal identity
Research papers: ??? (I see pubmed and sciencedirect, mostly)
Researchers: ??? (I see LinkedIn as much as anything else)
None of these are Crossref/ORCID-like entities. What they have in common is that they have the best access to the data necessary to populate the profiles. LinkedIn has tons of resumes and other professional metadata, Amazon has book listings and associated metadata, etc. They are the sites that best answer the question of what link to send if you’re going to send a link about a thing.
It makes sense that the providers of academic identity which best answer the question of what link to send if you’re going to send a link about a researcher would be the providers who have the best data to populate the profiles.
Here’s where the conversation gets confused, though. “What is the unique name for this thing?” and “What’s the best link to send about this thing?” are different questions. I don’t think identifiers themselves (ORCIDs and DOIs, etc) should be minted and assigned by the biggest user of the IDs (imagine Google being responsible for assigning domain names!), but nor does it make sense to expect the ID providers to be the ones which are also the best links to send about a thing, because in order to do that, they’ll have to collect information from all over the web about everywhere that ID was used. Ask the people working on Crossref Event Data how that’s working out for them!
So I think Roger is spot on as usual about the landscape and dynamics, but for the purposes of discussion of publisher centric vs library centric vs researcher centric, we’ll have better discussions if we keep the distinction between an identifier and a service using that identifier in mind.
As the article suggests, Elsevier is coming at this in a comprehensive way, leading me to wonder if they could win the war all on their own, in the way Thomson/West won the legal publishing war. (Note I am not rooting for this, just noting the possibility.) Thomson/West did this through massive investment in technology over several decades. I remember being at a talk from the then-CEO of Thomson soon after the merger with West in 1996. He pointed out that Thomson already had 2000 technical people on staff and the number had just exceeded the number of publishing personnel.
It’s all about scale and owning a growing percentage of the user’s day. This is precisely what Thomson/West does and what companies like Elsevier and Wolters Kluwer do on the clinical solutions side of the business. They own growing portions of the lawyers’ and doctors’ day. Again, not rooting for it, but Elsevier could do just that.
Food for thought:
Novel open science platforms inject neoliberal images of the marketplace of ideas into the scientific community, where participants may not have paid much attention to contemporary political economy. For instance, the programs are all besotted with the notion of complete identification of the individual as the locus of knowledge production, to the extent of imposing a unique online identifier for each participant, which links records across the platform and modular projects. The communal character of scientific research is summarily banished. The new model scientist should be building their ‘human capital’ by flitting from one research project to the next. That scientist is introduced to a quasi-market that constantly monitors their ‘net worth’ through a range of metrics, scores and indicators: h-index, impact factors, peer contacts, network affiliations, and the like. Regular email notifications keep nagging you to internalize these validations, and learn how to game them to your advantage. No direct managerial presence is required, because one automatically learns to internalize these seemingly objective market-like valuations, and to abjure (say) a tenacious belief in a set of ideas, or a particular research program. All it takes is a little nudge from your friendly online robot.
One minor thing to add to this interesting debate, Facebook, Google, and LinkedIn have weak authentication. Not in terms of creating usernames and passwords, but in proving that you are who you say you are. It is a real weakness in using these identities for academic purposes, particularly in assigning ownership or authorship.
Regarding ResearchGate, the academics who reach out to me all seem to want me to give my papers to them for free, despite copies being available through their institution. I think this goes back to discoverability again (ResearchGate is like a bad penny in sending out emails of “you read this, hence you will like this”).
This is where member societies can come in (because you have to prove you are who you say you are to them). They have the ability to give credentials, all they need to do is decide which services to authenticate.