It feels increasingly like the adoption of persistent identifiers (PIDs) and, critically, of the metadata associated with them is at a tipping point. This is a long-awaited moment for PID enthusiasts like me, but it raises a number of questions: Why are PIDs important? What’s causing the increase in interest? And what does it mean for the wider research sector, including publishers, both nationally and internationally?

My MoreBrains colleague, Phill Jones, addressed some of the reasons why in his recent post, “Unnecessary Research Bureaucracy is Killing Academic Productivity, But it IS Fixable”, in which he highlighted the massive amount of time that researchers waste on admin tasks, time they could otherwise spend doing actual research to help solve our very real world problems. Not to mention their frustration at that waste of time and, all too often, the mental health consequences of that frustration. As he notes, PIDs could be part of the solution:

“Integrations of funder, institutional, and publisher systems into PID authentication systems like ORCID single sign-on and metadata registries won’t remove all administrative burden, but they will eliminate much of it, particularly with assurance and accreditation requirements.”

Benjamin Franklin famously noted, “Time is money.” So it’s not surprising that funders, governments, and others have started paying attention to recent discussions about how improved research infrastructure could save researchers time. Until fairly recently, however, evidence of this (unsustainable) bureaucratic burden has been mostly anecdotal. So it’s been very gratifying to work on a couple of cost-benefit analyses that bring more rigor to the study of this problem. MoreBrains recently updated an analysis we performed for Jisc, which estimates that researchers and administrators in the UK waste around 55,000 person days a year (equivalent to £19 million/US$23 million) rekeying information into university systems. A similar piece of work for the Australian Research Data Commons (ARDC) and the Australian Access Federation (AAF) in 2022 found similar levels of waste (38,000 person days, equivalent to AU$24 million/US$16 million). Adopting PIDs and integrating them into the systems used throughout the research process (grant applications, manuscript submissions, repositories, etc.) would allow those wasted hours to be redirected to actual research.
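To put those headline figures side by side, here is a minimal Python sketch that derives the cost per wasted person day implied by each estimate. Only the person-day and cost totals come from the two analyses; the dictionary layout and the `cost_per_person_day` helper are illustrative assumptions, and because the inputs are rounded headline numbers, the results are rough indications rather than figures taken from the reports themselves.

```python
# Headline figures quoted in the paragraph above (rounded in the source).
# The structure and helper below are illustrative, not from the reports.
ESTIMATES = {
    "UK": {"person_days": 55_000, "cost": 19_000_000, "currency": "GBP"},
    "Australia": {"person_days": 38_000, "cost": 24_000_000, "currency": "AUD"},
}


def cost_per_person_day(estimate: dict) -> float:
    """Return the local-currency cost implied for one person day of rekeying."""
    return estimate["cost"] / estimate["person_days"]


if __name__ == "__main__":
    for country, estimate in ESTIMATES.items():
        per_day = cost_per_person_day(estimate)
        print(f"{country}: about {per_day:,.0f} {estimate['currency']} per person day")
```

The two results are in different currencies and rest on each study’s own assumptions, so they are best read as orders of magnitude rather than compared directly.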

 

There’s no denying that money talks, and being able to demonstrate the actual financial value of widespread PID adoption and use is definitely one way to get the immediate attention of policy-makers and other decision-makers. During a recent webinar about the findings of the Australian report, Natasha Simons of ARDC noted that the response from Australian research leaders, funders, and policy-makers was “Wow, those figures are really impressive!”

[Image: chalkboard drawing connecting time (a clock) and money (a dollar symbol)]

However, for many funders and policy-makers, it’s equally, if not more, important to ensure that the research they support, whether directly or indirectly, is open and FAIR (Findable, Accessible, Interoperable, and Reusable). There’s increasing recognition among these groups that wider adoption of PIDs has the potential to play a critical role in increasing openness and transparency. And open PIDs, in particular (typically those that are community-governed, and have openly available metadata), also support FAIRness, as my former ORCID colleague, Tom Demeranville, has pointed out:

“FAIR PIDs … are not just resolvable, but can also be used to discover open, interoperable, well-defined metadata containing provenance information in a predictable manner. They are openly governed for the benefit of the community … and the attached metadata is available under a CC0 license, meaning that it is open to everyone. The metadata contains information about the publisher, the publication, other authors, funding, and affiliation(s), all of which help establish the provenance of the item.”

Last year’s White House Office of Science and Technology Policy (OSTP) Nelson Memo is just one recent example of national-level funding policy paying attention to PIDs. It directs US agencies to instruct their funded researchers “to obtain a digital persistent identifier … include it in published research outputs when available, and provide federal agencies with the metadata associated with all published research outputs they produce”. Other examples include UK Research and Innovation’s (UKRI) recently updated open access policy, which states that “Persistent Identifiers (PIDs) for articles must be implemented according to international recognised standards”; and Plan S’s requirement for the “Use of persistent identifiers (PIDs) for scholarly publications (with versioning, for example, in case of revisions), such as DOI”, which has been adopted by multiple countries.

It’s not just the national funders who are getting in on the act; there’s also been a surge in interest at the national government level. A number of countries in the Americas, Asia Pacific, and Europe are at various stages of developing and implementing national PID strategies. They include Australia, Brazil, Canada, Finland, the Netherlands, Peru, South Korea, and the UK, all of which are participating in a Research Data Alliance (RDA) National PID Strategies Working Group, set up following a Birds of a Feather session at the RDA Virtual Plenary 17 last year. There are a number of similarities between these countries’ approaches, as the RDA WG has found. Its aim is “to map common activities across national agencies/efforts and produce a guide on the specific PIDs adopted in the context of national or regional PID strategies [in order to] help others, irrespective of geographical region, follow a blueprint to define their national PID approach. The intention is that it can be adopted or adapted by other countries looking to develop their own PID strategies. By following the recommendations it will encourage standardisation internationally.” One element of this work is to identify the most commonly used PIDs across all countries, which I’m sure is music to the ears of my former NISO colleague Todd Carpenter, who pointed out in his recent post that, “It is past time that we all agree on a core set of identifiers and basic metadata elements and begin to encourage researchers to use them at scale when communicating their results.” Common PIDs (not all of which are open) that have already been identified in the RDA WG’s work include: ORCID or ISNI for researchers; ROR or ISNI for research organizations; Crossref DOIs for research articles; DataCite DOIs or Handles for research data; Crossref DOIs for grants; RAiD for projects; and DOIs, IGSN and RRID for samples and specimens.
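For anyone wiring these identifiers into a system, the list above can be sketched as a simple lookup table in Python. The PID names are taken from the RDA WG list quoted here; the entity-type labels, the `COMMON_PIDS` name, and the `entity_types_for` helper are my own illustrative assumptions, not part of the working group’s output.

```python
# Entity types mapped to the commonly used PIDs listed in the paragraph above.
# The keys are illustrative labels chosen for this sketch.
COMMON_PIDS = {
    "researcher": ["ORCID", "ISNI"],
    "research organization": ["ROR", "ISNI"],
    "research article": ["Crossref DOI"],
    "research data": ["DataCite DOI", "Handle"],
    "grant": ["Crossref DOI"],
    "project": ["RAiD"],
    "sample or specimen": ["DOI", "IGSN", "RRID"],
}


def entity_types_for(pid: str) -> list[str]:
    """Return, in alphabetical order, the entity types for which a PID is listed."""
    return sorted(kind for kind, pids in COMMON_PIDS.items() if pid in pids)


if __name__ == "__main__":
    # ISNI appears twice in the list: for people and for organizations.
    print(entity_types_for("ISNI"))
```

A table like this makes the overlaps visible at a glance: some identifiers cover more than one kind of entity, which is one reason agreeing on a core set matters.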

While there are many similarities in approach and intention between the countries participating in the RDA group, there are also many differences between them. The research endeavor is global, but research policy and funding are largely executed on a national level and, of course, other local needs must also be addressed in any national PID strategy. In Canada, for example, this includes providing equitable support for both English- and French-language researchers. In both Canada and Australia, there’s increasing recognition of the value of Indigenous knowledge and a corresponding commitment to providing Indigenous researchers with an infrastructure that addresses their (often very different) requirements. Australia and the UK both have national research evaluation exercises, which could be transformed through an expanded use of PIDs including, for example, enabling the easy inclusion of practice-based research outputs.

A huge, but often overlooked, advantage of taking a national approach is the opportunity it provides to bring the whole community together, involving stakeholders from right across the research ecosystem — publishers and vendors, as well as funders, institutions, and of course researchers themselves. In fact, it’s absolutely critical to ensure that all groups are engaged in the discussions, decision-making, and rollout because, as noted in the Australian report:

“for the greatest benefit, a high proportion of universities and other institutions need to invest in integrations and adopt PIDs as part of their workflows—PID adoption is a collective action problem requiring both community organisation and collective investment” (my highlight)

In other words, no one benefits until everyone benefits. Whatever the approach to expanding PID adoption and integrations, it’s in everyone’s interests to ensure that smaller and/or less financially secure organizations can participate on an equal footing with their larger, wealthier counterparts. And it’s important to note that these benefits don’t just apply to institutions (although they’re typically the starting point). They apply to all types of organizations in the research ecosystem, from the tiniest university press to the largest commercial publisher; from small private funders to federal agencies; and from community-led infrastructure organizations to the big proprietary service providers. Likewise, all types of organizations must be involved in integrating PIDs if we are to be successful at expanding adoption to the point at which everyone benefits. Third-party providers such as content platforms, manuscript submission systems, etc., are absolutely critical to this: they can help make it as easy as possible for that all-important metadata to be collected, connected, and shared, early and often.

So, why should you care about these fledgling PID strategies? Because if we all have a stake in implementing them, we will all collectively benefit from their success.

Alice Meadows

I am a Co-Founder of the MoreBrains Cooperative, a scholarly communications consultancy with a focus on open research and research infrastructure. I have many years’ experience of both scholarly publishing (including at Blackwell Publishing and Wiley) and research infrastructure (at ORCID and, most recently, NISO, where I was Director of Community Engagement). I’m actively involved in the information community, and served as SSP President in 2021-22. I was honored to receive the SSP Distinguished Service Award in 2018, the ALPSP Award for Contribution to Scholarly Publishing in 2016, and the ISMTE Recognition Award in 2013. I’m passionate about improving trust in scholarly communications, and about addressing inequities in our community (and beyond!). Note: The opinions expressed here are my own.

Discussion

7 Thoughts on "Why PID Strategies Are Having A Moment — And Why You Should Care"

Where’s the argument for PIDs based on the user needs of researchers?

I’m concerned for researchers’ agency with regard to persistent identifiers: we know that lots of folks change their names and there are still too many systems in scholarly publishing that don’t sufficiently account for this. There’s a long road ahead to ensure that people are being fairly represented and not put at risk by these systems.

ORCID can be thought of as a name-independent way of linking people to their outputs and affiliations. If you change your name, the link between the person (ORCID iD) and output (DOI, URL, PMID, etc.) remains. Furthermore, everything attached to an ORCID record is controlled by the researcher – including their name and name variants. They can be added, updated or removed by the researcher whenever they like.

However, there are a number of things that we as a community (researchers, publishers, software vendors, infrastructure providers, institutions, etc.) need to work on. Name changes don’t propagate well, and if they do, propagation is inconsistent across platforms. We need best practices around how to deal with name changes, and we need to agree to apply them consistently. We need those affected by name changes to help us understand the actions that should be taken and where we need to take them. And we need software vendors and infrastructure providers to implement the best practices once they’re defined.

ORCID is one of these infrastructure providers and we’re happy to make the necessary changes once they’re agreed. We’re part of the NISO “Recommended Practice to Update Author Name Changes” working group which is attempting to address these issues, and I’m hopeful we’ll make some progress in this area. https://www.niso.org/press-releases/2021/04/niso-members-approve-proposal-new-recommended-practice-update-author-name

Perhaps I’m still wrapping my head around the concept, but doesn’t the record itself change (not the persistent identifier directing to that record) when someone’s name changes? I ask this question non-critically: how would a persistent identifier be an obstacle to that person changing their name?

Simply having a publication history can be an obstacle to changing your name: this problem is very well documented, especially for women, trans folks, and those whose names don’t fit software systems’ expectations of names.

Incomplete name changes across different sources connected by PIDs can leak information (perhaps in aggregate) in ways that previously were much harder to see, e.g. you might be able to see that a person has transitioned, or infer that they got married or divorced.

We explored some of these issues in a workshop in 2021 in our UK ORCID community. There’s a full report https://ukorcidsupport.jisc.ac.uk/2022/01/managing-researcher-identity-and-name-changes-using-orcid-uk-orcid-jisc-consortium-event-report/ or a 2-minute summary https://ukorcidsupport.jisc.ac.uk/2022/01/managing-researcher-identity-and-name-changes-using-orcid-uk-orcid-jisc-consortium-event-report/#2minuteSummary, with links to further resources such as the main groups working on this issue (including NISO as mentioned by Tom)

As a metadata librarian who has been working for a decade on PID adoption in the discovery metadata supply chain, I wanted to mention the central role of libraries in PID dissemination. The role of libraries is unmentioned in the post. As part of libraries’ central mission to support discovery of information and creative works in all formats and languages, librarians spend countless hours on identifying and disambiguating unique authors and creators across the global ecosystem. Having “upstream” metadata from authors and creators themselves available to libraries early in the creation/publication process would be a game changer for metadata librarians to spend their time on enriching discovery metadata for under-discoverable works, especially those less widely held or originating from less resourced parts of the globe.

Thanks all for your comments. On the name change issue, as Tom says, the hope is that ORCID (as part of a new, community-developed NISO recommended practice) will be part of the solution rather than part of the problem.

To the broader point of making sure that researchers’ needs are taken into account in the development of PID strategies, most – if not all – of the groups I know of that are working on this are involving researcher organizations such as scholarly societies in their discussions and decision-making to ensure that researchers are well represented. And most researchers I know are frustrated by administrative tasks which, if anything, are increasing rather than decreasing – so removing some of that burden through the improved use of PIDs would be welcomed.

Michelle, thank you for the reminder about the role of librarians – you’re absolutely right. Again, librarians are being involved in many (most?) of these discussions. I didn’t call them out specifically but as mentioned, getting the widest possible community involvement in the development of these strategies is absolutely critical to their success.
