It feels increasingly like the adoption of persistent identifiers (PIDs) and, critically, of the metadata associated with them, is at a tipping point. This is a long-awaited moment for PID enthusiasts like me, but it raises a number of questions: Why are PIDs important? What’s causing the increase in interest? And what does it mean for the wider research sector, including publishers, both nationally and internationally?
My MoreBrains colleague, Phill Jones, addressed some of the reasons why in his recent post, “Unnecessary Research Bureaucracy is Killing Academic Productivity, But it IS Fixable”, in which he highlighted the massive amount of time that researchers waste on admin tasks, time they could otherwise spend doing actual research to help solve our very real world problems. That is before we consider their frustration at that waste of time and, all too often, the mental health consequences of that frustration. As he notes, PIDs could be part of the solution:
“Integrations of funder, institutional, and publisher systems into PID authentication systems like ORCID single sign-on and metadata registries won’t remove all administrative burden, but they will eliminate much of it, particularly with assurance and accreditation requirements.”
Benjamin Franklin famously noted, “Time is money.” So it’s not surprising that funders, governments, and others have started paying attention to recent discussions about how improved research infrastructure could save researchers time. Until fairly recently, however, evidence of this (unsustainable) bureaucratic burden has been mostly anecdotal. So it’s been very gratifying to work on a couple of cost-benefit analyses that bring more rigor to the study of this problem. MoreBrains recently updated an analysis we performed for Jisc, which estimates that researchers and administrators in the UK waste around 55,000 person days a year (equivalent to £19 million/US$23 million) rekeying information into university systems. A similar piece of work for the Australian Research Data Commons (ARDC) and the Australian Access Federation (AAF) in 2022 found comparable levels of waste (38,000 person days, equivalent to AU$24 million/US$16 million). Adopting PIDs and integrating them into the systems used throughout the research process (grant applications, manuscript submissions, repositories, etc.) would allow those wasted hours to be redirected to actual research.
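For readers who like to see the arithmetic, the headline figures above can be turned into an implied cost per wasted person day. This is only a back-of-the-envelope sketch: the function name and data layout are my own, the inputs are the rounded numbers quoted above, and the two reports may well cost a person day differently, so the results are not directly comparable across countries.

```python
# Rounded figures quoted above from the two cost-benefit analyses.
# Each estimate stays in its own currency; no exchange rates are applied.
ESTIMATES = {
    "UK (Jisc)": {"person_days": 55_000, "cost": 19_000_000, "currency": "GBP"},
    "Australia (ARDC/AAF)": {"person_days": 38_000, "cost": 24_000_000, "currency": "AUD"},
}


def implied_cost_per_day(cost: float, person_days: float) -> float:
    """Return the average cost implied for one wasted person day."""
    return cost / person_days


if __name__ == "__main__":
    for label, figures in ESTIMATES.items():
        per_day = implied_cost_per_day(figures["cost"], figures["person_days"])
        print(f"{label}: about {per_day:,.0f} {figures['currency']} per person day")
```

On these rounded inputs, the implied figures come out at roughly £345 and AU$632 per person day respectively.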
There’s no denying that money talks, and being able to demonstrate the actual financial value of widespread PID adoption and use is definitely one way to get the immediate attention of policy-makers and other decision-makers. During a recent webinar about the findings of the Australian report, Natasha Simons of ARDC noted that the response from Australian research leaders, funders, and policy-makers was “Wow, those figures are really impressive!”
However, for many funders and policy-makers, it’s equally (if not more) important to ensure that the research they support, whether directly or indirectly, is open and FAIR (Findable, Accessible, Interoperable, and Reusable). There’s increasing recognition among these groups that wider adoption of PIDs has the potential to play a critical role in increasing openness and transparency. And open PIDs in particular (typically those that are community-governed and have openly available metadata) also support FAIRness, as my former ORCID colleague, Tom Demeranville, has pointed out:
“FAIR PIDs … are not just resolvable, but can also be used to discover open, interoperable, well-defined metadata containing provenance information in a predictable manner. They are openly governed for the benefit of the community … and the attached metadata is available under a CC0 license, meaning that it is open to everyone. The metadata contains information about the publisher, the publication, other authors, funding, and affiliation(s), all of which help establish the provenance of the item.”
Last year’s White House Office of Science and Technology Policy (OSTP) Nelson Memo is just one recent example of national-level policy that pays attention to PIDs. It directs US agencies to instruct their funded researchers “to obtain a digital persistent identifier … include it in published research outputs when available, and provide federal agencies with the metadata associated with all published research outputs they produce”. Other examples include UK Research and Innovation’s (UKRI) recently updated open access policy, which states that “Persistent Identifiers (PIDs) for articles must be implemented according to international recognised standards”; and Plan S’s requirement for the “Use of persistent identifiers (PIDs) for scholarly publications (with versioning, for example, in case of revisions), such as DOI”, a requirement that has been adopted in multiple countries.
It’s not just the national funders who are getting in on the act; there’s also been a surge in interest at the national government level. A number of countries in the Americas, Asia Pacific, and Europe are at various stages of developing and implementing national PID strategies. They include Australia, Brazil, Canada, Finland, the Netherlands, Peru, South Korea, and the UK, all of which are participating in a Research Data Alliance (RDA) National PID Strategies Working Group, set up following a Birds of a Feather session at the RDA Virtual Plenary 17 last year. There are a number of similarities between these countries’ approaches, as the RDA WG has found. Its aim is “to map common activities across national agencies/efforts and produce a guide on the specific PIDs adopted in the context of national or regional PID strategies [in order to] help others, irrespective of geographical region, follow a blueprint to define their national PID approach. The intention is that it can be adopted or adapted by other countries looking to develop their own PID strategies. By following the recommendations it will encourage standardisation internationally.” One element of this work is to identify the most commonly used PIDs across all countries, which I’m sure is music to the ears of my former NISO colleague Todd Carpenter, who pointed out in his recent post that, “It is past time that we all agree on a core set of identifiers and basic metadata elements and begin to encourage researchers to use them at scale when communicating their results.” Common PIDs (not all of which are open) that have already been identified in the RDA WG’s work include: ORCID or ISNI for researchers; ROR or ISNI for research organizations; Crossref DOIs for research articles; DataCite DOIs or Handles for research data; Crossref DOIs for grants; RAiD for projects; and DOIs, IGSN and RRID for samples and specimens.
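To make that list a little more concrete, here is a minimal Python sketch that records the identifiers named above by the kind of entity they identify, and checks whether a string is a well-formed ORCID iD. The mapping simply restates the list in this post. The validation relies on the ISO 7064 MOD 11-2 check character that ORCID iDs carry. The names `CORE_PIDS`, `orcid_check_character`, and `is_valid_orcid` are my own, not part of any official library, and a well-formed iD is not necessarily a registered one.

```python
import re

# Entity type -> identifiers named in the RDA working group's findings so far.
CORE_PIDS = {
    "researchers": ["ORCID", "ISNI"],
    "research organizations": ["ROR", "ISNI"],
    "research articles": ["Crossref DOI"],
    "research data": ["DataCite DOI", "Handle"],
    "grants": ["Crossref DOI"],
    "projects": ["RAiD"],
    "samples and specimens": ["DOI", "IGSN", "RRID"],
}

# Four hyphen-separated groups of four characters; only the last may be "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check character matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
```

A submission system or repository could run a check like this at the point of data entry, which is one small way of catching rekeying errors before they spread into downstream metadata.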
While there are many similarities in approach and intention between the countries participating in the RDA group, there are also many differences between them. The research endeavor is global, but research policy and funding are largely executed on a national level, and, of course, other local needs must also be addressed in any national PID strategy. In Canada, for example, this includes providing equitable support for both English- and French-language researchers. In both Canada and Australia, there’s increasing recognition of the value of Indigenous knowledge and a corresponding commitment to providing Indigenous researchers with an infrastructure that addresses their (often very different) requirements. Australia and the UK both have national research evaluation exercises, which could be transformed through an expanded use of PIDs, for example by enabling the easy inclusion of practice-based research outputs.
A huge, but often overlooked, advantage of taking a national approach is the opportunity it provides to bring the whole community together, involving stakeholders from right across the research ecosystem — publishers and vendors, as well as funders, institutions, and of course researchers themselves. In fact, it’s absolutely critical to ensure that all groups are engaged in the discussions, decision-making, and rollout because, as noted in the Australian report:
“for the greatest benefit, a high proportion of universities and other institutions need to invest in integrations and adopt PIDs as part of their workflows—PID adoption is a collective action problem requiring both community organisation and collective investment” (my highlight)
In other words, no one benefits until everyone benefits. Whatever the approach to expanding PID adoption and integrations, it’s in everyone’s interests to ensure that smaller and/or less financially secure organizations can participate on an equal footing with their larger, wealthier counterparts. And it’s important to note that these benefits don’t just apply to institutions (although they’re typically the starting point); they apply to all types of organizations in the research ecosystem, from the tiniest university press to the largest commercial publisher; from small private funders to federal agencies; and from community-led infrastructure organizations to the big proprietary service providers. Likewise, all types of organizations must be involved in integrating PIDs if we are to be successful at expanding adoption to the point at which everyone benefits. Third-party providers such as content platforms and manuscript submission systems are absolutely critical to this: they can help make it as easy as possible for that all-important metadata to be collected, connected, and shared, early and often.
So, why should you care about these fledgling PID strategies? Because if we all have a stake in implementing them, we will all collectively benefit from their success.