On January 23-24, 150+ PID (persistent identifier) people from 23 countries across six continents gathered in Girona, Spain for the second PIDapalooza festival of persistent identifiers. Co-hosted by California Digital Library (CDL), Crossref, DataCite, and ORCID (my organization), PIDapalooza 2018 featured updates and discussions, demos and use cases, brainstorming and networking. As well as the 40+ interactive half-hour parallel sessions, there were plenaries by Geoffrey Bilder (Crossref), Jo McEntyre (EMBL/Europe PubMed Central), Melissa Haendel (OHSU Library), and Carly Strasser (Coko Foundation).
I’m not going to attempt to cover the whole meeting, other than to share Carly’s summary of the main themes of the conference, which she identified as: metrics, connecting PIDs, resolvers, identifying organizations, outreach, identifiers for [other stuff!], data, persistence, and simplifying and focusing. Unsurprisingly (since it’s my job), the one which sparked the most interest for me — and which also cropped up a lot at the first PIDapalooza — is the critical importance of effective outreach to researchers. I wrote then that, “the real challenge for us all is to get [PIDs] used more widely, consistently, and appropriately. What Phill [Jones] calls below the difficult “social” questions. And that means understanding – and effectively communicating – the value of PIDs to researcher organizations and researchers alike, in order to ensure their wide adoption and usage across the whole research community.”
So Simon Porter’s presentation on Research Information Citizenship really hit home for me. It’s a topic he’s been evangelizing for a while now, and which I think deserves a lot more attention than it’s had so far. His notion is that, for the research infrastructure to work, we must all be good ‘citizens’, playing our part by providing and/or using PIDs (and other elements of the infrastructure, such as standards). As he notes: “Persistent identifiers are not just technical, they are social. Yet for the most part, expectations of how we should behave, and what we should do to our data for the benefit of other research citizens remains implicit at best.” Or, as my colleague Josh Brown puts it, “it makes a lot of sense to me to think of the ‘thing we are citizens of’ as a commons. We manage the commons not for profit or personal gain, but for the sake of the community. I think identifiers are a really good example of a shared resource, since they really exist to make connections — they are signposts so they naturally sit outside of any individual silo. We, as citizens, sustain our common PID resources for the common good, for the well-being of research.”
But what has research information citizenship got to do with outreach, you may ask. To develop and maintain a robust and trustworthy research infrastructure, we need to increase PID adoption by researchers. And, to get them to buy into using and sharing PIDs, we need a powerful message, one that appeals to researchers across all communities and career stages. I think research information citizenship is that message. It’s a shared vision of what can be achieved in a world where we are all good citizens, one where persistent identifiers and other elements of the research infrastructure are being used by everyone, in ways that will benefit everyone.
At a more tactical level, my former colleague, Laura Wilkinson, led a session on Anticipation, Action, Awareness: A PID Communications Template for All, based on the ORCID outreach campaign template we launched in October for organizations to use and adapt. She asked participants to brainstorm campaign ideas based on key questions: Who (is your audience?); What (are you going to tell them?); Why (do they need to know?); When (will your campaign happen?); Where (will your communications happen?); and How (will you know if you have succeeded in getting the message across?). The resulting suggestions are available on Figshare — we hope to try out at least some of them ourselves during the course of this year.
Outreach also came up in the session on Metadata 2020, which is itself a great example of research information citizenship in practice, a community-led “collaboration that advocates richer, connected, and reusable open metadata for all research outputs, which will advance scholarly pursuits for the benefit of society.” However, as Ginny Hendricks of Crossref noted, “metadata needs its own PR campaign. Most researchers either don’t know what it is or think it’s something very technical.” Cue another outreach opportunity! Metadata 2020 is about to launch a researcher outreach working group to develop a set of community tools and resources that will, among other things, help address that knowledge gap, and educate researchers on how to create good metadata, why they should care about it, how it will benefit them, and more.
Angelina Kraft (TIB) presented some interesting feedback on PID outreach efforts to date in Germany. The results of a 2016 survey of 1,400 scientists in the natural sciences and engineering show that, although over 70% use DOIs for journal publications, less than 10% use them for research data. Why? More than half (56%) said it was because they didn’t know they could! In addition, 40% of respondents said they need more information about PIDs, and almost 30% see no benefit in using PIDs — presumably at least partly because they don’t know what they are and/or understand how they can help individuals, organizations, and the wider community.
The next PIDapalooza is provisionally planned for January 2019. Wouldn’t it be great if, by then, we’ve all made some progress towards becoming good research information citizens ourselves, whether as publishers, librarians, funders, associations, service providers, or, most importantly, researchers?
In the meantime, I’ll leave you with Angelina Kraft’s proposed PID 101 of nine essential facts that all researchers should know about persistent identifiers!
- A PID is a “long-lasting reference to a digital resource”
- There are different sorts of PIDs and different uses of them
- PIDs are provided by organizations
- You (the researcher) don’t have to pay for PIDs
- PIDs are mostly used for (persistent) citation
- A correct citation always includes a PID
- The metadata behind a PID are very important
- PIDs are not perfect!*
- PIDs are really useful and fun!*
*my favorites!
Discussion
6 Thoughts on "Getting the PID Word Out — At PIDapalooza and Beyond…"
So as a managing editor who is fairly new to publishing, I see the benefits of PIDs and metadata. For example, the benefit of ORCID is obvious, particularly for researchers with common names (or common English translations of their names).
But I also have many questions. The most important PID for journal articles is the DOI. Yet the indexing services don’t seem to be taking those into consideration. If the date the DOI is issued were to become the date of record for a published item, it would have a significant and I would argue highly positive impact on author and journal metrics.
Second, as recent articles discuss the use of identifiers for authentication for journal access, similar questions of privacy versus ease of use arise for PIDs. What is being done to coordinate PIDs among different platforms, perhaps to arrive at a singular PID (ORCID and Publons, for example)? What can we do to engender trust among researchers in the era of Facebook and Google as PIDs? In that realm I use individual logins rather than trust Facebook or Google to protect my information and not use it for advertising purposes. I do pay a price of having to reset logins for most sites on almost every visit, but it’s a good trade-off to protect my identity.
I look forward to an interesting conversation on this topic.
Hi Jonathon, thanks for your comments. I actually thought that Google Scholar, at least, did index DOIs – but hopefully someone can confirm or deny that! In terms of having a singular PID, what we really need instead is reliable links between different PIDs – and I think we are making progress on that front, though there’s still plenty of work to be done…
It’s odd: Google _could_ easily index the DOI (the citation_doi meta tag is almost universally present among the HighWire meta tags, for example), but it doesn’t seem to, for reasons I don’t understand. For example, the DOI isn’t included in the BibTeX they offer for download or in the article citations, even in styles like APA that require it.
You can find articles when searching by DOI, of course — it’s a full text search, after all — but if you know the DOI you might just as well resolve to it directly.
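To make the point about the citation_doi meta tag concrete, here is a minimal sketch in Python of how an indexer might read the DOI out of an article page and turn it into a resolver link. It uses only the standard library; the helper names (`extract_doi`, `doi_to_url`) and the sample DOI are hypothetical, and this is in no way a description of how Google Scholar or any other service actually works.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class CitationDoiParser(HTMLParser):
    """Collects the content of every <meta name="citation_doi"> tag."""

    def __init__(self):
        super().__init__()
        self.dois = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "citation_doi":
            doi = attr_map.get("content", "").strip()
            if doi:
                self.dois.append(doi)


def extract_doi(html):
    """Return the first citation_doi value in the page, or None."""
    parser = CitationDoiParser()
    parser.feed(html)
    return parser.dois[0] if parser.dois else None


def doi_to_url(doi):
    """Build a link that resolves the DOI through the doi.org proxy."""
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return "https://doi.org/" + quote(doi, safe="/")


if __name__ == "__main__":
    # Illustrative page fragment; the DOI is a placeholder.
    page = '<head><meta name="citation_doi" content="10.5555/12345678"></head>'
    doi = extract_doi(page)
    if doi is not None:
        print(doi_to_url(doi))
```

If the tag is missing, `extract_doi` returns None rather than guessing, which is roughly the situation a full-text search falls back on.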
Hi Jonathon, I don’t promise to answer all these questions but here are a few thoughts in response: Many search and discovery services use DOIs (whether they’re Crossref ones–where I work–or DataCite, ISTIC, JaLC, etc. ones) including ProQuest, Scopus, Dimensions, Microsoft Academic, and pretty much thousands of others. A single PID wouldn’t work because of the nature of the different things/objects/entities that need identifying; people are not the same as merge-able/sale-able companies for example. But PIDs can and do interoperate e.g. the Crossref “relations” schema allows our members to put any PID in the metadata as a related work or object, for example a PMID in the `is-version-of` field. More to do though for sure.
From a publishing perspective we can see the use of PIDs. And we use them.
But from the researcher perspective I suspect all of these are considered unnecessary paperwork. In order to get wide adoption, there has to be something in it for them. A reduction in other paperwork, perhaps. Something else?
For instance, a lot of us use the Crossref Funder Registry data. We require authors to use interfaces and choose each funder for their paper. We did a small study at our journal and discovered that more than half of our authors weren’t even bothering to try and find the right funders in the database. It’s too much work for them and they get nothing out of it. They simply typed whatever they wanted into the box. If they were funded by the NIH they still have to deposit the papers to NIHMS (yes we do it for them) and they still have to list all the papers when they send in their grant renewal. So why should they care if they provide the correct data to the publisher?
I’ve stated the problem but I don’t know the solution. Outreach isn’t going to be enough. Community norms take a very long time to change. If we want adoption more quickly than organically, there need to be some carrots involved.
Thanks Pam, you’re right of course that it’s not “just” about outreach! We need to find a way to demonstrate the value to researchers of using PIDs, which we think is about getting to “Enter once, reuse often”, where PIDs enable verified information about researchers to flow between the systems they use without them having to rekey it every time. Collecting the evidence that this is starting to happen – and does indeed save time/make researchers’ lives easier – is a goal for ORCID this year (see project 2 on https://orcid.org/about/what-is-orcid/mission/2018-project-roadmap). As mentioned, the Metadata 2020 researcher outreach group will also be working on identifying the benefits of good metadata for researchers this year. So hopefully we will make some progress – and offers of help/further suggestions/analyses of the impact of PIDs for researchers are very welcome!