In late January, around 150 people gathered in Dublin for PIDapalooza, “the open festival for persistent identifiers” (aka PIDs). Fun, eh? Actually, yes!
This was the third such event and, although you might think it sounds rather esoteric (boring, even), most of the attendees would beg to differ. Admittedly, we are all self-confessed PID nerds, which certainly helps! But the organizers (California Digital Library, Crossref, DataCite, and my own organization, ORCID) also go to great lengths to make PIDapalooza entertaining as well as educational. And in Dublin, as in Girona and Reykjavik before, there was plenty of the PID craic to be had.
There’s always a lot going on at PIDapalooza, because the format is geared mostly toward short, fast-paced sessions, with three typically going on at any one time. So I can’t pretend to cover the whole event here. But luckily, first-timer Suze Kundu, who’s recently joined Digital Science, gave a brilliant Friends-themed wrap-up talk, which I have cribbed liberally from. Here are some of the themes that one or both of us identified from the sessions we attended. This is by no means a comprehensive account, so I strongly encourage you to also check out the presentations in the PIDapalooza 2019 repository.*
Technological Change Is (Relatively) Easy; Cultural Change Is Hard
A lot of the technology we need to implement a PID-powered research infrastructure is already in place, or at least planned. But we’re still a long way from getting all researchers and research organizations to adopt and use PIDs, and that’s essential if we are to build a trusted infrastructure. How can we communicate the value of persistent identifiers, especially to researchers? Once again, this year’s PIDapalooza featured several sessions that attempted to address this challenge. At a practical level, we were introduced to the newly launched PID Forum, which hopes to enable easy sharing of PID-related information and discussions between PID providers, users, and the wider community. There was also a joint Crossref/DataCite/ORCID session that focused on real-life user stories about good and bad experiences of using PIDs, which we hope can be shared to help encourage community adoption (perhaps via the aforementioned PID Forum!). Building on his research information citizenship work with the community at previous PIDapaloozas and other meetings, Simon Porter led a brainstorming session to develop a statement about what we are now calling research infrastructure citizenship. It’s a work in progress, so look out for more on that…
Organization Identifiers Are Set To Be A ROR-ing Success
There was an enthusiastic welcome for the newly launched Research Organization Registry Community (ROR) at PIDapalooza — not surprisingly, since this community-led project aims to “develop an open, sustainable, usable, and unique identifier for every research organization in the world”. As well as an inaugural pre-conference community meeting, there was a very well-attended — and loud (lots of ROR-ing!) — session during PIDapalooza itself, which highlighted some of the challenges for organization identifiers, especially in research affiliations. This is critical because, without reliable affiliation data, researchers — and their organizations — won’t be able to get the recognition and credit they deserve.
There’s A PID For That
One of the fun things about PIDapalooza is the sheer diversity of the persistent identifier community. This time around, we learned about PIDs for everything, from neutron science to movies. In his (genuinely!) fascinating keynote about the use of PIDs at the European Spallation Source, Gareth Murphy noted that they’ll need at least 20 million PIDs to start with, in order to handle the data they collect. Simon Porter popped up again, this time to demonstrate the need for PIDs for prizes and awards. PIDs would enable prizes and awards to be cited, so that they could be used to highlight contributions at the individual, institutional, and community levels. And what about using PIDs for cultural heritage purposes? Alina Saenko reported on a project to implement PIDs for objects in the collections of several Flemish museums, galleries, and libraries, sparking all sorts of questions about how these could be shared as artworks are acquired, lent out, and sold. From there, it was a short hop, skip, and a jump to PIDs for movies, with Raymond Drewry of MovieLabs, who noted that introducing PIDs for films resulted in one major studio saving 60 person-hours per month, and reducing errors by a staggering 90%. Needless to say, similar evidence of the value of PIDs for researchers would be invaluable, so please do share if you have any!
What Makes A Good PID?
There was also quite a bit of discussion about what makes a good PID, something that we’ve also been thinking about at ORCID. This included a lively discussion of the pros and cons of using HTTP(S) URIs as PIDs (viewed as a no-no by many attendees, despite keynote Henry Thompson’s efforts) and an overview of the merits of compact URIs by Sarala Wimalaratne of EMBL (the jury is also out on these!). Brian Kirkegaard Lunn of Digital Science invited us to consider “How PID are DOIs?” The answer: very. However, he did find about 13,000 non-existent DOIs (out of a sample of 1.5 million, or just under 1%), most of which are publisher “owned”, so there is still some work to do there.
As always, it was a fun and informative couple of days. But more than that, it was a great reminder not just of what PIDs are but of why they’re important. To quote one of the attendees, Herbert van de Sompel: “PIDs are not the goal. Long term accessibility of the scholarly record is the goal.”
*I’ve added links to all slides available at the time of writing.