There is an increasing awareness of the need for unique identifiers for exchanging scholarly information. The use of trade identifiers for resources like books (ISBNs) and journals (ISSNs) has long been a lynchpin in successfully managing the supply chain of published materials. As online systems for discovering and distributing content have grown, so too has the need for unambiguous identification of people and the parties exchanging that content. Several systems have been in development in the past couple of years, notably the Open Researcher and Contributor ID (ORCID) and the International Standard Name Identifier (ISNI) system. How these two systems relate, engage each other, and serve community needs isn’t always clear.
In hopes of alleviating some of the confusion, I sat down with Laura Dawson from Bowker to discuss the International Standard Name Identifier, how it relates to ORCID, and other issues surrounding identity management systems. This is a summary of that conversation.
Thank you, Laura, for taking the time to speak with me. First, can you talk a little bit about what ISNI is and how you are involved in it?
Thank you for giving me the opportunity to talk about ISNI. The International Standard Name Identifier is an ISO-certified global standard for identifying the contributors to creative works and those active in their distribution. ISNIs can be assigned to researchers, writers, artists, visual arts creators, performers, producers, publishers, aggregators, and other notable individuals who have been written about, filmed, or recorded. It can also be used for institutions. ISO’s Technical subcommittee on identification and description published the ISNI standard in 2012. This is the same group responsible for the ISBN, ISSN, DOI, and related standards for the publishing, library, and information distribution communities.
Why is ISNI important and how can it help information distribution?
There are a couple of important use cases for ISNI. First and foremost, contributor name disambiguation clarifies royalty or licensing payments. Getting the correct payment to the correct person is extremely important to rights organizations, publishers, film studios, record labels, and the person receiving them.
The second use case for ISNI revolves around linking data. Organizations that use content from disparate providers can link that content if those providers are all using ISNIs in their data. For example, ProQuest’s Scholar Universe is ingesting ISNIs into its database. Bowker’s Books in Print is, as well. So, for any author who is present in both databases, a link between those two records can be created. A link from those records to the author’s Wikipedia page can also be created (because Wikipedia is also ingesting ISNIs). A link from those records to the theses stored by the British Library can be created (because those theses authors have also been assigned ISNIs). In a world of rapidly proliferating data, links among that data are critical for discovery. ISNI facilitates that.
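As an editorial illustration of the linking use case, here is a minimal Python sketch that joins records from separate sources on a shared ISNI. The source names, record fields, and the ISNI values are placeholders invented for the example, not real data or any provider's actual schema.

```python
from collections import defaultdict


def link_by_isni(sources: dict[str, list[dict]]) -> dict[str, dict[str, dict]]:
    """Group records from several sources by their ISNI.

    `sources` maps a (hypothetical) source name to a list of records,
    each a dict that may carry an "isni" key. Only ISNIs that appear in
    more than one source are returned, since those are the ones that
    create a link between databases.
    """
    linked: dict[str, dict[str, dict]] = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            isni = record.get("isni")
            if isni:
                # Ignore display spacing so "0000 0000 ..." and "00000000..." match.
                linked[isni.replace(" ", "")][source_name] = record
    return {isni: recs for isni, recs in linked.items() if len(recs) > 1}


# Placeholder records; the ISNI strings are made up for illustration.
sources = {
    "catalog_a": [{"isni": "0000 0000 0000 0001", "title": "A Novel"}],
    "catalog_b": [
        {"isni": "0000000000000001", "affiliation": "Example University"},
        {"isni": "0000 0000 0000 0002", "affiliation": "Other Institute"},
    ],
}
links = link_by_isni(sources)
```

In this sketch only the first identifier appears in both catalogs, so it is the only one returned as a link.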
Who is behind ISNI?
There is a consortium of players from a variety of industries contributing to the leadership of ISNI and administration of the ISNI International Agency (ISNI-IA). OCLC is host to the actual data and servers that comprise the ISNI system in conjunction with their role leading the Virtual International Authority File (VIAF), a network of national library authority files from 46 countries. Bowker/ProQuest plays leadership roles in the maintenance activities. Ringgold is a registration agency for institutions. JISC, Wikipedia, the British Library, the Bibliothèque nationale de France, CISAC (International Confederation of Societies of Authors and Composers), and IFRRO (International Federation of Reproduction Rights Organisations) are also involved in registering names to the system. Each is gathering data for ingest from a variety of publishing, research, performing arts, and visual arts communities.
How many ISNIs are registered right now?
The number is quite large. The ISNI system has assigned nearly 7.5 million identifiers. There are an additional 10-plus million that are provisionally assigned, but we are waiting for corroborating evidence to add them to the list of formally assigned IDs. The reason for this is that ISNI is particularly concerned with data quality, and we want to be sure that when an ISNI is assigned, either the link between the name, the metadata, and the assigned ID has been computationally proven or the data has been checked manually by someone.
How is ISNI funded?
The ISNI-IA is funded through generous donations from member organizations and representative Board of Directors members. The fees depend on the size and organizational structure of those members and on the services that they hope to receive from the ISNI system. Registration Agency (RA) fees for deposit and data cleanup are also important elements. Finally, like every organization, ISNI is looking for new members and sponsors of its work.
One of the business cases for ISNI is the use of the identifier to track royalty payments. CISAC is running some royalty systems based on ISNI. Macmillan has joined and is using ISNI in their Digital Science systems as well. Bowker has ingested 2 million ISNIs into Books in Print and will be adding more on an ongoing basis.
What is the relationship between ISNI & ORCID?
The two identifier systems were set up to serve different communities and use cases. ORCID is focused solely on the researcher community, while ISNI’s application is much, much broader.
In January of 2014, ISNI and ORCID signed a formal Memo of Understanding for a strategic partnership. There is now an ISNI2ORCID tool that allows ORCID registrants to import data from their ISNI profile to their ORCID profile. In addition to ISNI and ORCID’s commitment to interoperability, Board members from both organizations are writing a white paper exploring the feasibility of a shared identifier system. This paper is due to be presented to both Boards by June of 2014.
ORCID has decided to use the same numbering system, so that the two identifiers appear similar and, if they were ever to come together, the IDs would match. In addition, the number set that ORCID is using has been reserved by ISNI, again so that the numbers won’t overlap: if someone comes across an ORCID number and enters it into ISNI, it will not return some other person’s data.
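To make the shared numbering system concrete: both identifiers are 16 characters long, and the final character is a check character computed with the ISO 7064 MOD 11-2 algorithm. The following Python sketch is an editorial illustration rather than code from either organization; the function names are made up for the example, and the sample value is the iD ORCID uses in its own documentation.

```python
def mod11_2_check_char(base15: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_identifier(identifier: str) -> bool:
    """Check the format and check character of an ISNI- or ORCID-style ID.

    Accepts the hyphenated ORCID display form and the space-separated
    ISNI display form. This validates structure only; it says nothing
    about whether the identifier has actually been assigned.
    """
    compact = identifier.replace("-", "").replace(" ", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return mod11_2_check_char(compact[:15]) == compact[15]


print(is_valid_identifier("0000-0002-1825-0097"))  # True
```

Because the same check applies to both systems, a single validator like this can handle either identifier; telling them apart relies on the reserved number block described above.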
There are other ways the two systems are working together. For example, ORCID is using ISNI to identify institutions, working with Ringgold, the ISNI registration authority for institutions.
What about this notion that ISNI and ORCID are in some way competitors in the same space?
First, there is no reason for the community to believe that. The two systems are complementary and have been working in tandem on some issues for a while. The metadata systems, and even the underlying philosophies, of the two systems are quite different. ORCID is a system of user-asserted authority, while ISNI is a system of curated authority based on secondary data sources. Now both are beginning to include elements of the other system. For example, ORCID is now accepting registration from universities and publishers. ISNI is also accepting individual registrations for those who would like it. Finally, the scope of the two systems is quite different. ORCID is focused on the scholarly community, while the remit of ISNI is much broader and includes authors, patent holders, composers, institutions, musicians, even fictional characters.
How does one go about getting an ISNI?
Because there are millions of ISNIs already assigned, if someone has already written a book or created some cultural content, it’s likely that they already have one assigned or provisionally assigned. The criteria for ISNI assignment are quite different from those for ORCID. ISNIs are assigned by an institution, such as a library or a publisher. ORCIDs are self-assigned when a user sets up a profile.
Data is ingested into the central ISNI system, and the names in the incoming file are matched against the existing corpus of names. If confidence in a match is low, the record is passed to a team of librarians at the British Library and the Bibliothèque nationale de France (BnF). The biggest difference between the ORCID and ISNI systems is the authoritativeness and the quality control that goes into supporting the system.
How stable is ISNI?
As an international standard, the ISNI system is grounded in an international system of interoperable standards and enjoys global support. Participation and adoption obviously depend on the needs and usefulness of the system. We are beginning to see uptake of ISNI in a variety of industries and registration has been picking up speed.
The market for individual assignment of ISNI will be fairly small. There is a fee associated with registration, mainly to cover data management fees, but we expect few will register on their own. Publishers, and more likely libraries, will include registration as part of their work on author authority file creation. In this way, ISNI’s creation and maintenance is already built into the work that libraries, particularly National Libraries, already have a mandate to do and will continue to invest in as part of their mission.
Can you talk a bit about the institutional ID system that is now part of ISNI?
The metadata around institutions is quite complex and very different from that of individuals. ISNI built upon the work of NISO and its Institutional identifier (I2) initiative several years ago. The ISNI data model was expanded to include the institutional metadata that was identified by I2. Ringgold is the Registration Authority for institutions and maintains the richer metadata around institutions, not just the IDs and core metadata.
So the data held by registration authorities is different than what ISNI maintains?
Yes, the data held by the ISNI central system and what the registration authority maintains are quite different. The model for ISNI is to create a system where each registration authority can support a business model based on the data that they curate, if they need to.
Privacy is a big concern for identifiers and metadata systems. How does ISNI protect identity?
There are a variety of ways that personal identity is protected. First, the central ISNI system holds a very small amount of metadata about the identifier. The core ISNI system only contains 25 data fields, and even these are not all shared publicly. The richer metadata is handled by the various registration authorities and is also held confidentially. Another element is that Date of Birth is not a required data element for ISNI assignment and if it is captured, it is not displayed publicly, only used for disambiguation purposes.
So how does ISNI keep this data private?
While ISNI does make available a public look-up service, the amount of data returned is minimal. It really isn’t meant for public reference, and it is not at all like the profiles that are visible in, for example, ORCID. [NB – from TC: should one choose to make it available, that is possible; it is private by default.] Privacy settings controlling how much data can be gathered or displayed can be set by the individual or the registration agent. The community is relying on the agents to respect the privacy interests of the community, but most agents have chosen to make core metadata private by default. In addition, the ISNI policies have limits about what can be publicly displayed. Finally, if there are any complaints or conflicts, the data is withdrawn from public view.
Thank you for your time and sharing information about this initiative with the Scholarly Kitchen readership.
If anyone would like more information about the ISNI system, please visit the ISNI-International Authority website.