I’m starting to see the end-game of the STM/NISO RA21: Resource Access for the 21st Century project. And, dear reader, I’m a little unsettled by it.
I mean, it sounds innocent enough – even laudable:
RA21’s mission is to align and simplify pathways to subscribed content across participating scientific platforms. RA21 will address the common problems users face when interacting with multiple and varied information protocols.
But, have we really thought through the implications of what is being pursued?
Most librarians are very familiar with how challenging it can be to get to the PDF from a citation. Roger Schonfeld has done great work raising these issues. When I watch this video of his attempt to gain access to an article, I sometimes have the Ride of the Valkyries running along as a soundtrack in my head (start at 6:45 for the specific demonstration).
As a PhD student, I experience my own frustrations with this regularly. Even though I am a librarian who knows how our systems are configured, I can find myself clicking through to dead-ends and frustrated with embargoes, content that is published but not yet processed for the platform via which my library subscribes to it, DOIs that aren’t yet registered, etc. It is particularly egregious when a platform, such as Elsevier’s ScienceDirect in this case, prevents users from using established pathways to library provided access. I find myself spending hours of time to retrieve a minimal number of sources.
Libraries put a great deal of effort into supporting and managing links, openURL resolvers, proxy servers, etc. I am responsible for managing the A-Z and subject database listings on our library website. I know how challenging and time-consuming even that aspect alone can be. So, of course librarians are interested in publisher efforts to “align and simplify pathways to subscribed content across participating scientific platforms.”
But, is the approach chosen by RA21 what we meant?
Publishers, libraries, and consumers have all come to the understanding that authorizing access to content based on IP address no longer works in today’s distributed world. The RA21 project hopes to resolve some of the fundamental issues that create barriers to moving to federated identity in place of IP address authentication.
Authorizing access based on IP address no longer works? Honestly, it seems to work rather well much of the time. Seamless. So seamless our users often don’t realize it is happening! Okay, I’ll grant that off-campus access is where the IP authentication system does often break down if users do not go through the library website and its proxied links. But, is “federated identity” across platforms the solution we want?
I have been encouraging librarians to take a more active interest in RA21 for more than a year. At this point in the conversation many have said to me, “what does that even mean?” Great question!
Federated Identity (and Privacy)
Here’s my understanding. When fully realized, it means that by logging in once, you would be recognized on all participating platforms, which means you could leave a data trail of both who you are and what resources (content and tools) you are using. Yes, that means your data could potentially be aggregated across platforms and combined with other datasets to create a more complete profile of you as a user. It is likely that you are already leaving trails of use data connected to the IP addresses of the devices that you use. With federated identity, the trail is connected to you and to the devices. An analogy is how you can use a Google login to access not only your Gmail but also Dropbox, Asana, etc., and then Google is able to build a profile of you as a user by integrating the data from your activities across platforms and tools.
Such federated tracking is unlikely to be fully developed in the initial RA21 projects and the most pernicious form would require publishers to collaborate in data sharing in ways that they currently are not inclined to do. But, I think there is every reason to anticipate such technologies could be created in a fairly short period of time should those sentiments shift. (For those desiring a more detailed technical explanation of the RA21 projects, I recommend Aaron Tay’s Understanding Federated Identity, RA21 and Other Authentication Methods. The potential for an aggregated data trail is seen most easily in the WAYF Cloud project.)
Reading a little further into the RA21 website, there is a set of guiding principles for the initiative and it is good to see privacy mentioned:
The solution will be consistent with emerging privacy regulations, will avoid requiring researchers to create yet another ID, and will achieve an optimal balance between security and usability.
This is essentially a statement that says the federated identity solutions will follow the privacy laws that they are required to follow. Not too surprising and relatively good news for users in the European Union (EU) and less good news if you are in the United States, though it is likely that platforms would implement the stronger EU requirements uniformly for ease of management and scalability. Nonetheless, these regulations only go so far for user control because RA21 considers the institution, not the individual, to be the owner of the data. It’s not entirely clear to me that the EU regulations will support that interpretation.
I hope that the discussions may shift to the importance of building in mechanisms for user control. According to Todd Carpenter, fellow Scholarly Kitchen Chef and NISO Executive Director, RA21 is exploring adopting the NISO Consensus Principles on Users’ Digital Privacy in Library, Publisher, and Software-Provider Systems. This would be a welcome addition to the RA21 privacy framework.
A side note here: I acknowledge that the SAML approach embraced by RA21 is more privacy-protecting than, for example, adopting a Google or Facebook OpenID option. It is not, however, more privacy-protecting than IP authentication.
Eliminating IP Address Authentication
Many librarians who are concerned about user privacy have said to me that their solution will be to refuse to implement federated identity and insist on staying with IP address authentication. I’m not confident that is going to be an option.
First, campus technology units are likely to prefer federated identity solutions to IP authentication as identity-based solutions offer greater account and network security. These units do not seem to share the library’s commitment to user anonymity and minimal data sharing. In fact, more than one publisher/platform has privately confirmed to me that campus identity systems pass along more user information than they need or would like to receive. I recently watched as a campus technology SAML/Shibboleth system passed a user’s email address, full name, and student/staff status to a vendor in order to allow access to a PDF from off-campus when on-campus access would have been possible based on IP address alone.
Second, publishers and platforms will likely prefer identity-based authentication mechanisms. Again, identity-based systems offer greater account and network security than IP authentication, which is widely seen as a weakness that Sci-Hub exploits in pirating content. Also, users are already regularly tracked by vendors via tools like Google Analytics, regardless of the authentication process. Given the voracious appetite for analytics and metrics in higher education and publishing, combined with concerns about security and license compliance, platforms have every reason to want to move not only off-campus access to federated identity but on-campus access as well. I anticipate that publishers will eventually begin to craft licensing agreements that require identity-based authentication, making explicit that they no longer offer IP authentication.
Ultimately, is the long-term goal of RA21 to eliminate IP authentication altogether? When asked this question directly, the response is consistently about how long it will take and not a denial of the long-term goal. It should probably not surprise us if an initiative that is working to “resolve some of the fundamental issues that create barriers to moving to federated identity in place of IP address authentication” is aligned with a long-term goal to implement federated identity globally.
Impact on Libraries and Publishers
How would the elimination of IP authentication change the marketplace? Smaller publishers may be unable to support identity-based login themselves, potentially driving them to contract their content to a larger platform or to purchase authentication services in some way. This will raise expenses for smaller publishers without returned value and likely drive greater consolidation of scholarly publishing platforms.
Libraries will find themselves no longer able to offer the seamless user experience that IP address authentication provides much of the time. Given the importance of user-centered discovery and delivery, increasing friction in access to resources will be a disappointing step backwards. Libraries will be forced to devote increasing amounts of staff time to training and troubleshooting identity-based accounts and this will be particularly acute if on-campus IP authentication is eliminated.
Strategies and Next Steps for Libraries and Publishers
One might wonder if we are past the tipping point with RA21 and the march toward federated identity through commercial platforms. Is it too late to create a user-centric alternative managed by a trusted third party and developed by libraries? I’d like to think not but I’m worried it might be. Perhaps there are alternative efforts underway of which I’m just not aware?
Nonetheless, it would be a bit unsatisfying to close this blog post without some thoughts about next steps. So let me suggest some strategies that every library can implement locally and immediately:
- Reach out to the campus technology unit that manages identity-based authentication systems (e.g., InCommon or OpenAthens) and engage in an ongoing discussion about privacy, user control, minimal sharing of identifiable data, etc., with the goal of developing local principles to guide data release.
- Watch carefully for licensing terms that dictate user data sharing requirements for access to content and be prepared with responses. If IP authentication is no longer an option, seek to minimize the user data that is demanded in exchange for user access.
- Review library privacy policies to make certain that the library is transparent about what data is being passed to third-party systems and what alternatives users have if they want to try to opt out of data sharing and tracking.
- Regularly use library resources without using IP address authentication to monitor the user experience of identity-based authentication and the messaging from platforms to users. Some librarians who have told me they will refuse to implement federated identity actually work at institutions that have already implemented SAML-based InCommon or OpenAthens for access from vendor sites. In such cases, the librarians had not realized this because they themselves only access library resources on-campus, over VPN, or through the proxy server.
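To make the first strategy a little more concrete, here is a minimal sketch of what a local data-release principle might look like if written down as code. It is not how any particular campus identity system, InCommon, or OpenAthens is actually configured. The attribute names, the vendor identifiers, the secret, and the function names are all hypothetical, chosen only for illustration. The idea is that the vendor receives an entitlement (this person may use licensed library resources) plus an opaque identifier that is different for every vendor, and does not receive the email address, full name, or status.

```python
import hashlib
import hmac

# Hypothetical local policy: the only attribute names a vendor may receive.
MINIMAL_RELEASE = frozenset({"entitlement", "pairwise_id"})


def pairwise_id(user_id: str, vendor_id: str, secret: bytes) -> str:
    """Derive an opaque identifier that is stable for one user at one vendor
    but differs from vendor to vendor, so trails are harder to join up."""
    message = f"{user_id}|{vendor_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def release_attributes(user: dict, vendor_id: str, secret: bytes,
                       allowed=MINIMAL_RELEASE) -> dict:
    """Return only the attributes the local policy allows for this vendor."""
    candidate = dict(user)
    candidate["pairwise_id"] = pairwise_id(user["user_id"], vendor_id, secret)
    return {name: value for name, value in candidate.items() if name in allowed}


if __name__ == "__main__":
    # Illustrative record only; every value here is made up.
    campus_record = {
        "user_id": "jdoe42",
        "email": "jdoe42@example.edu",
        "full_name": "Jordan Doe",
        "affiliation": "staff",
        # Assumption: an entitlement value of the kind used for library access.
        "entitlement": "urn:mace:dir:entitlement:common-lib-terms",
    }
    secret = b"replace-with-a-locally-held-secret"
    print(release_attributes(campus_record, "vendor-a.example.com", secret))
    print(release_attributes(campus_record, "vendor-b.example.com", secret))
```

Run as a script, this prints two small records that share the entitlement but carry different opaque identifiers, and neither contains the email address or name. Whether a given vendor can work with this little information is exactly the kind of question worth raising in the conversation with campus technology.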
As for smaller publishers that may not be following these issues as carefully as some of the industry heavyweights that are staffing the RA21 workgroups and setting policy, they should:
- Investigate how authentication affects use of their content and platform at a tactical level and impacts their business model at a strategic level.
- Recognize that authentication will affect not only discovery and access to content resources but also other parts of the research workflow, where just a few companies are consolidating major new businesses.
- Monitor the overall landscape of available and in-development options from technology platform providers both in the regular course of business and through strategic inquiries during any upcoming RFP processes.
As for me, I’ll be honest. I believe that IP authentication is going away, maybe not immediately but relatively soon. Pragmatically, I’m focusing my efforts on influencing what comes next. I’ve joined the RA21 project and am serving on the privacy and security work group with a view to advocating for user control. To give credit where credit is due, I have found the members of the work group to be very welcoming of my perspective and participation.