In the past year of the pandemic, researchers have increasingly required remote access to the resources of their academic institutions, ranging from library holdings to the data gathered by their research group or in the lab. These changes in access patterns – some of which look like they are here to stay – call for privacy-preserving solutions that give researchers secure and easy off-campus access.

Last year SeamlessAccess™, a joint initiative run by GÉANT, Internet2, NISO and STM, went into beta. In light of the pandemic, that turned out to be very timely – as attested by implementers of the service, who saw increases of 150% to 300% in this type of off-campus use. SeamlessAccess is based on federated identity management (FIM) and uses SAML as the underlying technology (Security Assertion Markup Language, an open standard designed for secure single sign-on). It offers a modern alternative to long-standing but less flexible and somewhat outmoded IP-based access solutions through a privacy-protecting, secure single sign-on service. Previous posts in The Scholarly Kitchen already gave an inside view on the benefits of federated access, shared data on huge growth in federated authentication at the start of the pandemic, and shone a light on the strategic benefits of identity management and federated authentication for scholarly publishers.

Recently, questions have been posed whether FIM and SAML are, in fact, as secure and privacy-safe as often claimed. In response, the project team behind SeamlessAccess explains why the answer is simply “Yes”.

[Image: cybersecurity icon. CC-BY Noun Project Inc. https://thenounproject.com/term/website-protection/1857850]

Federated Identity Management: Keeping your personal data close

Federated identity management provides a privacy-preserving method to confirm a user’s right to access third-party resources without keeping all data in a central database. In the long chain of scholarly communication (from individuals in campus communities, to the local library, to the research groups, to aggregating or sharing platforms, to publisher platforms, and so forth), this can be highly useful. It means that privacy-sensitive data is kept close to the individual concerned and usually by the institute where that individual resides (e.g., their university). In this chain, access entitlements are being checked locally. While this is a good thing from the standpoint of user privacy, it does bring an implementation challenge to ensure that users still have a seamless and easy access experience – which is precisely where initiatives like SeamlessAccess can make the difference (more on that below).

In such a chained system, security depends on the practices and behavior of every entity and service that is part of the chain. For federated identity, these entities are highly varied and sometimes disparate (see this earlier post for a quick primer on federated authentication or the AARC blueprint for full detail). They include the User, the Identity Provider (IdP), the Service Provider (SP) and the governing entity, normally a federation. The User is the student, researcher, or faculty or staff member who requires authentication to access a service or content. The Identity Provider (IdP) is the organization that holds the user’s account information, which may be, for example, the library or the university. The Service Provider (SP) is the entity, such as a content platform, that wants to leverage the IdP’s data about the user to determine which services to provide. Finally, there are the governing entities that set the technical and policy ground rules for how metadata about organizations is shared, for example the home institute, such as the university, or its federation, such as InCommon or eduGAIN.
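
As a rough illustration of how these roles divide the work, the sketch below models an IdP that keeps the account record locally and releases only a minimal set of attributes, and an SP that decides on access from those attributes alone. All class and field names, the example entity ID, and the choice of released attributes are assumptions made for illustration; real deployments exchange signed SAML assertions and follow their federation's attribute release policies.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Record held by the IdP only; never sent to the SP in this sketch."""
    username: str
    full_name: str
    email: str
    affiliation: str            # e.g. "member", "student", "staff"
    entitlements: set = field(default_factory=set)


@dataclass
class Assertion:
    """What the IdP releases: no name, no email, only access-relevant facts."""
    issuer: str                 # entity ID of the IdP (hypothetical value below)
    scoped_affiliation: str
    entitlements: frozenset


class IdentityProvider:
    def __init__(self, entity_id, scope):
        self.entity_id = entity_id
        self.scope = scope
        self._accounts = {}

    def register(self, account):
        self._accounts[account.username] = account

    def assert_for(self, username):
        # Authentication itself (password, MFA) is out of scope for this sketch.
        acct = self._accounts[username]
        return Assertion(
            issuer=self.entity_id,
            scoped_affiliation=f"{acct.affiliation}@{self.scope}",
            entitlements=frozenset(acct.entitlements),
        )


class ServiceProvider:
    def __init__(self, trusted_idps, required_entitlement):
        # In a real federation this trust comes from federation metadata.
        self.trusted_idps = set(trusted_idps)
        self.required_entitlement = required_entitlement

    def grants_access(self, assertion):
        return (
            assertion.issuer in self.trusted_idps
            and self.required_entitlement in assertion.entitlements
        )


# Entitlement value commonly used for licensed library resources (assumed here).
LIBRARY_TERMS = "urn:mace:dir:entitlement:common-lib-terms"

idp = IdentityProvider("https://idp.example.edu/shibboleth", "example.edu")
idp.register(Account("rlee", "R. Lee", "rlee@example.edu", "member", {LIBRARY_TERMS}))
idp.register(Account("guest1", "Guest", "guest@example.org", "affiliate"))

sp = ServiceProvider({idp.entity_id}, LIBRARY_TERMS)
member_assertion = idp.assert_for("rlee")
guest_assertion = idp.assert_for("guest1")

print(sp.grants_access(member_assertion))  # True
print(sp.grants_access(guest_assertion))   # False
```

The point of the sketch is that the SP never sees the `Account` record: the access decision rests on the issuer and the entitlement, while the personal data stays with the institution.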

Much of the governance, standards, and processes in an academic identity federation ecosystem exist to support secure behavior among all the organizations involved. In short: these organizations know how to work together and actively formulate and promote best practices in areas including user privacy and security.

When Resource Access in the 21st Century (RA21), the predecessor to SeamlessAccess, launched, a dedicated Security & Privacy Working Group determined that using SAML-based federated access fulfilled almost all of the requirements around enabling secure, privacy-preserving remote access to scholarly content.

How to Streamline the User Experience

Interestingly, the biggest challenge with federated access is not in security and privacy, but instead involves the user experience with such a SAML-based authentication flow. Initial tests showed that this experience was hard for users to understand. With privacy and security kept in mind, the work has since focused on how to streamline the user experience at the first step of federated authentication.

Over the course of the project, several privacy and security experts evaluated the proposals for different ways to solve this. Their final report is still available. Out of the initial options, the Privacy Preserving Persistent WAYF (P3W; WAYF stands for “Where Are You From”) was selected and further developed into SeamlessAccess. Key in the solution is the IdP discovery mechanism; this is the way the user finds their IdP in order to start the authentication and authorization workflow. This is what the experts concluded about the P3W platform:

“There are no significant risks which prevent the WAYF Cloud and P3W pilots from moving forward. Residual risks from both a security and privacy perspective are LOW.  The nature of the data involved is low value, i.e., not directly or easily attributable to any natural person, and appropriate safeguards are in place to mitigate confidentiality concerns.” — RA21 Security & Privacy Working Group Recommendations: WAYF Cloud and P3W Security & Privacy Recommendations

While SeamlessAccess in itself does not use the SAML protocol, it supports and strengthens federated authentication by improving IdP discovery in two very specific ways. First, it offers a powerful yet simple search service to help a user find their IdP or home institute. (This search service is optional; some integrators may choose to use their own service while still benefiting from the SeamlessAccess UX guidelines.) Second, users only have to do this once: because that choice is saved in the user’s browser, other sites can display a button on their page that already includes the IdP choice, and the user does not need to search again. At no point is any information about the user stored anywhere by SeamlessAccess, not even in its logs.
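
The two discovery steps can be sketched roughly as follows. This is a simplified model, not the SeamlessAccess implementation: the directory entries, the `BrowserStore` class (a stand-in for storage inside the user's own browser), the function names and the button wording are all assumptions made for illustration.

```python
# Hypothetical directory of IdPs; a real search service would draw on
# federation metadata rather than a hard-coded list.
IDP_DIRECTORY = [
    {"entity_id": "https://idp.example.edu/saml", "name": "Example University"},
    {"entity_id": "https://login.example.ac.uk/saml", "name": "Example College London"},
    {"entity_id": "https://sso.sample-institute.org/saml", "name": "Sample Research Institute"},
]


def search_idps(query, directory=IDP_DIRECTORY):
    """Case-insensitive substring search over institution names."""
    needle = query.strip().lower()
    if not needle:
        return []
    return [idp for idp in directory if needle in idp["name"].lower()]


class BrowserStore:
    """Stand-in for storage that lives in the user's own browser."""

    def __init__(self):
        self._data = {}

    def remember_idp(self, idp):
        # Only the institution choice is saved: nothing that identifies the user.
        self._data["last_idp"] = {"entity_id": idp["entity_id"], "name": idp["name"]}

    def last_idp(self):
        return self._data.get("last_idp")


def button_label(store):
    """Text an SP page could show on its access button."""
    choice = store.last_idp()
    if choice is None:
        return "Access through your institution"
    return f"Access through {choice['name']}"


store = BrowserStore()
print(button_label(store))             # first visit: generic button

matches = search_idps("example uni")
store.remember_idp(matches[0])         # user picks their institution once
print(button_label(store))             # later visits, on any participating site
```

Note that the stored value describes an institution, not a person, which is why remembering it does not require keeping any information about the user.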

SAML Attacks and Hacks: is it really safe?

Even though privacy and security were designed into SeamlessAccess from its very first conception, we regularly get asked whether the federated ecosystem itself is secure enough for widespread adoption and use. For example, a recent hacking attempt was successful despite the SAML system in place. This so-called SolarWinds attack, which came to light in December 2020, raised alarms with many who are even remotely familiar with the acronym “SAML”. If hackers were able to break into tens of thousands of systems, how secure can SAML-based academic federations be?

To answer this question, the first thing to realize is that the SolarWinds attack was a highly sophisticated attack that combined what’s known as a supply-chain attack and then a SAML-focused attack. From the perspective of SAML, the real vulnerability was the part of the attack that gained access to the organization’s private key.

Gaining access to such a key and using it to falsely sign assertions is an attack model known as a Golden SAML attack, and that attack model has been recognized since 2017. The private key is one of the single most important pieces in any identity infrastructure, and it is considered best security practice to treat it as such.
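
To see why the private key matters so much, consider the following toy sketch. It uses textbook RSA with deliberately small, publicly known primes and a plain-text "assertion" string, both assumptions chosen so the example runs with the Python standard library alone; real SAML deployments sign XML documents with full-strength keys, and nothing here should be used for actual security. The sketch shows that a signature check only proves possession of the private key, so an assertion forged with a stolen key verifies exactly like a genuine one.

```python
import hashlib

# Toy key pair built from two well-known Mersenne primes. Far too weak for
# real use; it only illustrates the sign/verify relationship.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent: the "key to the castle"


def digest(message: str) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message.encode()).digest(), "big") % N


def sign(message: str, private_exponent: int) -> int:
    return pow(digest(message), private_exponent, N)


def verify(message: str, signature: int) -> bool:
    # The SP only needs the public key (N, E), e.g. from federation metadata.
    return pow(signature, E, N) == digest(message)


genuine = "user=member@example.edu; issued-by=idp.example.edu"
genuine_sig = sign(genuine, D)

# An attacker WITHOUT the key cannot alter the assertion undetected.
tampered = "user=admin@example.edu; issued-by=idp.example.edu"
print(verify(genuine, genuine_sig))    # True
print(verify(tampered, genuine_sig))   # False

# An attacker WITH the stolen private exponent can mint any assertion they like.
stolen_d = D
forged_sig = sign(tampered, stolen_d)
print(verify(tampered, forged_sig))    # True: indistinguishable from a real one
```

In other words, the protocol behaves as designed in both cases; what fails in a Golden SAML attack is the protection of the key, not the signature scheme.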

As a pattern, attacks on keys and passwords are not unique to SAML; the same can be said for the root password for the computers that support Internet-based services and access control for users trying to access specific resources. This even translates to the physical world where physical locks have master keys and key cutting machines whose protection is vital to maintain their security. Similarly, if a root password is compromised, all services that run off that system are at risk. At the end of the day, access to any resource that is password protected critically depends on having the user choose and manage their own passwords carefully.

Federations Promote Key Security

Given the important role of keys and root passwords, best practices and awareness are essential to – quite literally – protect the keys to the castle. This guidance on best practice is where academic federations can add a lot of value: Participants in academic federations are usually acutely aware of the need for key security because they already engage in a one-to-many or many-to-many federation in which a third party, the federation operator, manages participants’ public keys (the other half of the public/private key pairs). These federations also require and propel solid security best practices (see for example the REFEDS guidance on Baseline Expectations and the InCommon federation’s Baseline Expectations).

Finally, we should not forget that passwords and keys are not the only security concern at play. Traditional IP-based access solutions continue to have their own well-known and documented security issues, such as IP hijacking (also known as BGP hijacking) and IP spoofing, which lead many network security specialists and Internet standards authors to advise against the use of IP addresses for authorization purposes. For example, IP hijacking offers a way to route traffic through an unintended third party where attackers can log all access-related traffic to a site, making it a direct threat to both security and user privacy (see the now infamous Sea Turtle attack as an example).
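
A minimal sketch of what an IP-based access check amounts to makes the limitation concrete. The campus range below is a hypothetical one taken from an address block reserved for documentation, and the function name is an assumption for illustration.

```python
import ipaddress

# Hypothetical campus range, taken from the documentation block 192.0.2.0/24.
CAMPUS_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def ip_authorized(source_ip: str) -> bool:
    """IP-based access: the only question asked is where the traffic comes from."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in CAMPUS_NETWORKS)


# Anything that appears to originate from the campus range is let in, whether
# it is a researcher at their desk or traffic arriving via a hijacked route.
print(ip_authorized("192.0.2.25"))     # True

# The same researcher working from home is refused.
print(ip_authorized("198.51.100.7"))   # False
```

The check never asks who the user is, which is why it both admits anyone who can appear to be inside the range and turns away legitimate users who are outside it.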

Federated identity is not immune to security vulnerabilities, but then, no Internet-based service is. Services built on a solid set of security principles are in a good position to claim to be secure. SeamlessAccess itself is designed from the ground up around security, user privacy, and user experience. Our role in the ecosystem allows us to promote best security practices with the goal of reducing the overall risk to the ecosystem. Are there still ways the security of the ecosystem can be compromised? Absolutely, as there are for any digital ecosystem. Is the risk higher when it comes to SeamlessAccess? Not at all.

Todd A Carpenter

Todd Carpenter is Executive Director of the National Information Standards Organization (NISO). He additionally serves in a number of leadership roles of a variety of organizations, including as Chair of the ISO Technical Subcommittee on Identification & Description (ISO TC46/SC9), founding partner of the Coalition for Seamless Access, Past President of FORCE11, Treasurer of the Book Industry Study Group (BISG), and a Director of the Foundation of the Baltimore County Public Library. He also previously served as Treasurer of SSP.

Discussion

6 Thoughts on "Security, Safety, SeamlessAccess"

Thank you for this useful post. As I understand it, the user’s institutional preference (IdP) is stored in their Browser’s local memory as a key/value pair. Do third party applications have unrestricted access to this setting, or is it controlled in some way? I’m not claiming this is a security issue, I just want to understand how the solution works. Thank you. Richard.

You are correct: The user’s institutional preference is stored in the user’s browser local storage as an identifier that directs the button to the identity provider’s login page. More importantly, no, third party applications do not have unrestricted access to this information. Service Providers who wish to use the SeamlessAccess service must register to be authorized to present the SeamlessAccess button and access the local storage. Only those participating service providers that SeamlessAccess has vetted and approved can read or write to the local storage. This exchange is done through a secure Application Programming Interface (API). If you are curious about how this works, we invite you to take a look at our technical documentation in the Getting Started section of the SeamlessAccess site at: https://seamlessaccess.org/work

If I understand correctly, you argue that any security issue that looks related to SAML is not inherent to SAML. But it isn’t clear from this post how SeamlessAccess would be more secure or privacy-protecting than IP-based authentication. Could you explain?

Actually, that’s not quite correct. There are a variety of security challenges in securing any digital system. In the particular case of the SolarWinds attack, the vector was not SAML per se, but the entire authentication system itself and the fundamental administration of that system. If you control the administration, then any layer above that is compromised.

Fundamentally, IP authentication is neither secure as a method of authentication nor privacy-protecting. IP addresses are easily spoofed. If a nefarious actor can acquire, by any means, an IP address in the right range, then the institution is going to provide them access, which sets up a huge cascade of problems. A number of institutions use VPNs to maintain remote access to IP authentication services, but if someone has compromised a user’s credentials to access the network, the first thing they want is unlikely to be the library’s subscribed content; it’s more likely the HR department files or the procurement system data. Using any form of authentication layer that is not network-based makes it harder for a would-be attacker to gain access when they are not authorized.

Every authentication method is breachable, as no security is perfect, including SAML. However, additional security layers can be added (such as two-factor authentication) that can make the system more robust.

Furthermore, on the privacy front, IP addresses are defined by GDPR as personally identifiable information (PII) and as such can be used to track people’s behavior, creating regulatory concerns for service providers. Beyond those noted above, there are many more examples, for both security and privacy, of the benefits of federated identity over IP authentication.

Finally, SeamlessAccess itself does not store any information. It simply supports connecting the user to the point of authentication and simplifies the user experience.

I agree that no computer system is inherently secure. If credentials are stolen or leaked, single-sign-on solutions are not secure anymore. I’m not sure how easily IP addresses can be spoofed on the Internet, because an attacker would probably need to convince many routers that they need to send packets to a different location. Routers are getting better security steadily.

IP addresses are indeed PII. But so are many other things, like email addresses, (user) names and even opaque user identifiers that are often shared from IdPs to SPs. And if SPs use tracking cookies or other technologies that are not strictly necessary for keeping a user session alive, user privacy is gone.
Tracking technology is a Web thing, not related to IP-based or SAML or SeamlessAccess authentication. So the question how SeamlessAccess promotes privacy is not answered.

I can think of a way: IdPs (i.e. institutions) only release a session-limited user identifier (I believe each SSO session needs a user identifier) and an assertion that allows the SP (e.g. content publishers) to check that the user is allowed to access a resource (e.g. read an article). At the same time, SPs must be forbidden to use any tracking technology that allows them to identify the user beyond the session. Is this part of SeamlessAccess?

“SPs must be forbidden to use any tracking technology that allows them to identify the user beyond the session” isn’t part of SeamlessAccess (as far as I know), but it is part of a federation’s policies. For instance, in InCommon’s case, the [Baseline Expectations for Trust in Federation](https://www.incommon.org/federation/baseline-expectations-for-trust-in-federation/) says (at the “Baseline Expectations of Service Providers” header):

* Controls are in place to reasonably secure information and maintain user privacy
* Information received from IdPs is not shared with third parties without permission and is stored only when necessary for SP’s purpose

In as far as that trust between the IdP/SP/Federation parties holds up, the information is secure.

Web tracking technology is, of course, another matter entirely. My interpretation of these baseline expectations is that SAML attributes and web tracking tech don’t mix, but perhaps that is impossible to know unless one is on the inside of the service provider’s systems.
