Editor’s Note: Today’s post is by Brian Lavoie, Senior Research Scientist with OCLC. Brian’s research interests include computational analysis of library data, the evolution of the scholarly record, and the organization of library services and infrastructure.
The scholarly record is evolving to incorporate a widening range of research outputs, with stakeholders, systems, practices, and norms both adapting to and shaping this evolution. Stewardship of research data has received particular attention, evidenced by an ever-thickening network of services, resources, and consensus- or standards-building activities dedicated to making data sets accessible and reusable. One prominent initiative is FAIR: a set of principles that describe how to make data sets Findable, Accessible, Interoperable, and Reusable. It is still early days for FAIR – the principles were introduced in a 2016 article in Scientific Data. The future of FAIR is therefore very much to be determined; however, publishers, funders, researchers, and other stakeholders can draw some helpful lessons from history.
Despite its relatively recent appearance, FAIR has had an impressive impact – for example, the communiqué from the 2016 G20 summit included support for the FAIR principles, while the European Commission has endorsed FAIR as a key element of the European Open Science Cloud.
At this point, the FAIR principles are just that: a set of principles, or perhaps more accurately, a set of aspirations. The FAIR principles are articulated at a very high level: for example, Interoperability requires that “(meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation”; Reusability requires that “(meta)data are richly described with a plurality of accurate and relevant attributes” and that they “meet domain-relevant community standards”. As one might expect, FAIR’s release – and subsequent calls for “FAIR data” – gave rise to a cottage industry of reports and initiatives aimed at unpacking precisely what implementation involves. The European Commission’s report Turning FAIR into Reality is a recent example, and it states the challenges clearly:
Implementing FAIR is a significant undertaking and requires changes in terms of research culture and infrastructure provision. These changes are important in the context of the European Open Science Cloud and the direction for European Commission and Member State policy, but go beyond that: FAIR requires global agreements to ensure the broadest interoperability and reusability of data – beyond disciplinary and geographic boundaries.
To date, there is no such thing as formal FAIR compliance, nor is there a widely accepted view on the practical realities of implementing FAIR. Obstacles abound. For example, different disciplinary interpretations of what FAIR means may emerge. If this occurs, the practical reality of implementing FAIR may “fork” at the discipline level, making compliance, governance, and standards-building around FAIR much more complex – not to mention interoperability, one of the four pillars of FAIR itself. As inter-disciplinary research continues to expand, there will likely be a parallel growth in the use of data sets from multiple disciplines. This will be complicated by different disciplinary “flavors” of FAIR, which would facilitate the flow of data within disciplines, but not necessarily across disciplines.
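The four facets are principles, not a schema, and that is precisely why disciplinary “flavors” can diverge. As a purely illustrative sketch (FAIR mandates no field names, formats, or checks; everything below is hypothetical), here is what a “FAIR-leaning” metadata record and a naive completeness check might look like; a different discipline could satisfy the same principles with entirely different attributes.

```python
# Illustrative only: FAIR prescribes no schema or field names. The record
# below is a hypothetical sketch, with one or two attributes per FAIR facet.
dataset_metadata = {
    # Findable: a globally unique, persistent identifier
    "identifier": "doi:10.5555/example-dataset",
    # Accessible: retrievable via a standardized, open protocol
    "access_url": "https://repository.example.org/datasets/42",
    "access_protocol": "HTTPS",
    # Interoperable: described with a shared, formal vocabulary
    "vocabulary": "schema.org/Dataset",
    # Reusable: a clear license and provenance
    "license": "CC-BY-4.0",
    "provenance": "Collected 2015-2016; see README for methods",
}

def missing_facets(record):
    """Naive check: report which FAIR-leaning attributes are absent.

    Real FAIR assessment is far more nuanced; this only shows how
    high-level principles force concrete (and contestable) choices.
    """
    required = ["identifier", "access_protocol", "vocabulary", "license"]
    return [name for name in required if not record.get(name)]

print(missing_facets(dataset_metadata))                    # []
print(missing_facets({"identifier": "doi:10.5555/x"}))     # ['access_protocol', 'vocabulary', 'license']
```

The point of the sketch is the `required` list: nothing in the FAIR principles tells us which attributes belong in it, so each community would fill it in differently.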
This is not to say there has been no progress toward implementation: for example, GO FAIR, an organization working toward the implementation of FAIR, sponsors a number of “Implementation Networks” that are working on several fronts to move FAIR closer to practical realization. Similarly, reports like Turning FAIR into Reality are sketching the contours of what a FAIR implementation might look like in practice. So ground has been broken, but there is still a long way to go.
Foretelling FAIR’s evolution: the example of OAIS
Where will FAIR end up? What will be its value to research data management (RDM) stakeholders? To see into the future, we might start by looking into the past: in particular, the development of the OAIS reference model.
OAIS is a conceptual view of what an Open Archival Information System – essentially, a digital archive – looks like. It also describes – again, in a highly abstract way – the information packages (content and metadata) that move into, out of, and reside within an OAIS-type system. It is not a blueprint for building a digital archive, and it is agnostic about implementation choices.
As OAIS gained in visibility and influence, the notion of OAIS-compliant digital preservation services entered the scene. It was not long before services began to declare themselves compliant with (or conformant to, or based on) OAIS. While a statement of compliance with OAIS seems quite important, the reality was more nuanced. Since OAIS offers a high-level description of a digital archive, rather than a blueprint for a precisely defined implementation based on a specific bundle of standards, protocols, and best practices, OAIS compliance is necessarily high-level as well. For example, conformance might mean an explicit use of OAIS concepts, terminology, and models in the design of a repository architecture; it could also mean that OAIS concepts are recoverable from the implementation by mapping implemented archival components to their corresponding OAIS concepts.
There have been efforts to codify the high-level OAIS concepts and relationships into formal standards – in regard to the OAIS information model, for example, we have PREMIS (a data dictionary for preservation metadata) and METS (an XML-based encoding format for metadata associated with a digital object). However, use of these standards is not required for OAIS compliance. Like FAIR, the OAIS reference model offers a conceptual view of the problem space at hand, several layers removed from a concrete specification of practical application. There are advantages to this approach – but also costs.
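The mapping exercise described above – recovering OAIS concepts from an implemented repository – can be made concrete with a sketch. The following Python fragment models an OAIS-style Information Package as Content Information plus Preservation Description Information (PDI). OAIS defines these only conceptually, so every name and field here is illustrative, and a real archive might realize the same concepts via METS and PREMIS instead.

```python
import hashlib
from dataclasses import dataclass

# A sketch, not a standard: OAIS defines Information Packages conceptually;
# the fields below loosely echo PDI's categories of Provenance, Context,
# Reference, and Fixity, but none of these names are mandated by OAIS.
@dataclass
class PreservationDescriptionInformation:
    provenance: str   # who created or changed the content, and how
    context: str      # relationship to other archived objects
    reference: str    # persistent identifier(s) for the content
    fixity: str       # e.g. a checksum guarding against silent corruption

@dataclass
class InformationPackage:
    content: bytes
    pdi: PreservationDescriptionInformation

def ingest(content: bytes, provenance: str, identifier: str) -> InformationPackage:
    """Turn a submitted object into a minimal AIP-like package.

    In OAIS terms, an archive ingests a Submission Information Package (SIP),
    stores an Archival Information Package (AIP), and serves Dissemination
    Information Packages (DIPs); this toy function collapses that pipeline
    into a single step for illustration.
    """
    pdi = PreservationDescriptionInformation(
        provenance=provenance,
        context="standalone object (no related packages)",
        reference=identifier,
        fixity=hashlib.sha256(content).hexdigest(),
    )
    return InformationPackage(content=content, pdi=pdi)
```

Two repositories could both claim this kind of OAIS conformance while encoding their packages in entirely different standards, which is exactly the flexibility-versus-interoperability trade-off at issue.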
Jerry McDonough sums up the trade-off nicely:
“The digital library community seems to face a dilemma at this point. Through its pursuit of design goals of flexibility, extensibility, modularity and abstraction, and its promulgation of those goals as common practice through its implementation of XML metadata standards, it has managed to substantially impede progress towards another commonly held goal, interoperability of digital library content across a range of systems.”
In short, highly abstract models like FAIR and OAIS have the virtue of flexibility in implementation, at the cost of consistency and uniformity across implementations. A 2017 study by Dunning et al., which examined the state of “FAIRification” in a number of repositories, suggests that a trade-off of this kind may loom large in FAIR’s future, concluding: “The FAIR principles are not just about compliance. Some of the facets need to be seen as open-ended guidelines that can be interpreted in different ways; and varying interpretations can all be within the spirit of the original guidelines.”
Given my experience following the trajectory of OAIS’s impact in the digital preservation community, I do not see this as a problem.
A few years ago, I partnered with the UK Digital Preservation Coalition to publish a Technology Watch report on OAIS. Here is how I summed up the OAIS legacy, as of 2014:
“Perhaps the most important achievement of the OAIS reference model to date is that it has become almost universally accepted as the lingua franca of digital preservation. The concepts and terminology articulated in the reference model have become a useful shorthand for digital preservation practitioners; a means of shaping and sustaining conversations about digital preservation across disparate domains; and a general mapping of the landscape that stewards of our digital heritage must navigate in order to secure the long-term availability of the digital materials in their care. These three legacies of the model – a language, a shared reference point, a map – did much to consolidate knowledge of the problem space occupied by digital preservation at the time that this issue was attracting the notice of information managers across a wide range of domains.”
What does this suggest for FAIR?
As the example of OAIS shows, we should not underestimate the value of concepts and vocabulary that bind a community around a shared issue. Efforts to transform FAIR into something that is formalized, implementable, and certifiable will continue, and they may be successful. But even if these efforts fall short of being realized, there is still substantial value in convening a diverse set of stakeholders – funders, universities, researchers, etc. – around high-level principles that catalyze productive conversations.
Changing data management practices is just as much about changing mindset and culture as it is about technical solutions – perhaps more. FAIR is a valuable tool for advocacy, in the sense of communicating the high-level goals of open, reusable data. FAIR is a valuable resource for education, by providing a shared framework within which new perspectives on responsible data management can be formed – even if those perspectives are not uniform, or easily operationalized. And FAIR is a valuable marker for how seriously the community is taking up the issue of open data: even if repositories declare their data FAIR without formal compliance or certification protocols, at least they are gesturing to the importance of the issue, and maybe even doing something substantive about it.
So the experience of OAIS tells us we should not place all our emphasis on formal implementation of FAIR as the final yardstick of its value to the community. FAIR can be, and I expect will be, a powerful catalyst in moving the research data community as a whole in the right direction.
That, it seems to me, would be a FAIR-ly good outcome!
Acknowledgements: The ideas in this post were prompted by recent discussion within the OCLC Research Library Partnership (RLP) Research Support Interest Group. The author thanks his colleagues Rebecca Bryant and Merrilee Proffitt for thoughtful comments and suggestions on several earlier drafts.
1 Thought on "Guest Post — The Future of FAIR, as Told by the Past"
Brian, Thanks for sharing this interesting comparison — and it’s good to hear you are bullish on FAIR.
One important consideration for research data is how we define the “community” around which sharing and services should be designed. My Ithaka S+R colleagues Danielle Cooper and Rebecca Springer argued that we need to think in terms of data communities, often at a sub-disciplinary or field-specific level:
Data communities need certain standards of their own, and even so it is enormously difficult to breathe life into them. Here is an example from the emergent Spinal Cord Injury community:
FAIR is compatible with the idea of community “flavors” for data-sharing. Whether working from a publisher, library, or other perspective, those interested in research data sharing, management, discovery, and preservation can benefit from accounting not only for FAIR principles but also data communities in developing data infrastructure and services.