Editor’s note: Today’s post is by Iain Hrynaszkiewicz. Iain is Director, Open Research Solutions at the Public Library of Science (PLOS).
Research suggests that the kinds of “innovation” – and Open Science is still often labeled as such – that stick at large commercial academic publishers tend to achieve at least one of three things: increases in revenue, reductions in costs, or improvements to reputation. Given the challenges of business models for Open Science, and the additional burden that new or experimental policies and solutions can place on workflows, one can see how prioritizing Open Science initiatives could be challenging in the short term, particularly at organizations facing a large backlog. However, Open Science is increasingly correlated with trustworthy and impactful science, underscoring that intelligent openness should, in fact, be seen as a matter of reputation.
But a further challenge remains. If the success of an innovation relates to the practice of Open Science – which at PLOS is about much more than reputation; it’s central to our mission – then what does success look like? And how do you measure it at publisher scale? To make progress towards any goal, good data are needed, including a view of your current state and your desired future state. Unfortunately, as recently as last year, there were no tools or services that could tell us everything we wanted to know, at PLOS, about Open Science practices. The benefits of Open Science – economic, societal, for research impact, and for researcher careers – are often highlighted, and to deliver these long-term benefits, measurably increasing adoption of Open Science practices is a prerequisite goal.
Principles and requirements
This is, in part, why we developed and have recently shared the initial results of our ‘Open Science Indicators’ initiative. We piloted a new way of measuring code sharing in the computational biology community in late 2021, and have since needed to scale – to more research communities and more types of research output. In consultation with our Scientific Advisory Council we defined overarching principles and requirements for a broader set of Open Science Indicators that could be measured across multiple publishers’ content. PLOS then selected DataSeer as a partner to implement a solution following a Request for Proposals (RFP).
The overarching principles we established are, paraphrased here:
- Align with established community definitions or approaches wherever possible
- Measure what practices are being carried out now
- Ensure interoperability across diverse communities
- Be scalable across large volumes of research outputs
- Share results of Open Science Indicators/monitoring activities openly
- Use Open Science Indicators responsibly
What do these principles look like in practice?
- We used the FAIR principles and MDAR framework to help determine what we would like to know about research practices or outputs (data, code, preprints, protocols, etc) – to measure and understand trends.
- While the intention is to identify and celebrate good practices, we have taken an inclusive view of sharing practices that may not be considered FAIR-compliant. This includes measuring outputs shared via Supporting Information – a practice often criticized by open advocates because these files are less findable and accessible than content in dedicated repositories, but one that nevertheless appears to meet many researchers’ needs, at least from their perspective.
- We are capturing information that helps us begin to understand aspects of diversity – geographic and subject area – and information that points to community specific practices, such as infrastructure names and types.
- We’re not the first to try to measure Open Science – there is a growing community of tool developers and meta-researchers who have inspired us – but we might be the first publisher to attempt this at this scale, making natural language processing and artificial intelligence a logical part of the solution.
- The dataset is available publicly. There are numerous ways to analyze and segment the data, and we welcome feedback on the approach we have taken so far, and suggestions for alternative approaches.
- We have chosen not to include Journal or Institution names in the public dataset, as these could potentially help facilitate rankings of these entities, and we have so far reported results at a PLOS-wide aggregate, and comparator cohort, level.
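To illustrate the kind of segmentation the public dataset invites, here is a minimal sketch. The rows, field names, and values below are hypothetical stand-ins, not the dataset’s actual schema; any real analysis would load the published file and adapt to its columns.

```python
from collections import defaultdict

# Hypothetical rows standing in for the public Open Science Indicators
# dataset; the real column names and values may differ.
articles = [
    {"year": 2019, "cohort": "PLOS",       "data_shared": True},
    {"year": 2019, "cohort": "comparator", "data_shared": False},
    {"year": 2021, "cohort": "PLOS",       "data_shared": True},
    {"year": 2021, "cohort": "PLOS",       "data_shared": False},
    {"year": 2021, "cohort": "comparator", "data_shared": True},
]

def sharing_rate_by_year(rows, cohort):
    """Fraction of a cohort's articles per year with the practice detected."""
    totals, shared = defaultdict(int), defaultdict(int)
    for row in rows:
        if row["cohort"] != cohort:
            continue
        totals[row["year"]] += 1
        shared[row["year"]] += row["data_shared"]  # bool counts as 0/1
    return {year: shared[year] / totals[year] for year in totals}

print(sharing_rate_by_year(articles, "PLOS"))  # → {2019: 1.0, 2021: 0.5}
```

The same grouping logic extends naturally to other segments mentioned above, such as geography, subject area, or repository type.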
These principles and requirements have since formed a timely contribution to UNESCO’s Working Group on Open Science Monitoring Frameworks.
Initial results and a call for feedback
The first dataset analyzes research articles published between the start of 2019 and the end of Q2 2022 (~61,000 PLOS research articles). A comparator set of publicly available research articles (~6,000, or about 10% of the PLOS sample) has also been analyzed to put the findings into context. Additional indicators and further analysis of articles published from Q3 2022 onwards will follow in 2023.
Importantly, the approach we are taking can provide information on the prevalence (observed sharing rates) of Open Science practices and the potential for adoption of those practices. For example, while all research articles could potentially share a preprint, only some research articles generate code (in case you’re wondering, we now know that just over half of studies published in PLOS journals generate or use code that could potentially be shared).
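The distinction between prevalence and potential can be made concrete with a small sketch. The counts below are invented for illustration only, not PLOS figures:

```python
def code_sharing_indicators(total_articles, articles_generating_code,
                            articles_sharing_code):
    """Separate how often a practice *could* apply from how often it occurs.

    - potential: share of all articles that generate code at all
    - prevalence: share of code-generating articles that actually share it
    """
    potential = articles_generating_code / total_articles
    prevalence = articles_sharing_code / articles_generating_code
    return potential, prevalence

# Invented example: 1,000 articles, 550 generate code, 110 share it.
potential, prevalence = code_sharing_indicators(1000, 550, 110)
print(f"potential: {potential:.0%}, prevalence: {prevalence:.0%}")
# potential: 55%, prevalence: 20%
```

Reporting prevalence against the eligible denominator, rather than against all articles, avoids understating adoption of practices that only apply to a subset of research.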
The initial results are encouraging in many ways – trends are upwards for PLOS and non-PLOS articles – but our interpretation of them at this point is minimal and cautious. We first have important questions to ask the community about the approach we are taking to this “measurement problem”. Although we’re keen to share our experiences, we don’t claim to have created an Open Science monitoring standard to be readily adopted. There are numerous developments across scholarly publishing that should lead to increased adoption of Open Science practices, and more will undoubtedly be incentivized by the Nelson Memo and the implementation of the UNESCO Recommendation on Open Science. Examples include Springer Nature’s rollout of integrated data sharing services, GigaScience Press’s requirements for protocol, code, and data sharing to support reproducibility, and eLife’s shift towards a preprints-first model. Funders and institutions are also innovating, such as Aligning Science Across Parkinson’s stringent open science requirements, and there are researcher-led (“grass roots”) activities such as the International Reproducibility Networks.
Given this diversity of activity with an apparently common desired outcome, there is a need to start a conversation about how we can make better use of quantitative, longitudinal evidence on whether and how Open Science practices are being adopted. Ideally, this quantitative evidence would be combined with qualitative evidence (such as surveys) to better understand how researchers’ reported attitudes and experiences correlate with detectable practices in publications.
PLOS admittedly has its own motivations for this initiative; the information is essential to support a strategy that aims to increase adoption of Open Science and to better understand the researchers we serve. But we know that reliable information on Open Science practices is valuable to others – funders and institutions, researchers, and publishers. Whether they are seeking a better understanding of their current state, an understanding of the impact of policies, or whether they are strategizing for change, it’s clear we all need better evidence to support these activities. And with better evidence, perhaps we can empower more of us to act to help make intelligent openness the norm.
Thanks to Tim Vines and Veronique Kiermer for their comments on a draft of this blog post.