Editor’s Note: Today’s post is by Krishna Kumar Venkitachalam. Krishna is an orthopedic surgeon by qualification with 15+ years of experience in academic publishing in various roles, including manuscript editing, manuscript writing, editor training, author education, language technology solutions, and innovation. Currently, he works as the Innovation Officer for Trinka/Enago, part of Crimson Interactive. Note: The author discloses a potential conflict of interest due to their professional role in projects relating to the technologies and themes discussed in this article.
Open Access Week 2024 is a good time to reflect on the challenges faced by an important component in the open access ecosystem — preprint servers.
Open Access and Preprints
The open access movement emerged in the early 1990s with the rise of the internet. Online scholarly publishing became a reality, rooted in the vision of providing universal online access to academic works without financial or legal barriers. Interestingly, preprints predate the open access movement. Offline preprints, in the form of physical mailed copies, have existed since at least the 1960s. The establishment of arXiv in 1991 ushered in the online preprint server movement, which created a new and easy way of sharing and discovering pre-peer review records.
Yin and Yang
Preprints neatly align with the principles of open access. They allow researchers the opportunity to share their findings almost instantly, offering a space to accelerate the pace of scientific discovery and dissemination. In principle, preprint servers also give researchers a chance to get quick feedback from the academic community and improve their work before formal peer review. Unfortunately, the scientific community has yet to fully embrace this promise of engagement, with community commentary present on only a minority of submitted preprints.
Preprints serve as early publications that let the academic community engage with research without waiting out the potentially lengthy delay between submission and publication after peer review. However, the absence of vetting raises concerns about flawed papers slipping through the cracks, leaving ample room for poor-quality science to be disseminated and circulated. The consequences of such dissemination are compounded by the fact that preprints are often picked up by news media, which then publicize unvalidated results. This duality highlights the need for a balanced approach that leverages the benefits of preprints while mitigating their inherent risks.
The COVID-19 pandemic is a good example of both the benefits and drawbacks of preprint servers (see also here). Preprint servers helped disseminate the latest findings quickly but also sparked controversy regarding the quality and reliability of the information shared. As researchers rapidly uploaded findings to preprint platforms, some submissions were misinterpreted or misrepresented, fueling misinformation and conspiracy theories; several papers continued to drive widespread discussion on social media even after being withdrawn shortly after posting because of scientific concerns.
New Threats & Solutions
So far, we have assumed that submissions come from honest, well-meaning researchers. However, we now live in a world where paper mills (nefarious for-profit entities that churn out legitimate-looking manuscripts in large volumes) cross ethical, legal, and moral boundaries. These entities can flood the preprint server ecosystem with bogus studies, making it harder to find genuine research.
Initiatives like Plan S and Plan U are shaking up the traditional publishing landscape by advocating for full and immediate open access to scholarly publications funded by public and private grants. Plan U, in particular, promotes preprints as standard practice, which could further swell the volume of submissions that preprint servers must handle.
Publishers and journals are stepping up their game by adopting automated screening and integrity checks that can vet submissions for plagiarism, ethics and integrity issues, language quality, and metadata completeness. This is a significant step forward in maintaining high standards in academic publishing, helping to weed out sub-par submissions before peer review.
Preprint servers also implement screening processes to ensure that submissions are within scope and free from offensive or non-scientific content. The screening typically involves two main steps: an initial review by in-house staff to verify submission details and a subsequent evaluation by volunteer experts who assess the potential public health implications of the research (at least for biomedical preprint servers). While this process is not as rigorous as traditional peer review, it does help to maintain a level of quality and safety in the content shared.
There are many players in the field of automated screening and research integrity checks for publishers and journals, and more are emerging. So why not implement similar checks for preprint servers? These systems could help the staff and volunteer teams screen submissions in a faster, more efficient way, improving the quality of submissions that make it onto preprint platforms. Taking it a step further, preprints that have undergone screening could even be awarded a “green channel” for faster submission to traditional publishers, speeding up the publication process without compromising on quality. This would not only enhance the reliability of preprints but also lighten the load for publishers, allowing them to focus on what sets them apart from preprints: peer review.
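To make the idea concrete, below is a minimal sketch of what such an automated pre-screening pass might look like for a preprint submission. The check functions, metadata fields, and thresholds are illustrative assumptions made up for this post; they are not the API or workflow of any existing screening service or preprint server.

```python
# Hypothetical pre-screening pass for a preprint submission.
# All check functions, fields, and thresholds below are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Submission:
    title: str
    full_text: str
    metadata: dict          # e.g., authors, affiliations, funding, ethics statement

@dataclass
class ScreeningReport:
    similarity_score: float      # 0.0 (original text) to 1.0 (fully duplicated)
    language_quality: float      # 0.0 (unreadable) to 1.0 (publication-ready)
    metadata_complete: bool
    flags: list = field(default_factory=list)

REQUIRED_METADATA = ("authors", "affiliations", "funding", "ethics_statement")

def check_similarity(text: str) -> float:
    """Placeholder: a real system would query a text-similarity or plagiarism index."""
    return 0.0

def check_language_quality(text: str) -> float:
    """Placeholder: a real system would score readability and language quality."""
    return 1.0

def screen(submission: Submission) -> ScreeningReport:
    report = ScreeningReport(
        similarity_score=check_similarity(submission.full_text),
        language_quality=check_language_quality(submission.full_text),
        metadata_complete=all(k in submission.metadata for k in REQUIRED_METADATA),
    )
    # Flag anything the staff and volunteer screeners should look at more closely.
    if report.similarity_score > 0.4:
        report.flags.append("possible text overlap with existing work")
    if report.language_quality < 0.5:
        report.flags.append("language quality below screening threshold")
    if not report.metadata_complete:
        report.flags.append("missing required metadata fields")
    return report

def green_channel_eligible(report: ScreeningReport) -> bool:
    """A submission that raises no flags could be fast-tracked to a journal."""
    return not report.flags
```

In this sketch, a submission that clears every check would qualify for the hypothetical “green channel” described above, while anything flagged would go through the existing staff and volunteer screening steps.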
The Way Forward
To make this vision a reality, industry-wide collaboration is necessary. Researchers, publishers, and policymakers need to work hand in hand to integrate automated screening and integrity checks across all preprint platforms. Funding agencies should back initiatives that develop and fine-tune such tools, ensuring they are both effective and fair. Universities and research institutions also have a role to play by educating their researchers about the importance of maintaining integrity in preprints and encouraging the use of reputable and “verified” preprint servers.
Of course, these checks aren’t a silver bullet. Like any automated system, they may carry biases from their training data, lack applicability across different research fields, and exhibit suboptimal accuracy. To address these issues, a balanced approach that combines automated checks with human oversight is essential: the bulk of the screening workload is handled efficiently by the machines, while nuanced cases receive the attention they deserve from experienced reviewers.
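As a rough illustration of that division of labour, the hypothetical triage rule below lets the automated checks settle clear-cut cases and routes anything borderline, or anything the system is unsure about, to human reviewers. The thresholds, confidence score, and route names are assumptions made up for this example, not a description of any server’s actual process.

```python
# Hypothetical human-in-the-loop triage: automation handles the clear-cut bulk,
# and borderline or low-confidence cases are queued for experienced reviewers.
from enum import Enum

class Route(Enum):
    AUTO_PASS = "post after standard staff check"
    HUMAN_REVIEW = "queue for volunteer expert review"
    AUTO_HOLD = "hold pending author clarification"

def triage(flag_count: int, model_confidence: float) -> Route:
    """
    flag_count: number of integrity flags raised by the automated checks
    model_confidence: how certain the automated system is of its own verdict (0-1)
    """
    # Low-confidence verdicts always go to a human, regardless of flag count.
    if model_confidence < 0.7:
        return Route.HUMAN_REVIEW
    if flag_count == 0:
        return Route.AUTO_PASS
    if flag_count >= 3:
        return Route.AUTO_HOLD
    # A small number of flags, even with high confidence, still merits a human look.
    return Route.HUMAN_REVIEW

# Example: two flags with a confident verdict are still sent to a reviewer.
print(triage(flag_count=2, model_confidence=0.9))  # Route.HUMAN_REVIEW
```

The point of the thresholds is not their specific values but the shape of the workflow: automation narrows the queue, and humans keep the final say over anything ambiguous.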
Implementing automated checks also requires substantial financial and technical resources. The financial sustainability of preprint servers is already precarious, as many rely on temporary funding from philanthropic organizations and commercial partnerships, and adding these checks together with human oversight increases the operational burden on servers, forcing a balance between ensuring quality and maintaining rapid research dissemination. Potential remedies include endowment funding from research funders and state support for sustaining open research infrastructure. Smaller preprint servers may struggle even more with the costs, potentially leading to unequal standards across platforms; preprint platforms could collaborate to share technical expertise and resources, reducing the burden on individual servers.
The Gates Foundation has taken the first steps toward this future with the launch of Verixiv, its new preprint server, which runs submissions through a lengthy list of pre-publication checks. For now, however, it accepts only research funded by the foundation. It remains to be seen whether such initiatives can be scaled up for broader coverage and whether the necessary funding to do so will be available.
Despite these challenges, the integration of automated screening and integrity checks is likely the only scalable long-term solution. Ideally, these measures will help preprint servers uphold higher standards of research quality and integrity, and as the technology behind the checks advances, the screening workload for server operators should also diminish.
Conclusion
To conclude, as preprints become an increasingly integral part of scholarly communication, it is imperative to bolster their screening processes with automation. Embracing such measures will ensure that the quest for open access and rapid dissemination does not come at the expense of integrity and trust. With these improvements, preprint servers can continue to thrive and serve their purpose without succumbing to a deluge of content from bad actors.
Discussion
While there’s no doubt that these screening tools offer some defence against flawed or fraudulent research, they’re ultimately treating the symptoms rather than addressing the root cause of the problem.
The rise of preprint servers and initiatives like Plan S and Plan U are changing the dynamics of scholarly communication. The traditional reliance on publishers as gatekeepers of quality is being bypassed, arguably shifting responsibility towards individual researchers and the institutions that support them.
Simply investing in more sophisticated screening tools for preprint servers is like putting a band-aid on this systemic issue. What’s needed is a more profound change in the roles, responsibilities, and measures of success for all stakeholders in the research ecosystem:
1. Redefining Institutional Responsibility:
Mentorship and Training: Invest in robust training programs that equip researchers with the skills and knowledge to successfully navigate open science, including data management, ethical considerations, and public communication.
Rethinking Evaluation: Move away from solely relying on publication metrics and impact factors. Develop evaluation frameworks that recognize and reward open science practices, such as data sharing, code availability, and public engagement.
Incentivize Open Science: Develop clear guidelines and incentives for researchers to publish in Plan P compliant journals, thereby supporting the broader open access movement.
2. Empowering Funders:
Mandate Open Science Practices: Funding agencies should make adherence to open science principles a mandatory requirement for grant recipients.
Protocols: Require or incentivize the publication of research protocols as part of the grant application or reporting process.
3. Cultivating Researcher Responsibility:
Embrace Transparency: Researchers must be proactive in sharing their data, methods, and code, allowing for greater scrutiny and reproducibility.
Engage with the Community: Actively participate in peer review and community-based feedback mechanisms to improve the quality of research outputs. A fraction of researchers take on the bulk of this critical work, work that expands a researcher’s knowledge, improves their skills, and helps to uphold the quality of research.
Uphold Ethical Standards: Adhere to the highest ethical standards in research conduct and dissemination, ensuring data integrity and responsible reporting.
The transition to open science demands a collective effort. By rethinking roles and responsibilities, and by embracing a culture of transparency and accountability, we can ensure that the pursuit of open access leads to more robust, reliable, and impactful research, and perhaps to less (or even no) need for these tools in the future.
I will make one comment. I am a reviewer for several journals. Most research papers are worthless. Ones that challenge the scientific centre of gravity of a belief are usually rejected, so science cannot progress. This is particularly true in my area, radiation and health. What I do is let everything through unless it has serious problems, and usually even then. Readers can see for themselves what is good and what is not.