Editor’s Note: Today’s post is by Joris van Rossum. Joris is Product Director of the STM Integrity Hub and Program Director of STM Solutions.

It has been two and a half years since STM launched the STM Integrity Hub and two years since the Scholarly Kitchen reported on the initiative. A lot has happened since then, so we felt the time was right for an update. Here we hope to answer the questions we’ve heard and to let the community know how the initiative has progressed.


Has the situation around research integrity changed in the last two years?

The pressure is definitely increasing. Last year saw a record number of retractions, and publishers continue to receive submissions from paper mills. Generative AI compounds those problems by making research misconduct easier through the fabrication of data, images, and text. What is particularly concerning is that such fabricated content is largely undetectable with current tools, which rely on spotting duplication or manipulation of existing content (think of plagiarism detection, or identifying image manipulation and duplication). Another change is that the general public has become increasingly aware of these challenges: many outlets, including leading newspapers and magazines, have covered the problem of paper mills and other forms of misconduct in the last two years.

Are there also positive signs?

Certainly! New initiatives and services are emerging, there is a greater willingness to collaborate, and data and information are increasingly being shared. Those trends are reflected in the STM Integrity Hub: we now have more than 35 supporting organizations (including all the major submission systems) and over 100 people participating in working groups, task forces, and governance bodies, making it a truly community-driven initiative. We are also very happy with the increased collaboration with research integrity specialists, including some of the well-known sleuths in the community.

In addition to providing a platform for discussion and knowledge-sharing, the STM Integrity Hub develops and delivers technology to screen submitted manuscripts for research integrity concerns. Use of those services is growing rapidly: we currently screen more than 20K submissions per month for various signals, including duplicate submissions across journals, publishers, and submission systems. That enables us to spot patterns at an unprecedented scale and to learn quickly.

The previous Scholarly Kitchen post discussed the initial launch of a “demonstrator”. How has development progressed since then?

We started with two applications: The “Paper Mill Checker Tool”, which lets integrity staff upload selected manuscripts to be checked for a variety of signals; and the “Duplicate Submission Checker Tool”, which supports the automatic checking of duplicate submissions across journals, publishers, and editorial systems. Note that duplicate submissions are not only a challenge for editors and reviewers, but are also a strong indication of paper mills at work.

The Paper Mill Checker Tool was the first service that we made available to publishers. Combining internal and external tools such as Clear Skies’ Papermill Alarm and the PubPeer database, it immediately drew a lot of interest. Based on the feedback that we’ve received from users, we have extended its scope to include additional tools and to detect a wider array of integrity issues: Currently, more than 10 signals are offered to publishers.

The Duplicate Submission Checker Tool started in the form of a pilot in October of last year. It is a unique feature of the Hub that combines data across journals, publishers, and editorial systems. Adoption has grown very rapidly, and we have now connected 12 publishers with more than 150 journal titles. Several publishers have integrated their entire portfolio with this application.

The results so far are very interesting: with a throughput of 20K manuscripts per month, the detection rate for duplicate submissions is over 1%, and that number is expected to grow as more publishers and journals come on board. The Duplicate Submission Checker currently checks for duplicates at the level of metadata, but we are moving to full text this year, making use of a technology developed by one of the participating publishers. The success of this application, including its automated content feeds from editorial systems, led to the decision to develop it into an automated background screening service that checks submitted content for a variety of signals, depending on the publisher's choice. For example, publishers can screen incoming submissions for duplications, check whether references appear in the Retraction Watch database, and run the content through the Papermill Alarm tool, all in one go.
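As an illustration only (the Hub's actual matching logic is not public), metadata-level duplicate detection can be sketched as a similarity comparison over normalized titles. Every function name, field name, and threshold below is a hypothetical choice for the sketch, not part of the Hub:

```python
# Illustrative sketch, NOT the Hub's implementation: flag likely duplicate
# submissions across journals by comparing normalized title metadata.
import re


def normalize_title(title: str) -> set:
    """Lowercase the title, strip punctuation, and return its token set."""
    return set(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def find_duplicates(submissions, threshold=0.8):
    """Flag cross-journal pairs whose title similarity meets the threshold."""
    flagged = []
    for i in range(len(submissions)):
        for j in range(i + 1, len(submissions)):
            s, t = submissions[i], submissions[j]
            sim = jaccard(normalize_title(s["title"]),
                          normalize_title(t["title"]))
            if sim >= threshold and s["journal"] != t["journal"]:
                flagged.append((s["id"], t["id"], round(sim, 2)))
    return flagged


# Hypothetical reference corpus of manuscripts under editorial review.
corpus = [
    {"id": "A1", "journal": "J. Example A",
     "title": "Deep learning for tumor detection in MRI"},
    {"id": "B7", "journal": "J. Example B",
     "title": "Deep Learning for Tumor Detection in MRI."},
    {"id": "C3", "journal": "J. Example C",
     "title": "A survey of citation networks"},
]
print(find_duplicates(corpus))  # flags the A1/B7 pair as a cross-journal duplicate
```

A real service would of course also weigh author lists, abstracts, and (as described above) eventually full text, and would use scalable indexing rather than pairwise comparison; the sketch only conveys the basic idea of matching on normalized metadata.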

Besides further developing these applications, we are also increasingly working on integrating the various signals in the workflow of integrity managers and editors, which becomes pivotal given the scale at which we now operate!

Many more initiatives have launched since you started. Start-ups like Clear Skies and technology providers such as Morressier are focusing specifically on research integrity, and Wiley recently announced its plans for a paper mill detection service. How is the Hub positioned in this space, and how do you collaborate with others?

You can think about the STM Integrity Hub in three ways. First, as a trusted infrastructure that integrates data flows from different journals across publishers and editorial systems, and connects those to various screening tools. Second, as a trusted environment where parties can collaboratively develop unique tools based on shared signals, for example the Duplicate Submissions signal we spoke about above, or a Watch List of fake domains and email addresses that is crowd-sourced by integrity specialists.

Third, think of the STM Integrity Hub as a true ‘hub’ that connects third-party solutions for assessing submitted manuscripts from a broad array of publishers through common data pipelines. For example, as mentioned above, both PubPeer and Clear Skies have been integrated with the STM Integrity Hub since last year. We feel that this third role is critically important because it enables us to tap into the innovation power and creativity of an entire sector; clearly, none of us will be able to solve the challenges that we are facing alone! To bolster this aspect, we have initiated a pilot program that will allow us to quickly and consistently test new research integrity screening tools. With this pilot program, we aim to unlock creativity in the sector by providing solution providers with a pathway to quickly evaluate new concepts and connect with a large number of prospective customers. At the same time, the program provides publishers with a central, trusted space to evaluate potential new solutions in a standardized way. Our first pilot partner is Cactus Communications, and we are integrating some signals from their Paperpal Preflight Research Integrity suite into the STM Integrity Hub. We welcome other initiatives to contact us to discuss integrations.

These three facets enable the Hub to play a unique role in a vibrant ecosystem alongside other solutions and initiatives. For example, publishers including Elsevier, Frontiers, Springer Nature, and Wiley are active participants in the Hub while also developing their own in-house solutions. These efforts work well alongside each other; in fact, several publishers are making their in-house solutions available to other publishers through the Hub.

“Publishers need to work collaboratively to act on a strategy of prevention and identification, keeping fraudulent research out of the scholarly record. The Duplicate Submission Tool is therefore central to uniting Wiley’s prevention efforts with those of other publishers. Through STM, publishers have worked together to share what we have learned with one another. We are now strengthening that commitment through this integration and financial support for these technologies.”

Liz Ferguson, Senior Vice President – Research Publishing, Wiley

“A huge amount of time and resources have been put into this initiative by many partners over the past two years and we are delighted to see new organisations joining the work. The importance of collaboration in addressing the challenges of papermills cannot be overstated. The Hub — as well as STM/COPE’s United2Act — are showing us what winning by working together looks like. We certainly won’t win this battle in isolation.”

Chris Graf, Director of Research Integrity, Springer Nature, and Chair of the STM Integrity Hub Governance Board.

How is the Hub financed?

The STM Integrity Hub is financially supported through contributions from STM and several of its members. We are working to introduce a fee-based sustainability model from 2025 onwards to recover costs.

What do you hope to be telling us two years from now?

First of all, that many more journals, publishers, submission systems, external solutions, and tools are part of the Hub! More importantly, we hope that two years from now we will have demonstrated the effectiveness of these capabilities through publisher-provided data showing how the Hub is helping them stop issues ‘at the gate’, and that these capabilities will help shift the balance away from retracted publications and toward problematic submissions caught in advance. And eventually, of course, we aim to reduce the number of submitted manuscripts with research integrity concerns. With that, we hope to make a meaningful contribution toward increasing trust in publishing and, ultimately, in the scholarly ecosystem that plays such an important role in today’s society.

Joris van Rossum

Joris van Rossum is Program Director of STM Solutions. Joris leads various programs and projects within STM Solutions, with a special focus on the STM Integrity Hub and AI. Before joining STM, Joris worked for several leading companies and organizations across the STM publishing space, including Digital Science and Elsevier, where he initiated and led a variety of important innovations and cross-publisher initiatives within the research and publishing ecosystem. Joris holds a master’s degree in biology and a PhD in philosophy.


4 Thoughts on "Guest Post: The STM Integrity Hub — Connecting the Dots in a Dynamic Landscape"

Dear Joris and rest of the STM Hub team,

Happy to help you and the rest of the industry connect those dots around Research Integrity, fighting for quality of research output, rather than quantity.

On behalf of all my colleagues at Cactus, we are proud to be one of the first partners joining the pilot program. Stay tuned, as there are more checks and developments to come! ;-)

A good start for the STM Integrity Hub. However, I now see that the new “Duplicate Submission Checker Tool” may have been used by a journal I submitted to. I will not go into details. I will merely state that when there is an editorial invitation to revise and submit a paper that has been reviewed negatively, the editor should request the submitting author to formally and immediately state whether he/she wishes to take advantage of that offer (within the time-period that the editor specifies). Failing that immediate formal response from the author, the editor should no longer consider the paper as submitted to that journal and so inform the author. In this instance, there would be no charge of double-publishing if the author subsequently submitted to a second journal (within the time-period the editor of the first journal had specified for resubmission).

There are many other obvious corrections required for the DSCT to operate effectively. One of these is that, should there be an offenders list, those whose reputations might suffer should be appropriately informed and given an opportunity to disagree.

Thank you for your comment. To be sure, the check for duplicate submissions happens only between manuscripts that are under editorial review. As soon as an editorial decision has been made (e.g., rejected or accepted), the manuscript is removed from the reference corpus that newly submitted manuscripts are matched against.

Thank you. It is good that publication norms that have generally been followed are now being more formally set out. These norms include the avoidance of double-publishing. In my experience, the norms also include the general understandings that both reviewing and revisions after review will each take around 2-3 months. If those both fail, an author then submits elsewhere.

What I am saying is that an author’s intent to revise and resubmit to the journal of his/her first choice should be formally agreed between the parties at the end of the initial review period. If the author agrees, the manuscript will remain in the reference corpus. If the author does not agree, the manuscript will be immediately removed from the reference corpus, so allowing immediate submission elsewhere.

Thus, to avoid matching a new submission with those in the reference corpus, an author has to know that a manuscript previously submitted to a journal of first choice is no longer on that journal’s reviewing list. Editors of first-choice journals should be aware of the possibility that the tone of initial reviews may so annoy some authors that they disconnect from the journal, without being aware that the editor assumes a revision is under preparation. This is a recipe for unwarranted damage to reputation (placement on an official or unofficial offenders list) of which an author may be unaware.
