The misbehavior of authors — one of the most intractable problems in scientific and scholarly publishing — reared its ugly head again last week, as SAGE revealed that it was retracting 60 papers after it detected a possible peer-review and citation ring built by leveraging false identities inserted into a reviewer database.
This is not the biggest retraction case in history. That record of 183 retractions belongs to Yoshitaka Fujii. The main tool used by these fraudsters — identity fraud — is also not new, having been used by Hyung-in Moon on his way to 60+ retractions.
The ring struck the aptly named Journal of Vibration and Control, which is devoted to studying perturbations and how to control them.
The statement by SAGE has a surreal aspect to it:
While investigating the JVC papers submitted and reviewed by Peter Chen, it was discovered that the author had created various aliases on SAGE Track, providing different email addresses to set up more than one account. Consequently, SAGE scrutinised further the co-authors of and reviewers selected for Peter Chen’s papers, these names appeared to form part of a peer review ring. The investigation also revealed that on at least one occasion, the author Peter Chen reviewed his own paper under one of the aliases he had created.
As many as 130 fake email accounts may have been involved, according to our friends at Retraction Watch, who have monitored the story closely.
This news started some emails flying around, with people wondering how or what could prevent things like this in the future. Could something validating peer review, like the new PRE-val system my company is launching, have helped? Could ORCID, which is built to disambiguate authors in all their roles, including editing and reviewing? Could CrossCheck, which is intended to flag large passages of text published previously?
It turns out that none of these would have much effect on this particular situation. Falsifying an ORCID is probably about as easy as falsifying any other form of online identity, especially with so many unclaimed ORCIDs in existence — once all legitimate authors are using them and all relevant papers are claimed, the system would be nearly bulletproof, but we are not there yet. Pretending to be an obscure author or three and assigning illicitly obtained ORCIDs to fake email addresses doesn’t seem all that difficult. Judging from the lengthy dispute procedures outlined on the ORCID site, it would take a while to clear up a discrepancy. I asked Howard Ratner about this, and he confirmed that until ORCID is more widely adopted, it doesn’t have enough data to provide a trusted barrier against exploitation by a devoted fraudster.
PRE-val would have scored these articles by trusting that the peer-review information was accurate and honestly derived. CrossCheck detects plagiarism, not identity fraud.
The fact is that we trust our authors and reviewers, and there is no system currently available to stop things like this from occurring, beyond the risk of being caught, publicly humiliated, banished from academia, and so forth.
How high is the risk of ostracism even if an academic or researcher is caught cheating? Not that high, according to a study published last year, where the authors found that:
[w]rongdoing in research is relatively common with nearly all research-intensive institutions confronting cases over the past 2 years. Only 13% of respondents indicated that a case involved termination, despite the fact that more than 50% of the cases reported by RIOs [research integrity officers] involved FFP [fabrication, falsification, plagiarism]. This means that most investigators who engage in wrongdoing, even serious wrongdoing, continue to conduct research at their institutions.
Another problem with enforcement is the inability to make institutions where fraud has occurred return grants and other funding secured in the process. This is all nicely spelled out in a New York Times op-ed by Adam Marcus and Ivan Oransky.
In the SAGE incident, the individual at the center of the ring has resigned from his post, and the editor of the Journal of Vibration and Control has also resigned. That’s somewhat reassuring.
On one hand, SAGE is to be commended for investigating this and pursuing the trail vigorously. On the other, prevention would have been better, and there is definitely going to be some damage to the journal and publisher brands.
The difficulty of detecting situations like this leads one to wonder whether this is the tip of an iceberg. After all, this is not completely unfamiliar author behavior, as noted above. In other instances of academic misbehavior, authors have added senior authors to papers without their permission and seen those papers published, editors have self-edited papers to publication in their own journals, and authors have published plagiarized papers.
It’s easy to blame authors, editors, and publishers for this, but as is so often the case, the incentives (their type, strength, and positioning) are also an issue. Yet, whenever one of these situations arises, we rarely question the system itself — one that rewards publication to such a degree that people we’d expect to be able to trust due to their educational attainments and professional affiliations still feel the need to cheat, commit fraud, and exploit the trust economy of scholarly publishing.
While retractions account for only 0.04% of papers published each year, the public perception issues and damage to the record are disproportionate, real, and regrettable. If we consider retractions in our industry to be comparable to the worst kinds of mistakes in other industries, we see that 0.04% is a high rate. For instance, if the airline industry had a 0.04% error rate, the United States alone would experience more than 100 plane crashes per month.
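As a back-of-the-envelope check on that airline analogy: the flight count used below is an assumption of mine, not a figure from the post, but any plausible value makes the same point.

```python
# Rough check of the airline analogy.
# Assumption (not from the post): roughly 25,000 commercial flights per day in the US.
flights_per_day = 25_000
error_rate = 0.0004  # 0.04% expressed as a fraction

crashes_per_day = flights_per_day * error_rate   # ~10 per day
crashes_per_month = crashes_per_day * 30         # ~300 per month

print(f"{crashes_per_day:.0f} crashes/day, {crashes_per_month:.0f} crashes/month")
```

Even with a much lower assumed flight count, a 0.04% failure rate would still mean well over 100 crashes a month.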
Despite initiatives that place other goals first, I believe our main challenge is how to maintain quality, trust, and integrity within the scientific record. Are we clever enough to answer this challenge?
Discussion
21 Thoughts on "Trust But Verify — Identity Fraud and Exploitation of the Trust Economy in Scholarly Publishing"
It should be possible to develop algorithms to help detect this sort of thing. For example, it is my understanding that non-institutional email accounts played a major role. Detecting clusters or patterns of these sounds doable, fun even (see the sketch below).
Beyond that, let’s not go overboard. Any reward system involving millions of people will experience some crime. It is wasteful to try to eliminate all of it. There is no evidence of widespread fraud in peer review. The system works well in that regard.
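Here is a rough sketch of the kind of pattern check being suggested; the free-webmail domain list, the threshold, and the data shape are illustrative assumptions, not features of any actual manuscript system.

```python
from collections import Counter

# Illustrative list of free webmail domains; a real check would use a maintained list.
FREE_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "163.com", "qq.com"}

def flag_submission(reviewer_emails, threshold=0.5):
    """Flag a submission when most of its suggested reviewers use free webmail.

    reviewer_emails -- email addresses of the suggested reviewers for one submission
    threshold       -- fraction of free-webmail addresses that triggers a flag
    """
    domains = [addr.split("@")[-1].lower() for addr in reviewer_emails]
    free = sum(d in FREE_DOMAINS for d in domains)
    repeated = [d for d, n in Counter(domains).items() if n > 1]
    flagged = bool(domains) and free / len(domains) >= threshold
    return {"free_fraction": free / len(domains) if domains else 0.0,
            "repeated_domains": repeated,
            "flag": flagged}

# Example: three suggested reviewers, all on free accounts, gets flagged for a human to check.
print(flag_submission(["a.chen@gmail.com", "b.lee@163.com", "c.wu@gmail.com"]))
```

The point is not that free email accounts are suspicious in themselves (as a reply below notes, many legitimate editors and reviewers rely on Gmail), only that a cluster of them around one author's submissions is worth a human look.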
“Widespread” is not the issue. Persistent, repeated, and embarrassing is the issue. This kind of thing has a chilling effect.
I do tend to agree that the manuscript tracking system vendors might hold a key part of the answer here.
There may also be procedural deterrents at the editorial level, but one must be careful not to overdo it. Too often minor crime leads to major burden, which is truly inefficient. Procedure design is a research issue, not something to be just thrown together. It is better if the computer can do it rather than the editors.
I don’t know how to stop it. This entire system of scholarly publishing is built on trust. We trust that the authors are who they say they are and we trust that they did the work presented. We trust that the reviewers and editors are going to read and critique the papers and not share them, use them, delay them, or decline them for their own gain. We trust that conflicts of interest of all parties will be disclosed. We trust that the publishers AND the research community are teaching all these parties what is acceptable and what is not acceptable in scholarly publishing.
We can build tools like CrossCheck to help detect plagiarism and dual publication. We have tools for reviewing figures to make sure they have not been manipulated. We have policies and public ways to disclose wrongdoing when it is discovered. This is all very reactive.
The part that I am having trouble with is that there is enormous pressure to speed the whole process up. A comment on Retraction Watch asked how it is possible for a “good” paper to be accepted and posted online in 17 days. How is that possible? That is the new requirement! And yet we add CrossCheck, duplicate submission checks, and MORE editors, reviewers, etc., all adding time to the entire process. Reviewer time is being shortened. From what I can tell, many journals are seeing more and more papers submitted each year. All of this strains the system and leaves it ripe for manipulation.
To David Wojick’s response: emails from the commercial submission systems often have trouble getting through institutional firewalls. Many reviewers and authors use Gmail accounts. In fact, most of our editors do, because it helps keep the journal work out of their institutional email inboxes, which are already inundated. Further, in some fields it is common to have practitioners or retired professors as reviewers, and they don’t have institutional email, and certainly not an ORCID.
I did not say it was simple, which is what makes it fun. Breaking the code on patterns of human behavior is often a great little research project; I call it “meet the beast.” Clearly Gmail alone is not sufficient to ring the alarm bell, but it may be an important component of the fraud pattern.
I would be very surprised if there were no discoverable pattern.
This is a problem that we have been careful to address in our procedures. The root of the problem as I understand it in this case (and in many of the others) is two-fold: the willingness of journals to use author-suggested reviewers and the failure to do the leg-work required to verify the identity of the owner of the email address that the editor is using. If someone is able to build enough of a fraudulent presence in a system, as appears the case here, then the authors may not have even needed to suggest the fake reviewers as time went on. After all, I’m sure those “reviewers” did everything required to remain in excellent standing with the editors.
In Rubriq we do not accept any suggestions of potential reviewers from the authors. All of the reviewers that we use are selected by our team with no outside influence. This not only helps address some of the issues of fraudulent reviewers, but also minimizes the chances that the manuscript is being reviewed by a real reviewer with a relationship with the author(s) that may lead them to be less than objective.
In addition, reviewers who apply to Rubriq are required to verify their identity either by corresponding with us from an official institutional email address or by sending us a scanned copy of an employee ID card. Once the reviewer’s identity is verified, we are comfortable using a personal email account (Gmail, etc.), but not until that point.
Both of these precautions come with an associated cost, due to the increased amount of time they require per manuscript and per reviewer for onboarding, so I’m not sure whether they can be widely adopted in the industry. However, these are steps that we felt were important to establishing a trustworthy peer-review platform now and in the future.
Some degree of openness around peer review would help. The year-end publication of reviewer names by many journals is both an acknowledgement and a simple act of transparency (though it could itself be falsified). At the other end of the scale is open publication of peer-review history, as in BMJ Open and elsewhere.
In terms of ORCID, the value in this context will be in ORCID iDs which have institutionally-validated affiliation information, rather than in the mere existence of an ORCID identifier.
I obviously have a horse in this race, but I do think initiatives such as PRE-val and ORCID and others will help. We might not be able to “catch” those who game the systems in this way, but we certainly can make it harder for them while at the same time supporting those who are committed to ethical approaches. The SAGE journal at the center of the latest scam is not a member of COPE. PRE-val would have shown that. I’m still trying to find out if the journal was a member of other organizations and used other tools which reflect best practices in peer review that we are encouraging. Yes, ORCIDs can be “faked” as well, but it would’ve been one more step the author(s) in question here would have had to take to implement their “game” and one more tool the journal could have used to screen the suggested reviewers.
While it is true that there is a low barrier to registering for an ORCID identifier, it is also the case that the registry allows for linkages with an individual’s educational and employment organizations (also using an identifier), and that these linkages can be validated by said organization. Some organizations, including Texas A&M, are requiring their graduate students to obtain an ORCID at the time of thesis submission, so that there is a validated and machine-readable link between the student (ORCID), their thesis (DOI), and their degree-granting organization (Ringgold/ISNI). Others, like Karolinska Institutet, are adding a validation during the ORCID creation and/or staff on-boarding process. This information can be accessed using ORCID APIs during the manuscript submission and review process. To ensure trust, it is critical that workflows are designed to (a) capture the ORCID identifier using authentication processes so that authors and reviewers can approve the information exchange, (b) leverage links to persistent identifiers and source information stored in the author’s ORCID record, and (c) send the ORCID-DOI links on to CrossRef when a manuscript is published. Howard Ratner is right that to have full effect, ORCID needs to be integrated into more systems by publishers, universities, and funders. These things take time, but it is worth the investment to assist authors and publishers in both assuring credit and establishing trustworthiness.
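For what it’s worth, here is a rough sketch of how a submission system might ask whether an ORCID record carries any employment affiliation asserted by a party other than the record holder (a crude proxy for a validated affiliation). It uses the public ORCID API; the JSON field names reflect my best understanding of the v3.0 response and should be checked against the current documentation.

```python
import requests

def has_org_asserted_employment(orcid_id):
    """Return True if any employment on the record appears to have been added
    by a source other than the researcher themselves (e.g. a member institution).

    Field names are my understanding of the public ORCID API v3.0 response and
    should be verified against the current documentation.
    """
    resp = requests.get(
        f"https://pub.orcid.org/v3.0/{orcid_id}/employments",
        headers={"Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()

    for group in data.get("affiliation-group", []):
        for summary in group.get("summaries", []):
            emp = summary.get("employment-summary", {}) or {}
            source = emp.get("source", {}) or {}
            source_orcid = (source.get("source-orcid") or {}).get("path")
            # If the asserting source is not the record holder, some other party
            # (typically an institution or an integration) added the affiliation.
            if source_orcid != orcid_id:
                return True
    return False

# Example call with a hypothetical iD:
# print(has_org_asserted_employment("0000-0002-1825-0097"))
```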
One solution here is to take a few simple steps before sending a review request.
First, editors should never send review requests – these must be sent by someone who can see the prospective reviewer’s full history with the journal (this also stops people getting multiple requests in a short space of time, or when they’re already reviewing for you).
Second, before a review request is sent, the editorial offices should a) check for duplicate accounts for the reviewer within their system, and b) google the reviewer to check for any relationship with the authors (same institution, recent publications together etc). Check (a) stops the Hyung-In Moon ruse of making fake accounts for established people, and (b) flags up fake reviewers, as making 130 realistic researcher profiles is prohibitively hard. New reviewers and author suggestions should be triple checked to make sure they’re real and on the level.
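A sketch of what check (a) might look like in practice, assuming the editorial office can export reviewer accounts as (name, email) pairs; the matching rule is deliberately simple and only surfaces candidates for a human to review.

```python
from collections import defaultdict

def name_key(name):
    """Reduce a name to (surname, first initial), e.g. 'P. Chen' -> ('chen', 'p')."""
    parts = name.replace(".", " ").split()
    return (parts[-1].lower(), parts[0][0].lower()) if parts else ("", "")

def possible_duplicates(accounts):
    """Group reviewer accounts that reduce to the same (surname, first initial)
    or that share an email local part, as candidates for manual de-duplication.

    accounts -- list of (name, email) tuples exported from the submission system
    """
    by_name = defaultdict(list)
    by_local = defaultdict(list)
    for name, email in accounts:
        by_name[name_key(name)].append((name, email))
        by_local[email.split("@")[0].lower()].append((name, email))
    groups = [g for g in by_name.values() if len(g) > 1]
    groups += [g for g in by_local.values() if len(g) > 1]
    return groups

# Example: the same person registered under an institutional and a webmail address.
accounts = [
    ("Peter Chen", "p.chen@example.edu"),
    ("P. Chen", "peterchen.reviews@gmail.com"),
    ("Maria Lopez", "m.lopez@example.org"),
]
# The first two accounts reduce to ('chen', 'p') and are flagged for a human to review.
print(possible_duplicates(accounts))
```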
Of course, that doesn’t stop every determined fraudster, but hopefully the level of effort required to pervert peer review would be an order of magnitude larger.
Wouldn’t the idea of doing a Google search on any reviewer you’ve not used before/don’t know personally be a big deterrent here? It’s one thing to make up a fake name and email address, it’s quite another to create a convincing online identity and history. The same search would give you someone’s real email address in case you’ve been given a phony one as well.
These are NOT simple steps. Duplicate accounts are the bane of every submission-site administrator’s existence. For many journal editorial offices, taking these steps would require more staff or more time. Neither is an available option.
I’ve got to disagree here. These are simple steps, and they’re a core part of the editorial office function. Once you’ve got a good system in place, they add 2-3 minutes per review request. By contrast, retracting 60 papers is really time consuming, and has the side effect of completely destroying the journal’s reputation and value.
What other industry would neglect to check that a key supplier was actually a real entity?
I tend to agree with Angela. In fact, most of the suggested procedures offered above increase the burden far out of proportion to the problem being solved. I suspect the 2-3 minute estimate is quite low, but even if it is accurate, multiplied across 2 million papers it amounts to a lot of time. Of course, if there are other benefits, then that is a different story, but this is how isolated instances can create large systemic burdens. It is a common source of over-regulation.
As Angela says, the existence of duplicate accounts is the bane of editorial offices, and these have to be continually removed. You get into the most trouble when you have multiple accounts for a reviewer (e.g., you send them multiple requests, use outdated emails, etc.), so it makes sense to eliminate them at the review request stage. Checking unknown reviewers is not a major addition, particularly if you restrict it to cases where the reviewer created their own account (rather than the editor).
Beyond the good practice aspects, I think you’re also massively underestimating the damage that’s done to journal publishing when peer review fraud is allowed to go unchecked. Journal offices have to go that little bit further, as otherwise the apparent value of what we’re doing will be undermined by case after case of lapses in due diligence.
How many cases of peer review fraud were there in 2013? How much damage did they do?
That’s part of the problem. We don’t know. However, not knowing is a fool’s paradise, and if the public continues to absorb the inherent message of scandals like this — that scientists cheat, that editors and publishers can’t detect it, and so forth — the damage could be substantial, both to the dissemination of reliable information that is done well (the vast majority) and to the scientific enterprise as a whole.
There is no reason to think that what we’ve detected is all there is. There is no reason to think that in 2016 we won’t be retracting papers published in 2013, after another sort of malfeasance comes to light.
The point is that “we don’t know” is not a good reason to take the kind of actions being proposed here. Nor is some vague concept of damaging public opinion. What public are we talking about, and what difference does it make? Most people have no concept of a scientific journal.
Sorry but I come from a world where proposals for sweeping new rules and procedures are carefully analyzed.
There are no ‘sweeping new rules’ – these checks are basic good practice and have been recognised as such for decades. It’s up to individual journals to decide whether they take precautions to weed out fake reviewers and other types of fraud, and good luck to them if they decide it’s not worth the effort.
Interesting footnote on this scandal…