Editor’s Note: Today’s post is by Leslie D. McIntosh. Leslie is the founder and CEO of Ripeta, a company formed to improve scientific research quality and reproducibility. The company leads efforts in automating quality checks of research manuscripts and is a recipient of funding from Digital Science. She served as the executive director for the Research Data Alliance (RDA) – US region and as the Director of the Center for Biomedical Informatics at Washington University School of Medicine in St. Louis. Over the past several years, Leslie has dedicated her work to improving science. Since 2014, this has focused first on highlighting the need for reproducible science, then on transparently reporting science, and now on the need to build trust in science. She holds a Master’s and a PhD in Public Health with concentrations in Biostatistics and Epidemiology from Saint Louis University and a Certificate from the Women’s Leadership Forum at Washington University’s Olin Business School.
An assumption at the heart of the scientific publication process is that manuscripts are authored by credentialed scientists. But what if that is not the case?
An ‘author’ refers to ‘the creator or originator of an idea’ which ‘conveys significant privileges, responsibilities, and legal rights’ (see COPE’s “What Constitutes Authorship?”). Some journal and society guidelines explicitly prohibit the use of fictitious author names in publications, yet challenges in the integrity of authorship continue to surface – through paper mills, fake peer-reviewers, and non-existent co-authors within published papers. Although we are in need of aggregated trend data on the extent of such cases, we now have evidence of what I call imposters and impersonators in preprint authorship.
Imposters, by definition, are not who they say they are; they are a fictitious persona, not based on a known person. An impersonator, on the other hand, takes the identity of another and uses it as their own. Both operate within the realm of science yet produce different consequences. An imposter produces scientific-looking work ostensibly to have an idea treated as trusted research. An impersonator also does this but under the mask of a known, credentialed individual. Both actors may contaminate scientific processes, discussions, and outcomes, but the impersonator also potentially damages the reputation of a verified scientist.
The prevalence of fictitious authorship across preprints is still unknown, and the writers’ motivations are opaque in most cases. This nefarious behavior within the open science arena raises many questions in need of discussion.
As many in the scholarly community know, multiple platforms offer mechanisms to quickly disseminate preprints as part of open science research. Some services primarily host research manuscripts, while others support more diverse scholarly work (e.g., images, data files). They all vary in their quality checks.
Indicators of Trust
Whether reading a preprint or a published article, many practicing scientists have internal checks for trusting and questioning research publications. They gauge their trust in the work through various high-level checks, such as viewing and verifying the author’s institutional affiliation, checking the citations, and investigating the author’s other published works. These checks inform the level of trust placed in the science, which is then scrutinized further.
Trust is manifested not only through the presence or absence of individual identifiers (e.g., ORCID IDs, DOIs) but also in the practice of sharing science. Because preprint and repository integration within the scientific ecosystem is so new, there are no established guidelines limiting authors to one platform per paper. Yet the practice of posting to multiple platforms — when used repeatedly — appears to break the social contract of trust in open science. Placing your working material somewhere public before you publish offers transparency; duplicating work conjures up suspicion.
Thousands of manuscripts were posted on preprint platforms in 2020, with many papers gaining significant media attention (Kwon, 2020). We have seen the good that comes from rapid scientific communication, through faster vaccine development, as well as the ills of overhyped and rushed science (think hydroxychloroquine) (Gautret et al., 2020). With this much attention on preprints, it is logical to assume that preprints are by and for scientists and that the work can be trusted. However, that is not the case. There are imposters and impersonators on these platforms submitting work that mimics scientific research.
Curious Cases of Imposters and Impersonators in Covid Preprints
Let us walk through one example to illustrate how open science practices have been manipulated through fake authorship. At the time of writing this post, Kira Smith presents as an accomplished researcher with a medical degree and, according to multiple author profiles across platforms (e.g., figshare, Academia.edu, and OSF), expertise in over 50 areas, including medical virology, aerospace engineering, signal processing, and biochemistry.
Smith has nine unique ‘research’ papers across at least eight platforms, for a total of 28 papers available, all published in 2020 (more case study detail can be found here). All but one of Smith’s uploaded works on ORCID pertain to COVID; the one other item is a table describing opioids. The publications are on multiple established generalist repositories that support preprints, including figshare, Authorea, and SSRN. There is at least one ‘published’ paper with the suspected predatory publisher Openventio.
The same paper may be distributed across multiple platforms, but no single platform hosts all of the papers. A few of the papers are translated copies of one another in two languages and are counted as the same paper. Each paper has scant citations, and the citations that are present are a mix of published scientific work, both within and outside of the COVID literature, and non-scientific websites.
As with Smith, a number of authors have become ‘experts’ in COVID with prolific publications. They have uploaded numerous articles to one or more preprint repositories, all within a 12-month period, and many have populated ORCID profiles. None of these authors can be verified as having existed as scientists before last year.
Publishing only on COVID during 2020 might be expected; however, having no other publications and then writing multiple manuscripts within one year on one topic seems odd, even for a new researcher. These authors have few or no publications on any topic prior to COVID; they have no publications in peer-reviewed journals; and they appear to be independent researchers. However, they leverage a few open science practices and identifiers that act as proxies for trust in open science: (1) posting preliminary work publicly on known platforms that mint article DOIs, and (2) populating a unique author identifier profile through ORCID. Neither posting to a platform nor obtaining an ORCID ID requires any verification; one needs only an email address.
Trust in Open Science
Known challenges within peer-reviewed research include fake peer reviewers, paper mills, and falsified institutional affiliations (see Bachelet et al., 2019; Bik, 2019). Other ethical issues plague research science as well, such as author attribution and duplicate publication submissions. While COPE offers guidelines and advice for tackling these latter issues, COPE Council members confirmed in correspondence that they have not yet discussed fictitious authorship in preprints.
Because of these challenges of authorship, we began automating author checks at Ripeta as part of our quality checks on research manuscripts. We found that for preprints, in particular, we need to check whether the author is a scientist. This can be a slippery slope, because it requires defining ‘scientist’ and the specific qualifications needed for publication. We need to explore what trust in an author looks like. What credentials should or do researchers need to have? ORCID IDs, academic institutions, established companies, previous peer-reviewed publications? Each comes with benefits and drawbacks as a credential for a scientist.
Since beginning this investigation, Kira Smith and others have updated their profiles on multiple platforms, multiple times, and have even been cited in published literature. While I can guess at some of the motivations for creating and sharing the papers, these would only be guesses. It is hard to explain the actions, but it is not hard to say that none of the articles should be trusted. Yet, the questionable preprints have been downloaded hundreds to thousands of times.
This brings me to some questions — How do we define trust in open science? What should be done about the gaming of the scientific ecosystem? What are the expected norms of behavior in open science? What is the protocol for reporting and removing work and profiles from open science sites? And, who should make those decisions?
These questions offer ample fodder to begin discussing checks and balances. Now is the time to build a coalition – to foster credibility and integrity in the open science ecosystem. Disparate perspectives will provide a balance of solutions; stakeholders should include those supporting repositories and preprint services; publishers, institutions, and funders; and ethicists and supporters of publishing ethics. Moreover, this discussion must inclusively encompass global drivers of open science.
Through this alliance within open science, our community can build and check the trust in science and scientists.