While the use of preprints (public posting of an early draft of a paper before it’s submitted to a journal for formal review) has long been established in fields like physics and the social sciences, recent uptake in the biomedical world has raised some concerns. When clinical treatment and public health are involved, extra care must be taken to ensure that it is clear to the reader that the work being described has not been peer reviewed. Most preprint servers handle this well, watermarking their preprints and clearly labeling them as preliminary. But little thought seems to have been given to how we cite preprints. Should we treat them the same way that we treat reviewed and published material?
Where you stand on this largely depends on the purpose that you think reference lists in papers are supposed to serve. If you see them as providing empirical support for any statements made in the paper, then the inclusion of preprints in citations likely worries you. An author could make a dubious claim in a preprint that sees no editorial oversight or review, and then cite that claim as an accepted belief in the field in a subsequent published paper. If you see reference lists as a set of links providing further information, then inclusion of non-peer-reviewed material isn’t a big deal: caveat lector.
A journal I work with was recently publicly criticized because they asked an author to remove a preprint from the list of references on an accepted paper. Their reference policy is a traditional one, espoused by many other journals — anything that goes into the reference list must have been peer reviewed. Anything that has not been peer reviewed is treated as a “personal communication” and can be referred to in the paper, but is noted as such. I’ve often heard preprints compared to the equivalent of giving a talk about unpublished work at a meeting, so there is some logic in treating them both the same way when referring to them in a published work.
I asked Richard Sever from bioRxiv about this, and while he conceded that this might have been the “traditional” policy, he argued that it has largely become outdated:
I’m not sure that anyone really adheres to that view because for decades people have been happy to include theses, books, editorials, and, more recently, websites, data, and code in those Reference lists, none of which are peer reviewed. I think people sometimes think that but only because they forget all the above that they routinely include. Personal communications are different. There is nothing to actually point to there so it doesn’t make sense to include them in a reference list.
One benefit of citing preprints is that many (often most, depending on the field) are eventually formally published in a journal. Linking to the preprint version, assuming it is on a reputable preprint server, will bring the reader to an updated version that prominently displays a link to the peer reviewed, published version.
The tide seems to be flowing toward more inclusion in reference lists. As the way we communicate research results continues to evolve, it makes sense that our policies toward those communication channels should continue to evolve as well. But as with all things in the scholarly communications sphere, we should strive for clarity and transparency. If we are going to include different types of materials in reference lists, then we should make clear to the reader what they represent. We need a set of easily recognized standards for how this is done.
The National Institutes of Health (NIH), in their document discussing the use of preprints to report interim research results from grants, proposes the following form:
To cite the product, applicants and awardees must include the Digital Object Identifier and the Object type (e.g. preprint, protocol) in the citation. Also list any information about the document version (e.g. most recent date modified), and if relevant, the date the product was cited.
Example: Bar DZ, Atkatsh K, Tavarez U, Erdos MR, Gruenbaum Y, Collins FS. Biotinylation by antibody recognition- A novel method for proximity labeling. BioRxiv 069187 [Preprint]. August 11, 2016 [cited 2017 Jan 12]. Available from: https://doi.org/10.1101/069187.
These requirements help reviewers understand that the product is public, interim, and identify the specific version is being referenced.
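To make the NIH form concrete, here is a minimal Python sketch that assembles a citation in the pattern of the example above. The `Preprint` class, its field names, and the `format_nih_style` function are my own illustrative assumptions, not part of any NIH tool or specification; the only thing taken from the NIH text is the order of the elements and the bracketed object type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Preprint:
    # All field names are hypothetical, chosen for this illustration only.
    authors: List[str]
    title: str
    server: str                   # e.g. "BioRxiv"
    identifier: str               # server-assigned ID, e.g. "069187"
    posted: str                   # version date, already formatted for print
    doi: str                      # bare DOI, without the https://doi.org/ prefix
    object_type: str = "Preprint"
    cited: Optional[str] = None   # date cited, if relevant


def format_nih_style(p: Preprint) -> str:
    """Build a citation string that follows the NIH example's ordering."""
    date_part = p.posted
    if p.cited:
        date_part += f" [cited {p.cited}]"
    parts = [
        ", ".join(p.authors) + ".",
        p.title.rstrip(".") + ".",
        f"{p.server} {p.identifier} [{p.object_type}].",
        date_part + ".",
        f"Available from: https://doi.org/{p.doi}.",
    ]
    return " ".join(parts)


example = Preprint(
    authors=["Bar DZ", "Atkatsh K", "Tavarez U", "Erdos MR",
             "Gruenbaum Y", "Collins FS"],
    title="Biotinylation by antibody recognition- A novel method "
          "for proximity labeling",
    server="BioRxiv",
    identifier="069187",
    posted="August 11, 2016",
    doi="10.1101/069187",
    cited="2017 Jan 12",
)
print(format_nih_style(example))
```

Because the object type is an ordinary field, the same function would label a protocol or any other interim product simply by changing `object_type`.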
Cold Spring Harbor Laboratory Press, the publisher behind bioRxiv, agrees with this methodology, and includes the label “PREPRINT” after the server name in citations in their journals. There’s a flexibility in this approach, where any new type of object could be clearly labeled in the reference to alert the reader (e.g., WEBSITE; BLOG COMMENT; etc.). One concern, though, is that the reader doesn’t get this information unless they dig down to the reference. The casual reader may continue to skim along, and assume that the cited claim carries as much weight as any other reference in the paper. One way to resolve this would be to include the descriptor in the reference callout in the text (e.g., Smith et al., 2018 PREPRINT).
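As a rough illustration of the labeled callout idea, the Python sketch below appends a descriptor to an in-text citation whenever the cited object is something other than a reviewed article. The type keys (`"preprint"`, `"website"`, `"blog-comment"`, `"journal-article"`) and the `callout` function are hypothetical names invented for this example; only the descriptors themselves come from the discussion above.

```python
# Hypothetical mapping from an object type to the descriptor shown to readers.
# Types missing from this table get no descriptor.
DESCRIPTORS = {
    "preprint": "PREPRINT",
    "website": "WEBSITE",
    "blog-comment": "BLOG COMMENT",
}


def callout(authors: str, year: int, object_type: str = "journal-article") -> str:
    """Return an author-year callout, labeled when the source has a descriptor."""
    descriptor = DESCRIPTORS.get(object_type)
    if descriptor:
        return f"({authors}, {year} {descriptor})"
    return f"({authors}, {year})"


print(callout("Smith et al.", 2018, "preprint"))
print(callout("Smith et al.", 2018))
```

The design choice here is that the label travels with the callout rather than living only in the reference list, so a skimming reader sees it at the point where the claim is made.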
Other suggestions include putting non-peer-reviewed material into a separate reference list. This would create greater editorial overhead, though, as one would need to carefully delineate which references go where, and some, such as a book where one doesn’t know the review history, would remain ambiguous. One could also use the distinct DOI category assigned to preprints to automatically display the reference differently (different colors for different sources?), but again, this may be difficult to standardize (and would disappear entirely when someone prints out a copy of the PDF on a black and white printer; yes, people still do this).
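The overhead of a split reference list is easier to see in a small sketch. The Python below sorts references into separate lists by object type, under the loud assumption that each reference already carries a reliable type label; the type names, the two category sets, and the `split_references` function are all hypothetical. Anything whose review history cannot be inferred from its type, such as a book, lands in a third pile that an editor would have to resolve by hand.

```python
from typing import Dict, List, Tuple

# Assumed categories, for illustration only; a real policy would need to
# decide where theses, data, code, and editorials belong.
PEER_REVIEWED = {"journal-article"}
NOT_PEER_REVIEWED = {"preprint", "website"}


def split_references(refs: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (label, object_type) pairs into three reference lists."""
    lists: Dict[str, List[str]] = {
        "reviewed": [],
        "not_reviewed": [],
        "needs_editor": [],   # review history unknown from the type alone
    }
    for label, object_type in refs:
        if object_type in PEER_REVIEWED:
            lists["reviewed"].append(label)
        elif object_type in NOT_PEER_REVIEWED:
            lists["not_reviewed"].append(label)
        else:
            lists["needs_editor"].append(label)
    return lists


refs = [
    ("Smith et al., 2018", "preprint"),
    ("Jones, 2015", "journal-article"),
    ("Lee, 2010", "book"),
]
print(split_references(refs))
```

Even in this toy version, the third list does not go away, which is the editorial cost described above.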
No matter the final chosen standard, this means extra work and extra expense for journals. We know we won’t be able to rely on authors to implement these changes, so that means careful review of manuscript reference lists to flag and characterize any suspect citations, or at least the expense of building automated tools to do so, and the time and effort needed to run them. As Tim Vines pointed out recently, advocates for change and new ways of publishing have a habit of unintentionally creating these sorts of editorial headaches.
Regardless, the use of preprints continues to roll forward and, other than wreaking havoc in financial markets, so far seems to present benefits that outweigh the concerns raised. More work needs to be done, though, and we need a broadly accepted standard for how preprints are cited and how their status is made clear to the reader. I’m told that COPE is working on a set of “best practices” for preprints which should be out soon and will help push the conversation forward. The role of publishers is not to try to prevent change, but to adapt to change and find ways to preserve quality, transparency, and trustworthiness of the scholarly literature, and preprint citation is an opportunity for us to provide responsible stewardship.