The four years of the Trump administration have been painful — indeed, traumatic — for a great many people, for a great many reasons. One source of distress has been the administration’s unprecedented assault not only on the truth itself, but also on the idea that truth matters more than political expediency. This has created an unusual challenge for journalists, who have always had to deal with politicians whose relationship with factuality is, shall we say, complicated, but who have never encountered an administration that misrepresents facts and actively advances falsehoods so constantly, so brazenly, and so reflexively.

We have to acknowledge, of course, that undermining the authority of “facts” and “objective truth” isn’t a phenomenon that originated with the Trump administration. The propositions that “reality” is whatever we all agree it is, that there is no such thing as historical fact, that “objectivity” is merely a pretense used by the powerful to defend their interests, and that the putative search for “truth” is really just a tool of oppression have been significant currents of postmodern and critical academic discourse and teaching for several decades. (Go back a bit further, of course, and you have Foucault asserting that “reason is the ultimate language of madness”; earlier than that, there’s Nietzsche: “The real truth about ‘objective truth’ is that objective truth is a myth”.) As President Trump has constantly attempted to twist or reverse the truth to fit his agenda, it’s been interesting to hear voices from quarters that once characterized objective fact as a myth and reality as a social construct now calling us urgently to stand up against Trumpism’s offenses against objective fact and reality. (To be very clear, none of this is to say that, as some have argued, postmodernism itself is to blame for Trumpism — though the mental image of the President consulting a volume of Derrida or Irigaray while composing his counterfactual tweets is kind of fun.)

Be all that as it may, for the purposes of this post let’s take it as given that there is such a thing as objective truth, and that it matters what the truth is. Furthermore, let’s stipulate that factual claims can generally be confirmed or debunked by appeal to empirical evidence, and that it therefore matters whether evidence supports the claim that America’s voting machines were infected with algorithms created at the behest of the late Hugo Chavez, or the assertion that Republican observers were barred from vote-tallying facilities, or the claim that Hillary Clinton’s campaign manager ran a child sex-trafficking ring out of a pizza parlor. If we can all agree, for the sake of argument, that there is such a thing as objective truth, that it matters, and that it can generally be established by appeals to evidence, then we can proceed with the questions I’d like to address in this post.

The questions are: what is the media’s responsibility to sort truth from error in public discourse, and what does this have to do with preprint servers?

Question 1: What Is the Media’s Responsibility to Sort Truth from Error in Public Discourse?

One thing the past four years have shown us is that if the media simply report things that political leaders say, and leave it at that, they may be doing only part of their job. When the statements of those leaders are expressions of debatable opinion or are factual statements that have some reasonable correspondence to the truth, reporting them without editorial comment has been the traditional journalistic approach, and is arguably the right one; it seems entirely appropriate for journalists, in their role as seekers-out and reporters of facts, to avoid putting their thumbs on the scale of genuine public debate. But what about when a powerful leader is saying things that are patently and dangerously false? President Trump, his proxies, and his formal spokespeople have created this dilemma to an unprecedented degree, and after a certain amount of understandable thrashing around and hand-wringing, the mainstream media have eventually settled on the strategy of characterizing his most blatantly false claims as just that: think about how many times you’ve read or heard sentences in news outlets over the past few years with qualifications like “the president claimed, falsely” or “the president asserted, without evidence.” It seems now to be broadly accepted that even when what’s being reported is not the presumptive truth of the statement but merely the fact that a public figure made the statement, there are circumstances in which the dangerous falsity of the statement itself really does need to be flagged.

But for those of us who accept that premise, a genuinely difficult question arises: whom should we trust as public arbiters of what is and isn’t patently and dangerously false? Once we clear the way for reporters to characterize blatant falsehoods as such, who will draw the line between blatant and dangerous falsehoods and assertions with which the reporter simply strongly disagrees?

That troubling question notwithstanding — and recognizing that not everyone will agree on where such lines should be drawn — it does seem to me that the line currently drawn by the mainstream news media represents a pretty reasonable distinction between what can be reported without comment and what needs to be flagged as a clear and potentially dangerous falsehood.

So why are we discussing this in The Scholarly Kitchen? That brings us to the second question:

Question 2: What Does This Have to Do with Preprint Servers?

Preprint servers, to which scholars and scientists can post preliminary reports of their research for public comment before submitting them for formal publication, aren’t intended to fill the same function as journalistic venues. While they’re open to the public, submissions to preprint servers are presented not as established science for public consumption, but rather as tentative findings for open discussion, mainly among other experts in the field.

Except when they aren’t.

A growing problem in the scholarly and scientific community is a population of opportunists who try to use preprint servers as a place to post crackpot pseudo-science and misleading public health information, all under the flag of scholarly “publishing.” They submit articles to preprint servers in the hope of publicizing them, counting on both an uninformed public and a too-credulous press to treat the reports as if they were vetted and peer-reviewed science published in a venue that is willing to accept responsibility for them. Just as predatory publishers have recognized in the APC funding model an opportunity to lie and make money, mendacious authors have recognized in the preprint-dissemination model an opportunity to lie and achieve political goals or professional advancement.

Here we see a direct connection to the journalistic issues raised above. Since the difference between publication in a peer-reviewed journal and “publication” in bioRxiv or medRxiv isn’t immediately obvious to non-specialists, journalists are a prime (and intentional) target for what amount to political scams: unscrupulous scholars and scientists (or people posing as such) posting papers to preprint servers and then touting them as having been “published.” Journalists who may or may not know better then report on these studies as if they represented vetted science.

How bad is this problem? A recent search of newspapers of record like the New York Times and the Washington Post suggests that these generally do a good job of identifying posted preprints as such, and making it clear that what they’re citing are unvetted scientific claims. I found that the Fox News website does this less well; articles with references to bioRxiv and arXiv often include qualifiers like “awaiting peer review,” but are just as likely to say things like “published in bioRxiv” or (worse) “published in the preprint journal arXiv.”

A more troubling set of data points suggests a larger and deeper problem, though: over the course of several recent posts in his newsletter The Geyser*, Kent Anderson has provided compelling evidence that white nationalists are disproportionately using unvetted preprints to promote pseudo-scientific racism; that alt-Right (and former Trump administration) figure Steve Bannon used CERN’s open-science platform Zenodo to amplify Dr. Li-Meng Yan’s dangerous conspiracy theory about COVID-19; and that other shady figures on the alt-Right have been taking significant advantage of the low barriers to “publication” that are a defining feature of preprint servers, and making disproportionate use of those venues to seed the public conversation with false and misleading claims designed specifically to push hateful and divisive narratives under the guise of “science.” The data and patterns he describes are startling and, in my view, worth serious consideration.

What can be done? I would suggest that just as the mainstream media (and, more reluctantly, social media platforms like Twitter and Facebook) have gradually come to the conclusion that they have a responsibility to flag obvious and potentially dangerous falsehoods as such when they appear on their platforms, the time has come for those who manage preprint servers to take a firmer hand in vetting the claims that are posted there and to consider retracting preprints when the public good requires it — recognizing that while the purpose of a preprint server is not primarily to serve as a dissemination or “publishing” platform, what affects the public welfare is not whether it’s intended to be used that way, but whether it is in fact used that way. In this context, it’s worth noting that despite multiple calls to do so, and despite the thorough, repeated, and public debunking of its claims, Zenodo has never removed (or even flagged) Dr. Yan’s COVID-19 conspiracy theory. Similarly, a thoroughly debunked study that purported to find a causal connection between cellphone use and brain cancer remains on the bioRxiv site — where it is presented without editorial comment — and it continues to be cited. A deeply flawed study purporting to show similarities between COVID-19 and HIV was posted on bioRxiv early this year, and was eventually withdrawn by its authors following severe criticism by the scientific community. The term “withdrawn” is rather ambiguous, however: the article is still on bioRxiv, flagged with a banner indicating that it has been “withdrawn.”

In fairness, it should be noted that bioRxiv and medRxiv both currently have banners at the top of their pages warning users that the preprints do not represent peer-reviewed science and should not be cited as such in the media or used to guide clinical practice — and they’re making efforts to catch bad science before it’s posted. These are steps in the right direction. Given the incredibly high stakes involved during the COVID-19 crisis, however, these measures do not seem sufficient; on both platforms, all reports are still presented as if they’re on an equal factual footing, regardless of whether they’ve been seriously challenged or even completely debunked since being posted. And Zenodo offers no disclaimer at all — in fact, its main page leans in the other direction by noting, in a sidebar, that Zenodo currently “prioritizes all requested [sic] related to the COVID-19 outbreak” and offering to help researchers with “uploading (their) research data, software, preprints, etc.” Nowhere does it suggest that there will be any attempt either to detect or to flag (let alone retract) dangerous medical misinformation.

I should point out here that I’m actually generally a supporter of preprint servers and of the open and public discussion of preliminary scientific and scholarly findings. (Disclosure: I have served for years as an unpaid member of the advisory board for bioRxiv.) But like all dissemination models and systems, preprint servers don’t only solve problems; inevitably, they also create them. In a circumstance in which science is more highly politicized than normal and the stakes are incredibly high — such as during an unusually dangerous pandemic that is being weaponized by political actors — the problems with “publishing” unvetted science do come into dramatically sharper relief, and raise questions that need urgently to be asked and resolved.

* The Geyser posts to which I’ve linked in this paragraph are normally available only to subscribers, but will be publicly open for 24 hours beginning the evening prior to this post.

Rick Anderson

Rick Anderson is University Librarian at Brigham Young University. He has worked previously as a bibliographer for YBP, Inc., as Head Acquisitions Librarian for the University of North Carolina, Greensboro, as Director of Resource Acquisition at the University of Nevada, Reno, and as Associate Dean for Collections & Scholarly Communication at the University of Utah.

Discussion

18 Thoughts on "Journalism, Preprint Servers, and the Truth: Allocating Accountability"

Spot on! Thank you! And thank you also for employing, even if in quotation marks, the term preprint publication rather than “preprint,” with the not so subtle allusion that, despite being made public, it is not really deserving of that word.

This article left me confused. The author proposes a problem in an ill-defined way. He then picks at a few brief and incomplete aspects of it. This is not a sufficient analytical discourse following data, analysis, synthesis.

So, the implication is that the model of dissemination is best served by good editorial oversight in order to remove the worst consequences that might arise…

Maybe that is the underlying message: that many good things in life (open preprints), just like good journalism, are hard or impossible to do when they have to be free…

I do not believe it is difficult at all. Reporting the news is just that: educating the public on what happened or what was said. It immediately stops being news and becomes editorial when journalists weave their (or their editors’) opinions into the story with clever wordsmithing.

Presumably believing that reality is a social construct is not the same as believing that reality or truth, therefore, ‘doesn’t matter’. If the strawmen postmodernists from the introduction of this piece believe that socially constructed reality is a “tool of oppression”, they still have a moral stake in its construction. This is probably why they are now “calling us urgently” to assert a better (more truthful?) reality.

I think Rick Anderson makes some interesting points in this piece on an increasingly necessary topic, and I’m sorry to only focus on the introduction, but I don’t think the contributions of critical theory, and hence much of the humanities, are a constructive target. The CEO of a certain major academic publisher performed some similar contortions fairly recently, accompanied by some memorable graphics. It comes across as somewhat contradictory to the project we in scholarly comms should be working towards. Those of us admirably combating falsehoods, conspiracy theories, etc, in scholarly comms and elsewhere, have more to gain from “a volume of Derrida or Irigaray” than we have from dismissing them. (Perhaps I’ll read them myself some time.)

Charles, I think you misunderstand what straw man argumentation is. It’s not quoting actual people in a way that accurately represents their positions; it’s creating a fake or distorted version of someone else’s argument and responding to it as if it were the actual argument. A good example of straw man argumentation would be, for example, responding to my introduction as if I had said “believing reality is a social construct is the same thing as believing that reality or truth, therefore, ‘doesn’t matter'” (complete with misleading quotation marks).

Definitely not agreeing with Tom’s comment.
The writing reflects facts and truth at its best.

Preprint servers enable rapid sharing of research, so scientists can learn from each other and work more efficiently to make advances. This has been particularly apparent during a year in which bioRxiv and medRxiv have posted 11,000 pandemic-related preprints. Requirements for transparency are placed on authors, every submission is screened by scientists, and 20-40% of submissions are declined. Every post alerts readers to the findings’ preliminary nature. Onsite comments on papers are encouraged and offsite commentary is linked.

As SK readers know, the use made of scientific content, from preprint servers or peer-reviewed journals, is not controlled by those who manage them. Journalists are learning fast how to cover preprints responsibly.

The founders of bioRxiv and medRxiv always welcome constructive suggestions on how these platforms can better serve their research communities.

Thanks, John. Two suggestions for bioRxiv and medRxiv:

1. When a paper is withdrawn (like the COVID/HIV study), withdraw it from the server rather than just tagging it “withdrawn.”

2. When a paper has clearly been shown to be dangerously misleading (like the cellphones-and-rats study still up at bioRxiv), either explicitly flag it as such or withdraw it from the server.

Both of these seem to me to be within the control of those who manage the servers. What do you see as the arguments against taking these steps?

Rick, I think you mean “remove” the article rather than tag it as withdrawn. Content on bioRxiv and medRxiv is indexed very quickly and downloaded extensively. Removing a paper from a server does not remove it from the internet. We think it’s better that content discovered to be unreliable is clearly and permanently flagged as such, along with the explanation about why it’s been withdrawn and the community’s reaction to the content. Then everyone can see what happened and why, on the article page, in a transparent way. You described the HIV resemblance paper you mentioned as being “eventually” withdrawn. That process happened in just 48 hours, over a weekend, a voluntary act by the authors after a deluge of critical comments onsite and on Twitter pointed out errors of interpretation on their part. This immediacy of response is a credit to the research community and something journals find hard to do.

The cellphone radiation paper is a completely different case which has been inaccurately discussed ever since it was posted. The paper carries with it the internal peer reviews at NIH that pointed out in unambiguous terms how weak the data were. There was a lot of ill-informed media comment about the paper but just imagine how much worse that would have been if based on suspicion that NIH was concealing something of such public concern. Instead the paper was put on bioRxiv so anyone could read and judge the data, the conclusions, and the expert commentary for themselves. The paper is still available on bioRxiv because it should be, in the interest of scientific accuracy and informed public opinion.

Thanks, John — and you’re right, when I said “withdraw it from the server” I meant “remove it.”

I think this would make an excellent topic for a formal debate at Charleston or R2R.

This cellphone radiation paper was shopped to all the major medical journals (and many minor ones), from what I’ve been told, and uniformly rejected. However, because the NIH had spent $25M on the study, the researchers felt it needed to get out and be promoted. Hence, the preprint. It was a bad-faith effort to circumvent editorial review, statistical review, and peer review, as at that point the authors had no intention of taking the paper to a journal. They basically abandoned it on bioRxiv, a repository of last resort, and received a surprising number of perks publishers traditionally have delivered (DOI, brand, discoverability, citability). On top of that, they received a defender of their actions under the guise of “in the interest of scientific accuracy and informed public opinion.” I can’t help but feel you were punked.

There were other examples Rick didn’t mention, with the oleandrin paper and its association with the “My Pillow” guy among my favorites. Companies are also using preprint servers to introduce products and make their efforts appear to have a scientific basis.

As for “removing” a paper, your responsibility is first to bioRxiv and medRxiv, not the whole Internet, so let’s start there. By policing and managing the preprint servers, you make them better. That should be reason enough. Better yet, you can affect distribution on the Internet — discoverability, especially. So I don’t know why you’d resist making the preprint servers better and starting to limit discoverability for trash papers.

One design improvement preprint servers could make (I’ve proposed at least five [https://thegeyser.substack.com/p/5-ways-preprint-servers-could-improve], and there are others I’ll probably write about soon enough) — retire preprints after a reasonable time (2-3 years) if they’re not published in a peer-reviewed journal. Why give authors a way to publish, promote, and archive unreviewed or rejected papers?

When you examine whether a preprint represents a valid scientific or intellectual claim, the jury is most definitely “out” (https://thegeyser.substack.com/p/can-a-preprint-claim-a-claim). The head of the FDA this summer had to disclaim preprints as a source of evidence the FDA would rely on, as well. The majority of medical researchers have profound concerns about preprints in the press misleading the public. In short, the scientific community seems to be identifying issues with preprints, and it would be advisable to get ahead of them.

Preprints are fine in the way they were intended to be used — as temporary drafts shared among selected colleagues for the purpose of pre-submission review and improvement. A preprint platform capturing those norms is entirely feasible. However, platforms that make preprints permanent, allow them to be promoted with abandon, and don’t force authors to subject the works to trusted intermediaries adhering to scientific norms are ripe for abuse. I’ve been documenting these abuses, and they are becoming more shameless and systematic as misinformation peddlers are realizing preprint servers are soft targets, even as journalists are learning — my impression is that preprints grabbed journalists’ attention early in the Covid-19 pandemic, and the good ones sniffed, went “eww,” and have largely stopped covering them, while right-wing media outlets have been feasting on them.

I think what Rick touches on (and I’d agree) is that it’s time to address the design shortcomings of Preprint Servers 1.0, and redesign these platforms to reflect the fact that authors can operate in bad faith, that open information can be exploited to sow misinformation, and that with confusion as one of the goals of misinformation peddlers, a confusing scholarly information space gets meddlers and miscreants halfway there (yes, I’m looking at you, NLM/PubMed/PMC/Medline).

We can do better.

Good journals typically have a pre-review process for submissions, in which an editor reads the manuscript for, among other factors, apparent basic soundness. Only then does the journal invest in full peer review. This is the critical eye that is lacking in the preprint servers. If we’re going to have preprints at all, we need to find a way to incorporate editorial pre-review by reputable journals.

Though it’s worth noting, as John Inglis pointed out above, that bioRxiv and medRxiv do both impose that level of screening, with the result that between 20% and 40% of submitted preprints never get posted. As I said in the piece, this is an important step in the right direction. To my knowledge, Zenodo does not impose this kind of screening — though I’d be interested in hearing from someone at CERN as to whether that’s actually the case.

What bioRxiv does is closer to a “sanity check” than the sort of work that an editor does on a desk reject, though. If I recall correctly, for bioRxiv, they give it a read to ensure that it is actually “science,” is not completely crazy, and doesn’t violate any of their rules (no review papers, for example). That doesn’t mean they review it for whether it is true or accurate or of any value whatsoever. medRxiv has a more involved pre-review process, aiming to avoid potential patient harm.

Regardless, Ken’s comment brings up a question I’ve been asking for years — when does a “preprint” become a “published article”:
https://scholarlykitchen.sspnet.org/2017/04/19/preprint-server-not-preprint-server/
PCI (Peer Community In…) comes to mind here. Through their services, an author submits an article to their editorial team, the editorial team reviews it, and if it passes muster, they put it through a formal peer review process and it is either accepted (“recommended”) or rejected. Somehow, they declare this to still be a preprint even though it has been through editorial peer review and a decision rendered.

Keep in mind, the editor of a journal essentially has skin in the game. You’ve only got so much peer review capacity, and have to ask if this new manuscript is worth one of those review slots. The preprint server may not have the same incentive to screen out poor science.

Don’t think it’s accurate to say the Trump administration or Trump are against facts or truths.
It would have been better if the author had given some consistent examples.