Editor’s Note: Today’s post is by Joseph DeBruin. Joseph is the Head of Product Management at ResearchGate, where he applies the scientific training he received as a neuroscientist at Johns Hopkins toward building products for ResearchGate’s network of users. He was formerly Head of Growth at Feastly, which was acquired by ChefsFeed in 2018.
“Always design a thing by considering it in its next larger context – a chair in a room, a room in a house, a house in an environment, an environment in a city plan.”
– Eliel Saarinen – Finnish/American architect
The world has gotten pretty opinionated about how scientific communication should be designed, and most of what has been published has fallen into one of two camps:
- Camp A) The Covid crisis has torn down the walls of science and cranked the speed dial to 12. Instead of traditional journal publishing, which takes months, preprints are exploding, “a global collaboration unlike any in history” is happening in real time, and an old system is finally getting the overhaul it needed!
- Camp B) We are seeing the first true social “infodemic.” Misinformation is everywhere, most of what is out there “isn’t even science,” and governments are cracking down on social media platforms and scientific publishers to dramatically limit the content that makes it online.
Both of these camps are at least partially correct, but few articles address the fact that speed and uncertainty in science are often two sides of the same coin, and that getting the benefit of speed without the risk of uncertainty is extremely challenging. To be sure, a journalist’s focus entirely on the benefit of speed or the damage of misinformation likely reflects a desire to keep an article clean and to the point. Nonetheless, it’s important to note that peer review with sequential disclosure, the system that has historically mitigated risk from uncertainty, is exactly the system that is being upended in order to increase the speed of innovation. So although it’s complex, we owe it to science to dig into this rocky middle ground, and to ask how new technologies and insights can allow us to maintain the benefit of established truth-seeking systems while continuing to push the boundaries of speed.
Spheres of relevance
Scientific communication is a many-layered onion, to put it mildly. Knowledge creation starts in an extremely tight unit such as a lab which, like your immediate family, speaks its own language given how much shared history and common knowledge it has built over time. There are often very few other labs in the world that might have use for the truly “raw” output of a given lab. Then as you move outward, there are scientific disciplines, sub-disciplines, related disciplines, and on and on to cover the millions of scientists in the world. To make it more complicated, within each of these groups you can find senior scientists who have 40 years of experience interpreting scientific output and new scientists who are still in their first year of training. And then there’s the wider public, including policy experts, journalists, and interested laypeople of all kinds. As the knowledge created in that first lab travels, it needs to change in certainty and in style. Those few other labs might want raw data as soon as it is recorded, while the broader discipline isn’t interested until it has been verified and can safely change the current assumptions of the field. The wider public needs something different entirely, namely a view of how this science can change their lives, and that requires a great deal of translation, context, and safeguards from the layers within. As a simple version, let’s say it looks something like this:
It’s worth noting that this is not even close to the right scale. If it were, it would look something more like this, where the scientists in relevant fields and experts in the topic are so small as to be invisible:
The point to be made here is that communication within science is a series of complex steps outward, and even so, it reflects only a small fraction of the flow outward to the most distant layers of public awareness. Armed with this mental model, we can look at how information has flowed through the spheres in a few specific cases within the Covid crisis.
Case studies of misinformation/innovation in Covid
The Ibuprofen rumor
How it happened:
- March 11: Researchers in Switzerland publish a letter to The Lancet observing that in 3 small case studies in China, a high proportion of severe cases also had high blood pressure and/or diabetes. Because Ibuprofen is a common treatment for these issues, it was suggested as a potential risk factor. The authors “encourage someone” to conduct a proper study. Worth noting is how misleading scientific language can be to non-scientists: “We suggest that patients with cardiac diseases, hypertension, or diabetes, who are treated with ACE2-increasing drugs, are at higher risk for severe COVID-19 infection.” “We suggest…are” = we hypothesize, but haven’t proven or tested anything.
- March 14: The French Ministry of Health issues a warning that people should not take Ibuprofen. Almost immediately it is picked up by news outlets in the US and elsewhere, and before long the anti-Ibuprofen craze is on.
- March 16: The Lancet researchers react to the misinterpretation of their letter by publishing an update stating that it was “expressly formulated as a hypothesis” and should not be interpreted to mean anything for patients. Fair to say this update didn’t do much to stop the momentum.
The Wuhan lab conspiracy
How it happened:
- Feb 5: A researcher in China, Botao Xiao, uploads a preprint to ResearchGate pointing out the proximity of a lab in Wuhan that had reportedly studied coronaviruses in the past to the Wuhan meat market. The paper suggested a causal relationship between studies in the lab and the crisis, but provided no experimental evidence whatsoever.
- Feb 5-15: Several sites pick up the preprint as evidence that China has been covering up the origin of the virus or even intentionally created it. On Feb 15, Xiao removed the preprint from ResearchGate and deleted his account. Later he told The Wall Street Journal (paywall) that he removed it because it “was not supported by direct proofs.”
- Feb 15-current: Despite the fact that no material proof has emerged to support this claim, it is still being cited by many influential people as evidence of malicious intent from China. Tucker Carlson of Fox News cites it as a “draft paper” which is “now covered up” and as specific evidence that he is not creating conspiracies but citing “real science.” This claim, as with Xiao’s, also appears to be without direct proof of any kind.
Characterizing the novel virus
How it happened:
- Jan 23: Researchers in Wuhan share a preprint on bioRxiv (one of the major preprint servers) characterizing the outbreak and (among other things) describing the similarity in structure to SARS-CoV and to a bat coronavirus.
- Jan 23-Feb 03: The preprint is instantly taken up by researchers across the world. To date it has over 250,000 downloads on bioRxiv. Within days it was cited by several other studies, mostly preprints. The preprint alone has almost 100 citations. It was also covered by a wide host of news outlets.
- Feb 03: The preprint was formally published in Nature. It now has almost 1000 citations from scientists around the globe.
A couple of things are worth noting in the above examples, which show how a similar mechanism can be both innovative and potentially dangerous. First, you see how fast the information travels from the inner core of scientists to the wider circles. Second, it’s clear that as we drive more visibility for earlier-stage research, we publish not only results backed by extensive experimental validation but also hypotheses and ideas based on few observations. Researchers have always known this early idea-building work to be essential to the advance of science (hence the push towards more and faster visibility!), but the public often doesn’t see it as any different from any other type of “science,” which means it can be interpreted as more certain than it really is.
Last, it’s clear that in none of these cases did the information travel in a linear, sequential fashion from the inner circle to increasingly wider groups. Because the information is public from the start, it travels from the inner core of scientists to other relevant scientists who can advance the work while simultaneously making the leap to news outlets and “citizen scientists” on Twitter who are ill-equipped to handle the inherent rawness and uncertainty of a preprint. As Richard Sever, co-founder of bioRxiv, said in a recent webinar, “preprints are intended for experts.”
Temporary problem or lasting shift?
A major open question at the moment is how much of what we’re seeing now is unique to this crisis and how much reflects a lasting change in how science operates that Covid has catalyzed. To be sure, the attention of the entire public sphere on science is temporary: I think we can all expect that the quantity of scientific articles our uncle John, the lawyer, sends us will decrease after the current crisis is behind us. But faster sharing of early-stage (read: more raw) research and increased openness have been two of the major topics in scientific communication for years. Just look at the growth of preprints since 2013.
Facilitating better information travel via technology
The point I’ve tried to make so far in this article is that speed and uncertainty in science are two sides of the same coin, and that the trends we see in Covid are indicative of a lasting shift rather than a temporary blip. Nonetheless, there are big steps forward we can take towards mitigating the risk associated with limited validation while maintaining speed, especially with the help of modern technologies.
It seems obvious, but one of the biggest things that can be done to reduce the risk of misinformation is to surface high-quality research. Researchers in the UK modeling information spread in crises found that simply changing the ratio of quality information to misinformation read by a population, or “immunizing” a small percentage (20%) of the population against misinformation, dramatically reduced the health risk.
Promoting high-quality early-stage research is by no means a trivial effort. Journals have historically been relied on to curate scientific output, and so before a publication makes it to a journal (or if it never does) there are few existing mechanisms that can mimic the human effort needed to ensure that it isn’t simply the splashiest articles that rise to the top. Especially in the early stages of research, new platforms for science will need to put researchers at the center and allow them to gain or lose trust and influence over time, just as journals have. It will be critical to establish the right structures so that experienced researchers are rewarded for the knowledge they curate as well as the knowledge they create. There’s a lot to be learned (both bad and good) from platforms like Twitter in terms of how to establish the trustworthiness and prestige of individuals and allow them to gain influence in content curation. We’ve been working on this for years at my own company, ResearchGate, and are trialing new user curation mechanisms within our Covid Community, but it’s a major shift for science and something that we believe will take time and cooperation with publishers to ensure that the new systems also strengthen the existing ones.
In addition to promoting the best research, it’s clear that we have to minimize and mitigate the times when people consume low-quality or potentially misleading research. One strategy to “immunize” is what we’re seeing with services like bioRxiv or ResearchGate, where a clear warning is placed on top of any content that is potentially more uncertain (such as preprints) due to the lack of peer review. Another is working to establish flows within preprint servers to keep that content from ever appearing in the first place. Major preprint servers have dramatically increased their moderation mechanisms, in which researchers approve all content before it gets added. As a user content platform, ResearchGate does not itself directly assess or pre-approve content, but has set up user-driven mechanisms designed to include relevant content in the COVID-19 area and to encourage helpful user commentary on specific works.
Nonetheless, if we look back at the examples above, it’s clear that sometimes the problem isn’t that the information is false or malicious, but simply that it lacks the right context for certain audiences. This isn’t elegantly handled by the binary yes/no of a moderator system, not to mention the fact that scaling moderator systems faces many of the challenges of the peer review system that is such a burden on journals: it’s difficult to find enough researchers to do the work, and even more difficult to find ways to properly celebrate the work these researchers do behind the scenes. Moderators can and should flag possible pseudoscience, but we can also explore ways to selectively expose information to the right audiences so that it gathers necessary context as it travels to outer layers. And when content changes (for example, when it moves to a journal and goes through peer review), we should explore mechanisms to make sure that everyone who read the previous versions gets notified of any changes. Establishing these spheres and notification loops for different context triggers within a platform is challenging, but we believe it is worth doing.
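To make the idea of spheres and notification loops a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sphere names, the `Publication` class, and its methods are all hypothetical, and none of it describes how ResearchGate or any preprint server actually works. It shows two things: a piece of research that is visible only to the inner spheres while it is a preprint, and a simple record of who read it so that those readers can be told when it changes.

```python
from dataclasses import dataclass, field

# Hypothetical audience spheres, ordered from innermost to outermost.
SPHERES = ["lab", "topic_experts", "field", "public"]


@dataclass
class Publication:
    title: str
    stage: str = "preprint"                      # e.g. "preprint", "peer_reviewed"
    widest_sphere: str = "topic_experts"         # outermost sphere allowed to read
    readers: dict = field(default_factory=dict)  # reader -> stage they read
    outbox: list = field(default_factory=list)   # (reader, message) pairs

    def can_view(self, sphere: str) -> bool:
        """A sphere may view the work if it is no wider than `widest_sphere`."""
        return SPHERES.index(sphere) <= SPHERES.index(self.widest_sphere)

    def record_read(self, reader: str, sphere: str) -> bool:
        """Record a read if the reader's sphere is allowed; return success."""
        if not self.can_view(sphere):
            return False
        self.readers[reader] = self.stage
        return True

    def advance(self, new_stage: str, widest_sphere: str) -> None:
        """Move to a new stage, widen exposure, and notify earlier readers."""
        old_stage = self.stage
        self.stage = new_stage
        self.widest_sphere = widest_sphere
        for reader, stage_read in sorted(self.readers.items()):
            if stage_read != new_stage:
                message = f"'{self.title}' moved from {old_stage} to {new_stage}"
                self.outbox.append((reader, message))


paper = Publication("Example preprint")
paper.record_read("alice", "topic_experts")  # allowed: inside the current sphere
paper.record_read("bob", "public")           # refused while still a preprint
paper.advance("peer_reviewed", widest_sphere="public")
```

In this toy run, only `alice` ends up in `paper.outbox`, because she is the only person who read the earlier version. A real system would need far richer notions of audience and context than a single ordered list, but the sketch captures the loop: exposure widens as validation accumulates, and earlier readers are told when the work changes.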
Science seeks truth
This is a truly challenging topic, for science and for the internet in general. What should give us hope that science can succeed in ways others haven’t is that science has truth seeking built into its DNA at every level. Something doesn’t “become science” only when it “has a sample size of more than 1000 people” (as some would suggest) or when it goes through a randomized double-blind clinical trial. A hypothesis based on observation, as we have seen in Covid preprints, is no more or less science than a physical law like gravity that has been tested and validated experimentally for centuries. That doesn’t mean we can treat these the same way as they travel along the path of knowledge towards truth. Many of the old systems that have slowed down the pace of science in order to establish relevancy and truth-seeking as it builds from observation to intervention shouldn’t be thrown away in the move towards speed, but rather built into and strengthened via new technologies.