A recent article in The Atlantic by David Freedman lays out, in a compelling narrative, the effects of rampant medical research publication driven by a publish-or-perish culture, egotistical authors, misaligned incentives, and porous peer-review.
If David Freedman’s name rings a bell, it might be because he’s the author of “Wrong: Why Experts Keep Failing Us — and How to Know When Not to Trust Them,” which I wrote a little about earlier this year.
At the center of the tale Freedman spins in the Atlantic article is John Ioannidis, a professor of biostatistics at a Greek university. He and his students have been analyzing all sorts of medical literature claims, and finding that most of them — even those emanating from the vaunted “gold standard” of the double-blind, randomized, placebo-controlled trial — are wanting in key aspects.
[Ioannidis] zoomed in on 49 of the most highly regarded [cited] research findings in medicine over the previous 13 years. . . . Of the 49 articles, 45 claimed to have uncovered effective interventions. Thirty-four of these claims had been retested, and 14 of these, or 41 percent, had been convincingly shown to be wrong or significantly exaggerated. If between a third and a half of the most acclaimed research in medicine was proving untrustworthy, the scope and impact of the problem were undeniable.
Freedman tangles and snarls concepts throughout the article, and consistently fails to question Ioannidis’ techniques or the studies themselves. Freedman’s narrative is clearly one of Ioannidis vs. Them. For instance, if the initial studies analyzed in Ioannidis’ work were heavily cited (i.e., clearly convincing to many people), yet were later retested and shown (“convincingly”) to be wrong or exaggerated, wouldn’t we always use the techniques that apparently generate better results (the ones used by the later studies)? Why weren’t those techniques used initially? Were the re-tests truly equivalent? Just how valid are those re-tests? Freedman’s narrative needs a protagonist, and Ioannidis is it.
Some of Ioannidis’ work bears closer examination; it seems chunked just-so. For instance, in one paper published in JAMA, studies were categorized as flawed because they were either wrong or exaggerated — but the two concepts were conflated into a single category. That is, it’s impossible to know whether 13 of the 14 were exaggerated and only 1 was outright wrong, or whether the ratio was something else entirely. Exaggeration is, to a greater or lesser degree, a subjective assessment, while being wrong is more objective.
This raises the question of how long we need to re-test before we know whether the original findings stand up, whether the re-testing is itself flawed, and so forth.
Freedman weaves a nervous narrative by noting that many findings aren’t interesting enough to re-test — there’s not enough at stake — so they lurk in the literature like landmines. Then, even worse, he tells us that about 1 out of every 4 very interesting findings is never re-tested at all:
Of those 45 super-cited studies that Ioannidis focused on, 11 had never been retested. Perhaps worse, Ioannidis found that even when a research error is outed, it typically persists for years or even decades. He looked at three prominent health studies from the 1980s and 1990s that were each later soundly refuted, and discovered that researchers continued to cite the original results as correct more often than as flawed—in one case for at least 12 years after the results were discredited.
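The arithmetic behind these figures is easy to verify. A quick sketch, using only the counts quoted above, confirms both the 41 percent refutation rate among retested studies and the roughly one-in-four never-retested rate:

```python
# Counts as quoted from the Atlantic article's summary of Ioannidis' analysis
claimed_effective = 45   # of 49 highly cited studies, 45 claimed effective interventions
retested = 34            # of those 45, how many were ever retested
refuted = 14             # retested results shown to be wrong or significantly exaggerated
never_retested = claimed_effective - retested  # the 11 never-retested studies

print(f"Refuted among retested: {refuted / retested:.0%}")                 # 41%
print(f"Never retested at all:  {never_retested / claimed_effective:.0%}")  # 24%
```

Note that the “1 out of 4” figure describes how many super-cited findings were never re-tested (11 of 45), not how often findings are re-tested.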
This sounds pretty damning, but it’s more complicated than perhaps Freedman knows — certainly more than he’s willing to admit in his narrative. Of course, science works best when results are re-tested and either validated or refuted. But in medicine, many of these large, randomized, placebo-controlled trials are either the best or the last, meaning they’re of such scale and complexity that funding them, conducting them, and analyzing them takes years and millions of dollars. Repeating them takes perhaps another decade or more, and some are truly going to be the last study of their type, settling a question for all intents and purposes for a generation or more. Ioannidis talks about these limitations in his papers, but Freedman apparently missed them or chose to downplay them in constructing his narrative.
It’s also worth noting that while Freedman insinuates that researchers don’t understand the statistical software they rely upon or how it can give them the wrong answers, he expresses only admiration for Ioannidis, even though much of Ioannidis’ own research is built on a complex mathematical model of his own devising. Freedman gives Ioannidis a free ride.
The Atlantic article has given self-righteous bloggers a platform over the past week or so, but ultimately it seems to be a story told by a reporter predisposed to see any limitation of medical research as a sign of absolute inadequacy, corruption, and selfish interests. Yet he arrives at these conclusions, in my opinion, by combining selective evidence with a lack of sophistication about medical publishing.
The propensity for journalists to drive a narrative over facts continues to be a plague on modern society — in politics especially, but throughout the public sphere. Competition narratives dominate, but suspicion narratives are a close second. Freedman’s article touches mostly on suspicion, but hints that competition is driving behavior we should be suspicious of.
In the face of all this, there is one point from Ioannidis that we should all ponder:
Science is a noble endeavor, but it’s also a low-yield endeavor. I’m not sure that more than a very small percentage of medical research is ever likely to lead to major improvements in clinical outcomes and quality of life. We should be very comfortable with that fact.