(Photo credit: Dan4th)

As a system for correcting science, article retractions represent a fast, democratic, and efficient mechanism for alerting scientists to invalid work, a new study of biomedical papers reveals.

In the article, Governing knowledge in the scientific community: Exploring the role of retractions in biomedicine (Research Policy, March 2012), Jeffrey Furman, Kyle Jensen, and Fiona Murray report on a study of 677 article retractions identified in MEDLINE between 1972 and 2006. What distinguishes their study from nearly all prior attempts to understand the nature of retraction is their use of a control group — a comparison against which to study the performance of retracted papers. The control group was assembled by selecting nearest neighbors: papers published just before and after each retracted article.

Because no control group forms a perfect comparison to the study group, the researchers also employed statistical techniques to estimate, and control for, the effects of several variables under investigation, such as the geographic location of the corresponding author. Their results can be summarized in three main points:

  1. The retraction system is fast — While some invalid articles remain in the literature for years before being retracted, nearly 15% of all retractions took place within the year of publication and 50% within 24 months. There is also evidence that the delay between publication and retraction is growing shorter over time.
  2. Retractions are democratic — There does not appear to be bias by geographic location or institutional affiliation. Retraction numbers outside the United States are relatively small, however, which limits the sensitivity of this part of the analysis.
  3. The effect of retraction on article citations is severe and long-lived — Compared to control articles, annual citations to retracted articles drop by 65% following retraction. Citations decline by 50% in the first year post-retraction and by 72% by year 10.

Compared to the control group, retracted papers were more likely to be highly cited in their first year. They were also more likely to be authored by researchers located at top US universities. Furman offers several speculations on why this might be the case: research produced by scientists at top universities tends to receive both more early citations and more intense scrutiny; the pressure to retract false papers is much greater at prestigious US universities; and articles that are later retracted may simply attract considerable debate about their veracity soon after publication. Whichever precursors lead to retraction, the effects on the citation record are clear:

[W]hen false knowledge is identified and signaled to the community via a retraction, the signal is critical and leads to an immediate and long-lived decline in citations … [This main finding] provides compelling evidence that the system of retractions is an important mode of governance, which alerts the scientific community to the presence of false knowledge, helping investigators of all varieties to avoid follow-on research predicated on false science, potentially saving tens of millions of dollars per year.

Before concluding that the present system for issuing and communicating retractions is working sufficiently well for science, industry, and the public, we should acknowledge the system’s great failings: Many publishers are reluctant to issue retractions without the author’s permission, lack the resources to properly investigate or defend allegations of misconduct, routinely issue ambiguous retraction notices (or none at all), and adhere poorly to established ethical guidelines. Improving any of these parts, even just a little, would greatly improve the self-correcting nature of science.

Phil Davis

Phil Davis is a publishing consultant specializing in the statistical analysis of citation, readership, publication and survey data. He has a Ph.D. in science communication from Cornell University (2010), extensive experience as a science librarian (1995-2006) and was trained as a life scientist.


15 Thoughts on "Can Article Retractions Correct the Scientific Record?"

Phil, the “great failings” of the retraction system that you cite in your final paragraph (publishers’ unwillingness to issue retractions without authors’ permission, their lack of resources for investigation, the ambiguity of their retraction notices, their poor adherence to ethical guidelines) don’t sound like failings of the retraction system itself, but rather failings on the part of those who should be using it. If I refuse to drive my car or if I drive it badly or can’t afford to buy insurance, is it the car that has failed?

You’re right that those problems are not necessarily indications of a failure of the process. However, just as in your example, they could be, given a consistent enough pattern. If many drivers drive a certain car rarely, or drive it poorly, or can’t afford to insure it, the common variable — the car — may be the cause. In the case of social mechanisms, if we identify a number of situations that we would want the mechanism to handle, and it is not handling them, I would consider that a failing, or at least a sign that improvement is warranted.

Matthew, that’s a good point — except that for it to apply in this case, you’d have to be arguing that publishers are reluctant to retract because they don’t think a retraction will actually work (i.e., correct the record). The reasons that Phil cites in his final paragraph — and which he labels as “failings of the system” — don’t seem to have anything to do with concerns on the part of publishers about whether retraction is effective, but rather about the risk of offending authors and a simple unwillingness to do the work (or to do it in good faith).

I’m not sure that follows. Certainly, you may not be driving your car because you think it will break down on the way (i.e., it won’t actually work), but that’s not the only possible reason: maybe you’re embarrassed to drive your pink car, or nervous getting behind the wheel of your big SUV. There are plenty of reasons.

In the case of the current method by which we correct the scientific record, the fact of the “failings on the part of those who should be using it” is an indication that the method is not working the way we want it to. If I write an app for the iPhone but nobody downloads it, I could sit back and say that there’s nothing wrong with the app, it’s just that the people out there don’t realize how awesome it is.

Those that do not want to offend authors should not be responsible for deciding whether or not to offend them. Just because some do it to our satisfaction doesn’t mean the system works.

“If I write an app for the iPhone but nobody downloads it, I could sit back and say that there’s nothing wrong with the app, it’s just that the people out there don’t realize how awesome it is.”

And you may well be right. You may be looking at a failure of marketing, despite the fact that the app is perfectly good. Saying “this app doesn’t work because people aren’t downloading it” doesn’t make sense. The peer review system is the app we’re talking about, and the reasons Phil cites for publishers’ reluctance to use it have nothing to do with that system’s effectiveness at correcting the scholarly record. They all have to do with publishers’ reluctance to take on the costs of correcting the scholarly record. It’s like a plumber refusing to use a wrench not because wrenches don’t work, but because he doesn’t want to do the job at hand.

It would be interesting to get a sense of the citations that retracted papers attract post-retraction. What percentage of them are negative, in the sense that the paper is cited in the process of disproving its conclusions or correcting them? It’s probably not a study that can be automated, so may be difficult to perform at scale.

John Budd has done extensive content analysis of citations. Below is an excerpt from his 2011 ACRL conference presentation:

The clear majority of citations, 193 (78%), are mentions of the retracted papers as parts of literature reviews or other background sections of the articles. These mentions are tacitly positive; that is, they imply that the retracted articles represent valid work. Perhaps of special note are the 40 citations (16%) that make substantive mention of the retracted papers. These mentions tend to occur in the methodology, findings, or discussion sections of citing papers. They describe the retracted articles favorably and, at times, indicate that the retracted papers provided bases for the later work—all without acknowledging the retracted nature of the article.

In 2005, 67 articles were retracted. Post-retraction, the 67 retracted articles received 965 citations, with one receiving 126. Of the 144 citing articles examined, only 8 (6%) acknowledged the retraction.

Budd JM, Coble ZC, Anderson KM. 2011. Retracted Publications in Biomedicine: Cause for Concern. ACRL Conference, Philadelphia, PA: 390-5.

Thanks, Phil. It sounds like there’s some subtlety in there: a retracted paper may still have a valid methods section worth citing, or data points that are indisputable despite other parts of the paper being problematic enough to warrant retraction. But the big picture is that retracting an article doesn’t pull it off the table as much as we’d expect or desire.

I wonder how much of this is an offshoot of reliance on PDF files rather than the online versions of articles? Many researchers deal exclusively in downloaded PDFs, never returning to the online version of the paper and never realizing that it has been retracted. Will this behavior and the resulting citation rates change as we move further into the electronic era?

Coincidentally, I have a paper in press that deals with this issue. In my study, I searched for publicly-available PDFs on non-publisher websites and found a considerable number sitting on faculty and lab websites, in library repositories, and on commercial sites promoting medical interventions (supplements, surgery, etc.). We also created an API that searched for these articles residing in personal Mendeley libraries. Once an article is downloaded, there is very little incentive to go back and check on the status of that article. CrossMark attempts to solve this problem by linking the PDF and HTML versions to metadata on the status of the article. A reader would only have to click on the CrossMark logo to get the most up-to-date info on that article. There is much more that can be done to put an end to the persistence of error in the scientific record.

In my experience (as an academic librarian), the reason the global academic scrutiny works well is the same reason that professors use PDFs — researchers trust information that comes from colleagues even more than they trust a database or publisher.

Just as researchers pass articles around instead of finding them on their own, they also pass around the information that a particular study has been discredited. Their social networks are *very* strong. Most of them trust their colleagues more than they trust the publication record.

That seems a bit crazy to those of us who make a living out of searching databases but we are not part of this particular social network. When you are less than 6 degrees removed from the foremost expert on a topic, searching a database is the long way around.

If someone really wanted to address the problem, they would write an email plug-in that automatically detects when a citation or PDF is being sent and then dynamically checks it against a database of retracted articles.

You forward a citation to your friend and ping! you get back an email from the server saying “Did you know this study has been retracted? Please read here (link to retraction).” Both science and the ego of the sender are saved.
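As a minimal sketch of what such a check might look like (the DOIs and the lookup table here are hypothetical; a real plug-in would query a live, maintained database of retraction notices rather than a local set):

```python
import re

# Hypothetical lookup table of retracted DOIs. A real plug-in would
# query a maintained database of retraction notices instead.
RETRACTED_DOIS = {
    "10.1000/example.retracted.1",
    "10.1000/example.retracted.2",
}

# Loose DOI pattern: "10.", a registrant prefix, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

def flag_retracted(message_body):
    """Return the DOIs in an outgoing message that appear in the retraction table."""
    return [doi for doi in DOI_PATTERN.findall(message_body)
            if doi in RETRACTED_DOIS]

flags = flag_retracted(
    "Worth a look: doi:10.1000/example.retracted.1 and 10.1000/fine.paper"
)
# flags now holds only the retracted DOI, so the server can reply with
# a pointer to the retraction notice before the citation spreads further.
```

The point is that the check rides on communication that is already happening, rather than asking scientists to change how they search.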

Scientists are never going to search like librarians. A very small number of articles are retracted every year. Asking them to change the way they do their day to day communication for the sake of that small percentage is not reasonable.

It’s great that I can search DIALOG on the command line but that is very different than having a finely tuned, carefully vetted, social and professional network of like-minded peers. Build a system that works *with* that network instead of against it and you’re golden.


Thanks so much. I agree fully that this is a very nuanced issue. In the paper that Phil kindly references, we examined a subset of the papers that cited retracted work and found a mix of reasons for their citations. Some cited retracted articles in full recognition of their errors; those papers either noted explicitly that the retracted articles were inaccurate or implicitly acknowledged their being false by citing both the false article and the article that retracted it. Sadly, some articles cited retracted works as if they were correct, faithfully citing the key result(s) that had been disavowed in the retraction statement. A third category of citations to retracted articles includes those that cite either to identify an area of research interest (e.g., citing Hwang’s retracted human embryonic stem cell papers in order to note the importance of research on human embryonic stem cells) or to identify a research approach pursued in prior work. The perniciousness of this third category of citations is much more difficult to judge. As you note, classifying such citations is time-consuming and subject to rater biases, so we try to be cautious in interpreting the results of this aspect of our study. We do see this as an interesting area for future exploration.

Good point, Jeff. Many citations are perfunctory in nature, acknowledging that similar work has been done in the field. These are often found in the Introduction and involve long strings of citations, e.g., [5–28]. However, when these citations are aggregated and summed into a citation metric, they can elevate the prestige of an article and confer an authority that the article does not deserve.

A similar study was published in 2010 by Neale et al., who report no significant decline in article citations post-retraction compared to a control group [1]. Since you analyze nearly identical sets of retracted papers, why does your paper report large effects (a 65% post-retraction decline) while Neale et al. report no difference?

[1] Neale AV, Dailey RK, Abrams J. 2010. Analysis of Citations to Biomedical Articles Affected by Scientific Misconduct. Science and Engineering Ethics 16: 251-61.


Thanks so much — this is a very interesting issue, and thank you for alerting me to this article. I should first note that I regret that we had not come across Neale et al.’s work while we were writing our paper. Their efforts in characterizing the reasons that papers continue to cite retracted articles are more extensive, systematic, and sophisticated than our own, and I think they deserve significant attention in future work. Their findings, particularly in Table 2, raise red flags about the reasons for continued citation of retracted articles.

With respect to the impact of retraction on future citations, I think that the difference in our reported findings is that Neale et al. report the total number of citations received by retracted articles and control articles in the years following retraction. By contrast, we focus on the difference in citation rates before and after retraction for retracted articles. One of the key aspects of our econometric approach is that we control for pre-retraction rates of citation, which are higher in the ultimately-retracted articles than in the control articles. Thus, our analysis focuses on the fact that the rate of citation to retracted articles falls significantly relative to what would have happened had the articles not been retracted. (We identify this using article-specific fixed effects and post-retraction effects.) If we were to report the raw count of citations to retracted articles and controls, we would probably find, like Neale et al., that the retracted articles have a similar number of post-retraction citations, even though the rate of citations would have fallen substantially following their retraction.
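The difference between the two measures can be seen with toy numbers (illustrative only, not figures from either study):

```python
# Illustrative annual citation counts (not from either study): a
# heavily-cited article that is retracted, and a nearest-neighbor control.
retracted_pre, retracted_post = 20, 8   # citations/year before and after retraction
control_pre, control_post = 6, 6        # control article, steady throughout

# Comparing raw post-retraction counts (roughly the measure Neale et al.
# report), the retracted article still receives more citations than the control:
raw_difference = retracted_post - control_post         # 2

# Comparing the article against its own pre-retraction baseline (the
# fixed-effects logic), its citation rate has fallen by 60%:
relative_decline = 1 - retracted_post / retracted_pre  # 0.6
```

Both statements are true of the same data: raw counts show no deficit, while the within-article citation rate has collapsed, which is one way to reconcile the two papers' findings.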

I think that both approaches highlight something interesting: On one hand, it is comforting that retraction has a significant and long-lived negative impact on future citations; on the other hand, it is a cause for concern that ultimately-retracted articles attract so much attention prior to the discovery of their errors and that they continue to attract at least some supportive citations after their retraction.
