What do you do as a scientist when you really, really want something to be true and yet the evidence just isn’t quite there? If you stay true to the scientific method, you draw only the conclusions the data support, carefully note the limits of your study, and take great care not to make claims beyond those limits. Unfortunately, far too often, authors overstate the significance of their work.
One of the key responsibilities of journal editors, as well as peer reviewers, is to hold authors to a high standard. Authors should not be allowed to state any conclusion that is not fully borne out by the data. Speculation is fine, as long as it is clearly stated as such. As a former journal editor-in-chief (and frequent peer reviewer), I can say this was probably the most common problem I saw in submitted manuscripts. The authors have a theory as to how their system works, and while the described research supports that theory, it does not prove it conclusively. This does not, however, prevent the authors from claiming that it does. A good editor/reviewer walks back these claims to an appropriate level. Sometimes this means changing words like “necessary” to “sufficient”, or “proves” to “suggests”.
This is particularly important for observational studies. Unlike experimental studies, observational studies tell us more about correlation than causation. Unless great care is taken to eliminate all confounding factors, the conclusions from an observational study must be very carefully phrased to avoid overstating their significance. “X correlates with Y” is often a better-supported conclusion than “X is responsible for Y.”
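To make the distinction concrete, here is a minimal simulation (a toy sketch of my own, not drawn from any study discussed here) in which a hidden confounder drives both X and Y. The two variables end up strongly correlated even though neither has any causal effect on the other:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 10_000

# A hidden confounder Z drives both X and Y; X has no effect on Y.
z = rng.normal(size=n)
x = z + rng.normal(scale=0.5, size=n)
y = z + rng.normal(scale=0.5, size=n)

# X and Y nonetheless correlate strongly (roughly 0.8 here).
print(f"corr(X, Y) = {np.corrcoef(x, y)[0, 1]:.2f}")
```

A study measuring only X and Y would see a robust association; only knowledge of (or control over) Z reveals that “X is responsible for Y” is unwarranted.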
This seems particularly problematic for bibliometric studies, especially those done on the question of an open access (OA) citation advantage. Observational studies have been, and continue to be, performed regularly, looking to see whether open access to a research paper results in that paper receiving more citations than it would have under a traditional subscription model. SPARC Europe lists 70 such studies, the majority of which claim to show a citation advantage.
The problem with looking at raw numbers of studies is that, of course, popularity isn’t what matters in science; what matters is rigorous methodology and accuracy of conclusions. One well-done study with the right controls can overturn the conclusions of thousands of previous studies. That’s how science works.
The studies that claim an OA citation advantage are observational, generally comparing the performance of one set of articles to another, usually OA articles in a journal versus subscription articles in the same journal. At best, this can show you a correlation: the OA articles correlate with higher levels of citations (just as higher ice cream sales correlate with higher levels of murder). But nothing can be gleaned as to causation, because too many confounding variables come into play. The two sets of papers being compared must be as identical as possible, with the exception of the variable being tested (access). But for Gold OA studies, common sense tells us that authors with money to pay for OA may come from better-funded labs, and so the citation advantage may instead stem from superior funding levels. Similarly, many authors state that they save their OA funds and use them only on their top papers, not paying for OA on their more average or lower quality outputs. So the citation advantage may just as likely be due to self-selection by authors.
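To see how strong self-selection alone can be, here is a rough sketch (a toy model with invented parameters, not data from any of the studies above) in which paper quality drives both citations and the decision to pay for OA, while access itself has zero causal effect:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 10_000

# Latent paper "quality" drives expected citations; access status does not.
quality = rng.normal(size=n)
citations = rng.poisson(lam=np.exp(1.0 + 0.5 * quality))

# Self-selection: authors pay OA fees mostly for their stronger papers.
prob_oa = 1 / (1 + np.exp(-2 * quality))
is_oa = rng.random(n) < prob_oa

print(f"mean citations, OA:           {citations[is_oa].mean():.1f}")
print(f"mean citations, subscription: {citations[~is_oa].mean():.1f}")
```

In this toy model the OA group handily “wins” the comparison, even though access played no causal role whatsoever.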
To truly test causation, one must perform a randomized, controlled trial, and when one does that, the direct connection between OA and increased citations does not appear to be reproducible. In an era when the reproducibility of published research is increasingly questioned, this failure to reproduce under controlled conditions should earn the observational studies extra skepticism. The problem gets even more pronounced when advocacy and lobbying groups or commercial companies loudly declare an unclear correlation to be an established causal fact.
This has been discussed and discredited ad nauseam (see here, here, here, here, and here on this blog alone), and yet I continue to regularly field questions about it from journal editors. It’s a subject that simply refuses to die, likely because many people really want it to be so, regardless of what the evidence says.
And so, more and more researchers, hoping to find an OA citation advantage, repeat the same mistakes. Either those pursuing these studies lack sufficient training in experimental design and the use of controls (this book is still my go-to for explaining necessary controls), or this is a case of advocacy overwhelming analysis. For example, in the last few weeks, two studies claiming citation advantage have been released.
The first comes from the companies 1Science and Science-Metrix. They perform a typical observational study, see a correlation, and declare that “a citation advantage exists for OA papers.” They note previous objections about selection bias in studies such as theirs and apparently decide to ignore them. The authors acknowledge that it may be a major factor, yet do not temper their conclusion in any way (other than suggesting that a future study will rule out selection bias as a factor). To be fair, though, this is not a formally published scientific study. It has not been peer reviewed, and given the obvious conflicts of interest of the authors, whose livelihood depends on selling OA-related products and services, it can be dismissed as commercial marketing rather than objective analysis.
More problematic is a formally published paper that passed through the peer review and editorial process. In this paper, the author attempts to deal with selection bias, but rather than eliminating it, he ends up replacing the typical selection bias seen for such studies with a different flavor of selection bias. The study claims to use “the equivalent of a random sample” of articles by taking its experimental population from the University of Michigan (UM) repository, Deep Blue. These are compared to articles from the same journal issues that are not found in the repository.
But this is far from a “random” sample. The study is essentially comparing articles by authors from UM to those by non-UM authors. UM is rated one of the top 30 universities in the US, and has been declared the top “Public Research University in the US” by the National Science Foundation. Perhaps there’s a citation advantage to being a UM researcher? Further, no evidence is presented that the control group of “closed access/subscription” articles is indeed “closed”. The author apparently assumes that if an article is not freely available in the Deep Blue repository, it is not freely available anywhere. The data set available has been “anonymized” and all of the articles de-identified. There is no way to use it to confirm the results, nor to check whether “closed” articles are indeed closed. It is thus impossible to know whether the study compares OA articles to subscription articles, or OA articles to other OA articles in different repositories (or freely available in the journal itself, as many journals make articles free after some embargo period). The citation advantage seen may be due to the efficacy of Deep Blue as a repository versus other, less discoverable outlets, rather than a direct effect of OA itself.
While the author does hedge his conclusions (even including a “probably” in the title), his claim that self-selection bias has been removed is negated by the myriad other confounding factors the experimental design has introduced. Without adequate controls, there is no way to tell from the study the causation of any effect seen. To do that, the gold standard is a randomized, controlled trial, but these are much harder to do than observational studies.
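Continuing the same invented toy model from above, randomization shows why the trial is the gold standard: assigning access by coin flip breaks the link between paper quality and “treatment,” and the spurious advantage vanishes:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n = 10_000

# Same toy model: quality drives citations, access status does not.
quality = rng.normal(size=n)
citations = rng.poisson(lam=np.exp(1.0 + 0.5 * quality))

# Randomized assignment: a coin flip, not the author, decides OA status,
# so both groups share the same quality distribution on average.
is_oa = rng.random(n) < 0.5

print(f"mean citations, OA:           {citations[is_oa].mean():.1f}")
print(f"mean citations, subscription: {citations[~is_oa].mean():.1f}")
```

With randomization the two groups come out essentially identical; any difference that did remain could then be attributed to access itself.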
And so we have a literature that largely consists of lesser-quality correlative studies that are contradicted by the small number of those with a more rigorous experimental design. There may indeed be an OA citation advantage, but until there is reproducible experimental evidence that it exists, it remains, at best, speculation. Performing more and more correlative studies without proper controls will still not be adequate to prove causation. In science, quantity should not trump quality.
The likely next battleground for this type of wishful thinking is the impact that social media plays on readership and citation of scholarly articles. Initial observational studies and anecdotes suggest a correlation between a paper appearing in various types of social media and increased readership and citation. Yet two randomized, controlled trials done by the journals Circulation and International Journal of Public Health showed no impact of social media activity whatsoever on readership (both studies) or citation (the latter study). Both studies are fairly limited, but the authors do a good job of carefully stating their conclusions — neither makes a blanket statement and both suggest the subject to be complex, rather than social media being without value or providing an automatic boost across the board.
As with the OA citation advantage studies, conflicting agendas will come into play here, with both advocacy and financial gain as potential factors in the conclusions one draws.
There remains something distasteful, if not downright unethical, in suggestions that authors can buy a better reputation by paying for services, whether OA or promotional. Because of this, we must not lower our standards for scientific rigor, even if we really, really want something to be true. Bibliometricians, if you’re going to work in this area, you need to step up your game, and journal editors, you must hold them to the same high standards to which you hold other authors.
Remember, “I’ll believe it when I see it” is science. “I’ll see it when I believe it” is religion.