Does open access (OA) distribution of scholarly articles lead to an increase in citations? A new study that controls for quality of articles says it doesn’t.
The subject is a controversial one that raises some disturbing questions about the validity of citation as a metric, about advocacy, and about the ethics of gaming the system.
Many studies have been undertaken to answer whether the immediate free availability of a paper results in a higher number of citations. Results of these studies vary wildly, and the methodologies used are often disputed. Last month saw the release of a new study of legal scholarship, which a widely read blog touted as reporting that:
. . . legal scholars who publish in open access (free and freely copyable) journals are 50 percent more likely to be cited in subsequent papers than those who publish in traditional journals, which can be very expensive.
This interpretation is problematic, as the study shows nothing of the sort.
The study in question looks at papers in three print-only University of Georgia law journals. The authors searched using Google to find which articles had gone beyond their print-only manifestation and become freely available online. They then compared the citation rates for those papers to those for papers confined only to print. Not surprisingly, those available online were more highly cited.
The authors’ conclusion that “open access legal scholarship . . . can expect to receive 58% more citations than non-open access writings of similar age from the same venue” must be questioned. Did they really compare OA articles with non-OA articles under similar conditions? Or did they in fact find a citation advantage for articles that are available in any fashion online versus those that are not?
Experimental design is a complicated task (David Glass’ book “Experimental Design for Biologists” is an excellent primer for those interested). It’s important to create experiments with fair comparisons and to use proper controls.
In the law journal study, unequal comparison groups make it unclear whether the authors are measuring access or something else entirely. The study also falls prey to potential issues of selection bias. Perhaps scholars selected the highest quality articles from the UGA law journals and put them up online, not bothering with inferior or outdated articles. Could this difference in quality account for the difference in citation seen?
The Scholarly Kitchen’s own Phil Davis (consider this your conflict-of-interest statement) has published a new study that attempts to do away with selection bias by looking at a randomized set of papers. The access status of the articles in the study (OA or under subscription-access control) was determined at random, not by author or editorial choice. The randomization is important here, as it allows Davis to compare equal groups of articles and control for other sources of bias. The study showed that OA articles were “cited no more frequently, nor earlier, than subscription-access control articles within 3 years.”
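The importance of randomization here can be made concrete with a toy simulation. This is purely illustrative and assumes invented numbers (a latent article "quality" drawn from a normal distribution, and a quality-linked probability of being posted online); it is not modeled on the data of either study. The point is only that self-selection alone can manufacture an apparent citation advantage that vanishes under random assignment.

```python
import random

random.seed(42)

# Hypothetical setup: each article has a latent "quality" that drives its
# expected citation count. None of these numbers come from either study.
N = 10_000
articles = [random.gauss(10, 3) for _ in range(N)]

def mean(xs):
    return sum(xs) / len(xs)

# Scenario 1: self-selection. Higher-quality articles are more likely to be
# posted online, even though posting itself confers no citation benefit.
online, print_only = [], []
for q in articles:
    p_post = min(1.0, max(0.0, (q - 4) / 12))  # quality-linked chance of posting
    (online if random.random() < p_post else print_only).append(q)
advantage_selected = mean(online) / mean(print_only) - 1

# Scenario 2: randomized assignment, as in a controlled design. Access status
# is a coin flip, so both groups have the same expected quality.
oa, control = [], []
for q in articles:
    (oa if random.random() < 0.5 else control).append(q)
advantage_randomized = mean(oa) / mean(control) - 1

print(f"apparent 'OA advantage' with self-selection: {advantage_selected:+.1%}")
print(f"apparent 'OA advantage' with randomization:  {advantage_randomized:+.1%}")
```

Under these made-up assumptions the self-selected comparison shows a sizable spurious "advantage," while the randomized comparison hovers near zero — which is exactly why the two study designs can reach opposite conclusions from similar data.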
Davis’ study is not without its critics, but I am less interested in debating the efficacy of his statistical methods than in looking at what this study clearly does show, and the overall notion of what we consider as “impact.”
Davis’ study shows that OA articles are more widely read than non-OA articles, and this is a significant benefit of OA publishing. If the goal of the OA movement is to create a scientific literature with a broader reach, then it is succeeding admirably.
There is an argument to be made that an article’s impact increases when it is more widely read. I have a difficult time accepting this argument, as it’s important to separate popularity from quality. As we know from popular culture, the most-read books, the most-viewed movies, and the most-heard music are often not the pinnacles of our culture’s creative endeavors, nor are they automatically the most influential on future creations.
Download statistics measure the level of interest in a given article. They tell you nothing of the reader’s reaction to the article (nor whether the reader even actually read it). As Davis points out, the increased readership for OA articles may not be coming from within the research community:
The real beneficiaries of open access may not be the research community, which traditionally has excellent access to the scientific literature, but communities of practice that consume, but rarely contribute to, the corpus of literature. These communities may include students, educators, physicians, patients, government, and industry researchers.
The reason that citation remains a useful metric of impact is that it generally requires (outside of a review article) a real contribution to research by the cited paper — i.e., we were able to do experiment B because of the work done in the cited experiment A. It is an imperfect metric, to be sure, and subject to bias, but it comes with a high barrier to entry, making it less open to gaming than other suggested measurements.
If downloads were the primary measurement for impact, then the much-derided papers featured in splashy media events, like the arsenic-based life article or the paper on the Darwinius fossil, would be seen as the most important of recent years. Those researchers able to afford advertising campaigns would rise to prominence rather than those doing the best work.
This sort of corruption of metrics, creating a perception of quality where it may not exist, is troubling. It is perhaps why I find suggestions that authors can purchase their way to increased citations by paying author fees deeply disturbing. If there is indeed a citation advantage to open access publishing, is it acceptable for publishers to try to rake in more fees by inducing authors to game the system this way? Does this potential pathway toward buying status favor the well-funded laboratory, allowing the rich to become richer?
If an alleged citation advantage is behind an author’s motivation to pay fees, then what happens if all journals become full OA publications and the playing field is leveled? Without the conferred advantage, will authors continue to accept the new economic conditions?
The fact that these sorts of inducements to game the system appear regularly, and through official forums, shines a negative light on the way advocacy sometimes takes on a life of its own.
Last October, I wrote about the circular nature of proposed impact measurements. Through overzealous advocacy, systems can reach a point where the means become an end unto themselves and the measurement becomes more important than the thing it’s trying to measure.
A recent FriendFeed post shows another example of this — a suggestion that one should rate an article in order to support the article rating system itself, rather than because one has an informed opinion on its content that would further the discussion. All these attempts to artificially drive adoption lose sight of the real goals behind the advocacy, and ultimately take away meaning from the causes they’re trying to support.
Davis’ study provides strong evidence that citation is still a reasonable metric for impact, and that it doesn’t directly correlate with popularity or accessibility.
The content of the paper still seems to matter more than how many times it’s read (or at least downloaded), and the study suggests that impact cannot be purchased through author fees.
Perhaps more importantly, Davis’ study does show that OA is succeeding at removing barriers to access and letting research literature reach a wider audience. That’s an important accomplishment, and reason enough to argue the case for OA publishing, rather than selling it as a cynical means to gain an advantage over one’s competition. Any long-term change to scholarly research needs to be able to stand on its own merits. Questionable promises that border on gaming the system muddy the waters and detract from the actual goals being served.