Does open access (OA) distribution of scholarly articles lead to an increase in citations? A new study that controls for quality of articles says it doesn’t.
The subject is a controversial one that raises some disturbing questions about the validity of citation as a metric, about advocacy, and about the ethics of gaming the system.
Many studies have been undertaken to answer whether the immediate free availability of a paper results in a higher number of citations. Results of these studies vary wildly, and the methodologies used are often disputed. Last month saw the release of a new study of legal scholarship, which a widely read blog touted as reporting that:
. . . legal scholars who publish in open access (free and freely copyable) journals are 50 percent more likely to be cited in subsequent papers than those who publish in traditional journals, which can be very expensive.
This interpretation is problematic, as the study shows nothing of the sort.
The study in question looks at papers in three print-only University of Georgia law journals. The authors searched using Google to find which articles had gone beyond their print-only manifestation and become freely available online. They then compared the citation rates for those papers to those for papers confined only to print. Not surprisingly, those available online were more highly cited.
The authors’ conclusion that “open access legal scholarship . . . can expect to receive 58% more citations than non-open access writings of similar age from the same venue” must be questioned. Did they really compare OA articles with non-OA articles under similar conditions? Or did they in fact find a citation advantage for articles that are available in any fashion online versus those that are not?
Experimental design is a complicated task (David Glass’ book “Experimental Design for Biologists” is an excellent primer for those interested). It’s important to create experiments with fair comparisons and to use proper controls.
In the law journal study, unequal comparison groups make it unclear whether the authors are measuring access or something else entirely. The study also falls prey to potential issues of selection bias. Perhaps scholars selected the highest quality articles from the UGA law journals and put them up online, not bothering with inferior or outdated articles. Could this difference in quality account for the difference in citations seen?
The Scholarly Kitchen’s own Phil Davis (consider this your conflict-of-interest statement) has published a new study that attempts to do away with selection bias by looking at a randomized set of papers. The access status of the articles in the study (OA or under subscription-access control) was determined at random, not by author or editorial choice. The randomization is important here, as it allows Davis to compare equal groups of articles and control for other sources of bias. The study showed that OA articles were “cited no more frequently, nor earlier, than subscription-access control articles within 3 years.”
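To make the design point concrete, here is a minimal sketch of the logic behind a randomized comparison like Davis'. The data and function names are hypothetical, not taken from the study; the point is only that when assignment to groups is random, any systematic difference in mean citations can be attributed to access status rather than to author or editorial selection.

```python
import random
import statistics

def randomized_comparison(citations, n_treatment, seed=0):
    """Randomly assign articles to an OA (treatment) group and a
    subscription-access (control) group, then compare mean citation
    counts. Because assignment is random rather than chosen by
    authors or editors, article quality should balance out across
    the two groups on average."""
    rng = random.Random(seed)
    indices = list(range(len(citations)))
    rng.shuffle(indices)
    treatment = [citations[i] for i in indices[:n_treatment]]
    control = [citations[i] for i in indices[n_treatment:]]
    return statistics.mean(treatment), statistics.mean(control)

# Illustrative (made-up) per-article citation counts
counts = [3, 0, 12, 5, 7, 1, 4, 9, 2, 6]
oa_mean, control_mean = randomized_comparison(counts, n_treatment=5)
```

Contrast this with the law journal study, where the "treatment" group was self-selected: whatever pushed those articles online (quality, topicality, author enthusiasm) travels along with the access status and confounds the comparison.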
Davis’ study is not without its critics, but I am less interested in debating the efficacy of his statistical methods than in looking at what this study clearly does show, and the overall notion of what we consider as “impact.”
Davis’ study shows that OA articles are more widely read than non-OA articles, and this is a significant benefit of OA publishing. If the goal of the OA movement is to create a scientific literature with a broader reach, then it is succeeding admirably.
There is an argument to be made that an article’s impact increases when it is more widely read. I have a difficult time accepting this argument, as it’s important to separate popularity from quality. As we know from popular culture, the most-read books, the most-viewed movies, and the most-heard music are often not the pinnacles of our culture’s creative endeavors, nor are they automatically the most influential on future creations.
Download statistics measure the level of interest in a given article. They tell you nothing of the reader’s reaction to the article (nor if they even actually read it). As Davis points out, the increased readership for OA articles may not be coming from within the research community:
The real beneficiaries of open access may not be the research community, which traditionally has excellent access to the scientific literature, but communities of practice that consume, but rarely contribute to, the corpus of literature. These communities may include students, educators, physicians, patients, government, and industry researchers.
The reason that citation remains a useful metric of impact is that it generally requires (outside of a review article) a real contribution to research by the cited paper — i.e., we were able to do experiment B because of the work done in the cited experiment A. It is an imperfect metric, to be sure, and subject to bias, but it comes with a high barrier to entry, making it less open to gaming than other suggested measurements.
If downloads were the primary measurement for impact, then the much derided papers featured in splashy media events like the arsenic-based life article or the paper on the Darwinius fossil would be seen as the most important of recent years. Those researchers able to afford advertising campaigns would rise to prominence rather than those doing the best work.
This sort of corruption of metrics, creating a perception of quality where none may exist, is troubling. It is perhaps why I find suggestions that authors can purchase their way to increased citations by paying author fees deeply disturbing. If there is indeed a citation advantage to open access publishing, is it acceptable for publishers to try to rake in more fees by inducing authors to game the system this way? Does this potential pathway toward buying status favor the well-funded laboratory, allowing the rich to become richer?
If an alleged citation advantage is behind an author’s motivation to pay fees, then what happens if all journals become full OA publications and the playing field is leveled? Without the conferred advantage, will authors continue to accept the new economic conditions?
The fact that these sorts of inducements to game the system appear regularly, and through official forums, shines a negative light on the way advocacy sometimes takes on a life of its own.
Last October, I wrote about the circular nature of proposed impact measurements. Through overzealous advocacy, systems can reach a point where the means become an end unto themselves and the measurement becomes more important than the thing it’s trying to measure.
A recent FriendFeed post shows another example of this — a suggestion that one should rate an article in order to support the article rating system itself, rather than because one has an informed opinion on its content that would further the discussion. All these attempts to artificially drive adoption lose sight of the real goals behind the advocacy, and ultimately take away meaning from the causes they’re trying to support.
Davis’ study provides strong evidence that citation is still a reasonable metric for impact, and that it doesn’t directly correlate with popularity or accessibility.
The content of the paper still seems to matter more than how many times it's read (or at least downloaded), and the study suggests that impact cannot be purchased through author fees.
Perhaps more importantly, Davis' study does show that OA is succeeding at removing barriers to access and letting research literature reach a wider audience. That's an important accomplishment, and reason enough to argue the case for OA publishing, rather than selling it as a cynical means to get an advantage over one's competition. Any long-term change to scholarly research needs to be able to stand on its own merits. Questionable promises that border on gaming the system muddy the waters and take away from the actual goals being served.
10 Thoughts on "Gaming the System: Do Promises of Citation Advantage Go Too Far?"
The FriendFeed reference you make is very disturbing:
Steve Koch to Steve’s feed, Science 2.0, The Life Scientists, PLoS ONE in the Media/Blogs
Bora, Abizar, and I have rated this cool #PlosONE article. It’s been downloaded 26,000 times. It’s make or break time for article ratings.
First, Steve’s purpose appears to be supporting PLoS’s rating system itself, not what it was designed to do: evaluating the quality of articles.
Second, one of the raters (Bora Z.) was the online community manager of PLoS ONE.
Last, the senior author of the paper was Jonathan A. Eisen, who is also the academic editor of PLoS ONE.
Most journals have policies in place to prevent overt conflicts of interest. I don’t know how or why this was allowed to happen. Can anyone at PLoS explain?
Phil Davis’s comment illustrates two of the problems with download numbers as proper measurements of anything (“voting” and encouragement to download by interested parties). Coming up with an objective measure is challenging, which is one reason why citation has persisted for so long, despite its own flaws as a metric (and, indeed, the inappropriate use to which some people put the metric).
Many journals and other reputable websites use COUNTER, an independent organisation that authenticates their claimed downloads. However, this is mainly focused on eliminating “robot” web crawlers and spammers, and on not counting multiple downloads from a single source — not on catching people who “game” the system by telling all their friends and relations to download the paper sight unseen.
If you write an Amazon review of a book, Amazon’s system prevents you from rating your own review. Perhaps online voting on articles needs a similar check and balance, preventing the authors of an article and their associates (the latter would be hard) from voting. Such a scheme would, I think, require the potential voter to register or provide an ID, which he or she may not want to do.
Anyway, for now, any claim of numbers of downloads or “hits”/“page views” is worth taking seriously only if the site is “COUNTER compliant”, which can be quickly assessed via a kitemark on the site concerned.
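The kind of filtering described above can be sketched in a few lines. This is a simplified illustration, not COUNTER's actual implementation: the robot list is a stand-in, and the 30-second window follows the spirit of COUNTER's double-click rule, under which rapid repeat requests for the same item by the same user count as one download.

```python
# Stand-in robot list; real COUNTER compliance uses a maintained
# registry of known robot/crawler user agents.
ROBOT_AGENTS = {"googlebot", "bingbot", "crawler"}

def count_downloads(log, window=30):
    """Count downloads in a simplified COUNTER-like fashion:
    - drop requests from known robot user agents;
    - collapse repeat requests for the same item by the same user
      within `window` seconds into a single counted download.
    Each log entry is (timestamp_seconds, user_id, item_id, agent)."""
    last_seen = {}
    total = 0
    for ts, user, item, agent in sorted(log):
        if agent.lower() in ROBOT_AGENTS:
            continue
        key = (user, item)
        if key in last_seen and ts - last_seen[key] < window:
            last_seen[key] = ts  # still a double-click; not counted
            continue
        last_seen[key] = ts
        total += 1
    return total

log = [
    (0, "u1", "paper-1", "Mozilla"),
    (10, "u1", "paper-1", "Mozilla"),    # double-click: not counted
    (100, "u1", "paper-1", "Mozilla"),   # later revisit: counted
    (5, "bot", "paper-1", "Googlebot"),  # robot: excluded
    (7, "u2", "paper-1", "Mozilla"),
]
print(count_downloads(log))  # 3
```

Note that nothing in this filtering can detect the coordinated-but-human downloading Maxine describes — each friend is a distinct, legitimate-looking user — which is exactly the limitation of download counts as a quality metric.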
Phil and Maxine–
I think both of your comments speak to the “social” nature of social metrics. In many ways, they’re based more on networking than they are on objective measures of quality, and that’s why I often find them problematic. The author who has lots of friends in science is more likely to get lots of positive comments and ratings on his paper than the snarling unfriendly author who doesn’t schmooze as well or write a funny blog with pictures of cats. Research merit should not be measured by how neighborly one is.
The conflict of interest question Phil raises can perhaps be explained away by the open and social nature of the review system implemented. There is little avenue for the publishers to intervene and police for any conflicting activities, as the system is deliberately hands-off by design. Often such social systems, over time, become self-policing. However, when there are careers and funding at stake, social rating systems are generally filled with shills and astroturfers (examples here and here from last week alone).
The high barrier to entry is likely why citation has remained as the dominant means of post-publication peer review. It’s much harder to game the ratings when you have to publish a new piece of research in order to do so.
I’m fond of using the examples of cold fusion and “The Bell Curve” (1994) as instances where controversy probably created a lot of citations during the height of the debates but where quantity of citations would surely be a misleading indicator of quality of research. A more sophisticated system of using citation as a metric for quality would weight the places where citations occurred, so that citations in higher quality journals would count for more than those in lower quality journals. Has this ever been tried?
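Venue-weighted citation schemes have indeed been tried — the Eigenfactor metric, for example, weights citations by the standing of the citing journal, derived PageRank-style from the citation network. A minimal sketch of the basic idea, with hypothetical journal names and weights (real schemes compute these weights iteratively rather than assigning them by hand):

```python
# Hypothetical venue weights; schemes like Eigenfactor derive these
# from the citation network itself rather than fixing them a priori.
VENUE_WEIGHT = {"top_journal": 3.0, "mid_journal": 1.5, "low_journal": 0.5}

def weighted_citation_score(citing_venues):
    """Score an article by summing the weight of each citing venue,
    so a citation from a high-prestige journal counts for more than
    one from a low-prestige journal. Unknown venues get weight 1.0."""
    return sum(VENUE_WEIGHT.get(v, 1.0) for v in citing_venues)

# Two articles with the same raw citation count (3) score differently
print(weighted_citation_score(["top_journal", "top_journal", "mid_journal"]))  # 7.5
print(weighted_citation_score(["low_journal"] * 3))  # 1.5
```

Such weighting would damp, though not eliminate, the inflation from controversy-driven citations if the ensuing debate played out largely in lower-tier venues.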
Economists Mark McCabe and Chris Snyder also question the value of a citation when it can be bought with additional convenience. In their article, reviewed a few weeks ago on the Scholarly Kitchen, they write:
If a small change in the convenience of access can cause a quadrupling of citations, then the typical citation may be of marginal value, used to pad the reference section of citing articles rather than providing an essential foundation for subsequent research. According to this view, citations would be at best a devalued currency, subject to manipulation through the choice of publication outlet. On the other hand, the finding of little or no citation boost would resuscitate the view of citations as a valuable currency and as a useful indicator of an article’s contribution to knowledge.