According to a study of 11 biological and medical journals that allow authors the choice of making their articles freely available from the publisher’s website, few show any evidence of a citation advantage. For those that do, the effect appears to be diminishing over time.

The study, “Author-choice open access publishing in the biological and medical literature” (a copy of the final manuscript is also available from the arXiv), analyzed over eleven thousand articles published in these journals since 2003; sixteen hundred of these articles (15%) adopted the author-choice open access model. Oxford University Press journals make up 6 of the 11 journals analyzed in the study.

Since open access publication fees can amount to several thousand dollars, the author (ok that’s me) goes on to estimate the cost of each additional citation. While the cost-per-additional-citation was as low as about $400 (for PNAS), it approached $9,000 for the journal Development, published by the Company of Biologists.
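For readers who want to run this kind of arithmetic on their own numbers, here is a minimal sketch in Python. It assumes the cost per additional citation is simply the open access fee divided by the extra citations attributed to open access; that is a simplification for illustration, not necessarily the exact calculation in the paper. The function name and the example inputs are hypothetical and are not figures from the study.

```python
def cost_per_additional_citation(fee, baseline_citations, advantage):
    """Return the fee divided by the extra citations attributed to open access.

    fee: open access publication fee in dollars (hypothetical input)
    baseline_citations: citations expected without open access
    advantage: fractional citation advantage, e.g. 0.25 for 25%
    """
    extra_citations = baseline_citations * advantage
    if extra_citations <= 0:
        # No measurable advantage: each additional citation is unboundedly costly.
        return float("inf")
    return fee / extra_citations


# Hypothetical example: a $3,000 fee, 10 baseline citations, 25% advantage.
print(cost_per_additional_citation(3000.0, 10.0, 0.25))  # 1200.0
```

The smaller the citation advantage, the faster this figure grows, which is why a small or shrinking effect makes the fee hard to justify on citations alone.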

Considering the evidence that author-choice open access publishing may have little effect on article citations, it is worthwhile for authors to consider the cost of this form of publishing. If a citation advantage is the key motivation for authors to pay open access fees, then the cost of each additional citation can be quite high for some journals.

This study is a follow-up to a controlled trial of open access publishing, published in BMJ, in which articles were randomly assigned to open access or traditional subscription access. The authors reported no difference in citations in the first year. Since the current study observes the effect of author-choice open access, self-selection may play a role in explaining the results.

As the author of this study, I hope that readers do not come away with the feeling that I’m advocating against an author-choice program. There may be many benefits to making scientific results freely available; however, scientists should understand that open access may not buy them more citations.

Free dissemination of the scientific literature may speed up the transfer of knowledge to industry, enable scientists in poor and developing countries to access more information, and empower the general public. There are clearly many benefits to making one’s research findings freely available to the general public – a citation advantage may not be one of them.

Phil Davis


Phil Davis is a publishing consultant specializing in the statistical analysis of citation, readership, publication and survey data. He has a Ph.D. in science communication from Cornell University (2010), extensive experience as a science librarian (1995-2006) and was trained as a life scientist. https://phil-davis.com/

Discussion

5 Thoughts on "Author-choice Open Access Publishing"

Confirmation Bias and the Open Access Advantage:
Some Methodological Suggestions for Davis’s Citation Study

Full text: http://openaccess.eprints.org/index.php?/archives/451-guid.html

SUMMARY: Davis [2008] — http://arxiv.org/pdf/0808.2428v1 — analyzes citations from 2004-2007 in 11 biomedical journals. For 1,600 of the 11,000 articles (15%), their authors paid the publisher to make them Open Access (OA). The outcome, confirming previous studies (on both paid and unpaid OA), is a significant OA citation advantage, but a small one (21%, 4% of it correlated with other article variables such as number of authors, references, and pages). The author infers that the size of the OA advantage in this biomedical sample has been shrinking annually from 2004-2007, but the data suggest the opposite. In order to draw valid conclusions from these data, the following five further analyses are necessary:

(1) The current analysis is based only on author-choice (paid) OA. Free OA self-archiving needs to be taken into account too, for the same journals and years, rather than being counted as non-OA, as in the current analysis.
(2) The proportion of OA articles per journal per year needs to be reported and taken into account.
(3) Estimates of journal and article quality and citability in the form of the Journal Impact Factor and the relation between the size of the OA Advantage and journal as well as article “citation-bracket” need to be taken into account.
(4) The sample-size for the highest-impact, largest-sample journal analyzed, PNAS, is restricted and is excluded from some of the analyses. An analysis of the full PNAS dataset is needed, for the entire 2004-2007 period.
(5) The analysis of the interaction between OA and time, 2004-2007, is based on retrospective data from a June 2008 total cumulative citation count. The analysis needs to be redone taking into account the dates of both the cited articles and the citing articles, otherwise article-age effects and any other real-time effects from 2004-2008 are confounded.

The author proposes that an author self-selection bias for providing OA to higher-quality articles (the Quality Bias, QB) is the primary cause of the observed OA Advantage, but this study does not test or show anything at all about the causal role of QB (or of any of the other potential causal factors, such as Accessibility Advantage, AA, Competitive Advantage, CA, Download Advantage, DA, Early Advantage, EA, and Quality Advantage, QA). The author also suggests that paid OA is not worth the cost, per extra citation. This is probably true, but with OA self-archiving, both the OA and the extra citations are free.

Stevan,
Our study focuses on estimating the effect of author-choice open access on article citations. The 11 journals were selected because they gathered enough paid open access submissions to make a statistical analysis even potentially possible. Still, if the open access effect is small, a larger sample size is required to detect a signal amongst the noise, which is why I aggregated the 11 journals for subsequent analyses. PNAS contributed so many articles to the aggregate dataset (about a third) that I didn’t want this one journal to dominate the results, hence the tables report the analyses with and without PNAS.

Secondly, while aggregating the journals resulted in increased statistical power, we are combining articles published in different scientific fields (biology, medicine, bioinformatics, plant sciences, and multi-disciplinary sciences), which is why journal impact factors are not used as an explanatory variable. Please note that I did include the variable Journal as either a random variable (Table 2) or a fixed variable (Table S2), so journal-to-journal variation is being accounted for in the model.

RE: Harnad point #4

Because of the sheer number of articles published by PNAS, tracking the performance of each article was considered too onerous. As a result, I tracked the first and last 6-month cohort of articles (June-Dec 2004; and June-Dec 2006). By choosing the first and last cohort, I could estimate a temporal trend in the data. Please remember that PNAS was only one of the 11 journals analyzed in this study, and that Gunter Eysenbach’s study (PLoS Biology, 2006) analyzed only a 6-month cohort in one journal (PNAS, June-Dec, 2006). Granted, a full dataset from PNAS would have been ideal, and I encourage Prof. Harnad to gather and share the intervening years if he feels that the missing data points would significantly change the results of this study. My sense is that they won’t, but I challenge Prof. Harnad to prove me wrong.

I think the scientific community may be slightly different than most other industries when it comes to an exchange of information. In most industries the life cycles are such that even if a competitor finds your research immediately, they are years behind. In some scientific studies we need to make sure the researchers and developers of new technology are adequately compensated for their breakthroughs.
