Image via mpclemens

It’s often said that history is written by the victors.

In the retelling of history, a simple storyline is constructed that reinterprets past events into a coherent narrative.  When facts don’t fit into that narrative, they are edited, reinterpreted, or simply torn out.

Individuals engaged in battles of ideology also have a tendency to revise and reinterpret the past. Old arguments mysteriously disappear from web pages or get revised. Those who are challenged on past remarks respond with a claim of simple misunderstanding or that their words were taken out of context. What is at stake here are not lives but egos.

While this kind of reinterpretation of history is not as offensive as, say, the denial of genocide, it’s still a form of historical revisionism. In the academic realm, it is both disingenuous and distasteful.

The Open Access Citation Advantage: Studies and Results To Date is the latest report from Key Perspectives publishing consultant Alma Swan.  The document is freely available from the University of Southampton’s institutional archive and has been widely promoted on several listservs by open access evangelist Stevan Harnad.

In her report, Swan summarizes the studies posted on Steve Hitchcock’s bibliography of open access studies, and her summary is quite useful for those looking for a distillation of the research literature.  What is troubling with this document, however, is the narrative, and it is here that Swan creates a historical revision of the open access debate.

The argument that open access leads to increased citations (coined the “open access citation advantage” by Stevan Harnad for rhetorical purposes) is not that old.  Most of us remember Steve Lawrence’s 2001 letter to Nature, the debate it ignited, and a flurry of unqualified claims that persist to this day.

“articles that are made open access . . . are cited twice as much as those articles (in the very same journals and years) that are not” (Harnad, 2006)

What’s more, the strength of conviction appears to grow with each iteration of the OA citation advantage claim as each proponent paraphrases from the last:

Study after study has shown that free online access increases the impact of research literature, as measured by citations, 50 percent to 250 percent. (Peter Suber, 2005)

. . . in study after study and discipline after discipline, that open access is associated with increased citations for authors and journals, when compared to similar work that is not open access (John Willinsky, 2006)

Study after study has shown that articles available freely online are more widely cited than those that are available only through publishers’ access-limited venues. (Stuart Shieber, 2009)

Statements such as these have also been the focus of ARL/SPARC advocacy campaigns and are found frequently on university library web pages:

Open access articles are cited more often than those in limited access subscription journals (Alliance for Taxpayer Access, 2004)

Studies have shown that open-access articles are cited at a higher rate than those with restricted access. (Columbia University Libraries)

Given the recorded history of this debate, it would seem strange — even disingenuous — to assert anything but a simple and unqualified connection between access and citations and the implication of causality.  The literature is full of examples of rhetorical prose from influential thinkers of the time, yet Swan appears to disregard the past, promoting a new revision of historical events.  Her document is full of such statements:

There certainly was not, even early on, an expectation amongst the thinkers on this topic that OA can work magic and make the uncitable suddenly citable

That OA would produce an automatic citation boost for every article was never the expectation

Thus a blanket ‘OA boost’ to citations of, say, 50% was never considered probable.

The Limits of Meta Analysis

In promoting Alma Swan’s report, Stevan Harnad asserts that the time is “ripe for a meta-analysis” of existing studies. Indeed, in her report, Swan provides a score sheet for the open access game:

  • Studies finding a positive open access citation advantage = 27
  • Studies finding no open access citation advantage (or an OA citation disadvantage) = 4

If science were a matter of tallying consenting and dissenting views, the world would still be flat, the sun would still be revolving around the Earth, and Thalidomide would still be prescribed as a sedative for pregnant women.

Scientific truth is hardly a game of populism.

Meta-analysis is a set of powerful statistical techniques for analyzing the literature.  Its main function is to increase the statistical power of observation by combining separate empirical studies into one über-analysis.  It’s assumed, however, that the studies are comparable (for instance, the same drug given to a random group of patients with multiple myeloma), but conducted at different times in different locales.
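To make the idea of "combining separate empirical studies" concrete, here is a minimal sketch of one common pooling technique, fixed-effect inverse-variance weighting. The effect sizes and standard errors are hypothetical numbers invented for illustration; they are not taken from Swan's report or from any open access study, and the function name is my own.

```python
import math

def pooled_fixed_effect(effects, std_errors):
    """Return the inverse-variance weighted mean effect and its standard error."""
    # Each study is weighted by 1 / SE^2, so more precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    # The pooled standard error shrinks as the total weight grows.
    pooled_se = math.sqrt(1.0 / total_weight)
    return estimate, pooled_se

# Hypothetical effect sizes and standard errors for three comparable studies.
effects = [0.10, 0.30, 0.20]
std_errors = [0.10, 0.20, 0.10]

estimate, se = pooled_fixed_effect(effects, std_errors)
print(f"pooled estimate = {estimate:.3f}, standard error = {se:.3f}")
```

The pooled standard error comes out smaller than that of any single study, which is where the gain in statistical power comes from. The arithmetic only means something if every study estimates the same underlying effect, which is the comparability assumption described above.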

This is not the case with the empirical literature on open access and citations.  Most of the studies to date are observational (simply observing the citation performance of two sets of articles), and most of these use no statistical controls to adjust for confounding variables.  Some of the studies have focused on the effect of OA publishing, while others on OA self-archiving.  To date, there is still only one published randomized controlled trial.

Conducting a meta-analysis on this disparate collection of studies is like taking a Veg-O-Matic to a seven-course dinner.  Not only does it homogenize the context (and limitations) of each study into an unseemly mess, but it assumes that homogenization of disparate studies somehow renders a clearer picture of scientific truth.

Nothing could obscure the accurate history of this topic more.

Phil Davis

Phil Davis is a publishing consultant specializing in the statistical analysis of citation, readership, publication and survey data. He has a Ph.D. in science communication from Cornell University (2010), extensive experience as a science librarian (1995-2006) and was trained as a life scientist.


30 Thoughts on "Rewriting the History of the Open Access Debate"

Entertaining writing, well done. But I am rather surprised that you feel entitled to challenge the development of my own expectations. Surely I am allowed to write about them as a matter of fact, since I had them?

So, no ‘rewriting of history’ there, I’m afraid. Just an explanation of how thinking has developed. It’s never been naive and uncritical as you like to portray. Sometime, someone with real historical expertise will write the history of all this in its full and illuminating context. Everything’s there on the Web for them to consult and interpret, including all these debates. It will make a very good read. I certainly was not attempting to be a historian, merely a documenter, but I eagerly await the verdicts of time. I am also looking forward to documenting all the studies still to come that will be measuring OA impact in new ways. There is much of interest ahead.

I shall be putting the table from the article up on the Web somewhere and adding new studies to it as they are published, in order to provide a permanently up-to-date list of them all, along with details of their methodology and findings. The links to the originals are there for follow-up, so that people can study them and make their own minds up about their validity and what they show.

Intelligent people don’t need to be told what they should think about these things, or pointed to particular studies as the only ones that should matter. They can read all of the studies and reach their own conclusions, including about which ones have been carried out in the best way.

The ‘score sheet’ was the only sensible way of summarising the pages-long table. People often ask me how the studies tally up, so tally them up I did.

Thanks for your early reply. I’m not convinced by your argument that your review article is merely a piece of personal exploration and therefore resides outside the domain of scholarly criticism.

By taking this approach, all of science is reduced to personal reflection. It is a tautological argument that gets us no closer to answering the very pragmatic and objective question: does access affect citation behavior?

Moreover, your article has all of the semblances of a staid and serious scientific article. You identify yourself and your affiliation; you provide an abstract; your review follows the tabular format of most scientific review articles; you use references.

This is not a letter to the editor of a British tabloid that begins with the phrase, “I am outraged…”

Indeed, the context of the paper is scientific. It is placed in a university repository and promoted by a university professor. And given your track record of publication in scientific and professional journals, I would expect this review to show up in a peer-reviewed outlet sometime in the future.

Yes, intelligent people can read the reports themselves and make up their own minds. We assume (and perhaps now wrongly so) that you are taking the role of a serious researcher/consultant. If this is not the case, it may be prudent to make your intentions known to the public now so we are no longer deceived.

In reading over the methodologies of the research Swan highlights, her paper seems to stem from the same motivation as can be found in the old joke that ends with the punchline, “There must be a pony in here.”

Question: this Wired article makes mention of thalidomide in a similar context, as an exemplar of the dangerous things that can happen when one relies on a quantity of scientific studies that are incorrectly performed. Does that make Wired a “sewer” as well?

Unfortunately I must agree with AJ Cann above, and I think Alma was very generous in her response to this melodramatic, condescending rant.

This is the link to the randomized controlled trial of open access that Philip referred to

The RCT found more downloads with open access, but not more citations. This finding raises the question of whether authors are citing an article after only reading the abstract. I doubt it as I suspect most citing authors have personal or institutional access to the articles they cite. However, it might be worth studying the effect of open access on citation accuracy. Conceivably open access might reduce the phenomena described in “How citation distortions create unfounded authority: analysis of a citation network” (

I hardly think that Philip’s critique deserves the epithet of a ‘rant’; rather, it seems to me to be a well-argued defence of a properly scientific approach to the question. I notice that Prof. Swan does not take up the issue of the proper use of meta-analysis, no doubt because a defence of the choice of method is pretty well impossible. Philip is right, the studies are so different and so many intervening variables have not been taken into consideration, that to conclude that OA leads to increased citation is simply, in the Scottish court judgement, ‘not proven’. Coming to a personal opinion on an issue is not the same as demonstrating, scientifically, the probable truth of a hypothesis. Personal opinions are valid – especially for the person holding them, but they ought not to be published as if they were the result of genuine scientific research.

All this talk of meta-analysis makes me think of the excellent Cochrane systematic reviews. There they perform an exhaustive search for relevant studies and in order to remove any suspicion of search bias they publish the exact search methodology adopted. They then examine the quality of the methodology of each study before deciding whether to include it or not. If and only if the design quality of the identified studies is sufficiently high, and the studies are sufficiently similar, then the data can be combined into a meta-analysis. In the hierarchy of evidence a systematic review containing one or more well designed RCTs is classed as the highest form of evidence. Currently we appear to have one RCT which didn’t find a positive OA citation effect, and then a variety of other studies, some better designed than others, some which found an OA effect, and some which didn’t. Talk of a meta-analysis is, in my opinion, premature, and until then I’d have to back the RCT.

Perhaps someone would like to take a crack at explaining whether or not the 27 OA-positive studies are valid and why?

It just seems to be splitting hairs to say that because you can’t do a formal meta-analysis, you can’t consider the totality of studies done and what the general consensus was.

Is anyone else struck by the amazing similarity of this debate to the climate change debate? You have your ideologues on both ends: the evangelists and the deniers who are most interested in promoting a preconceived and immutable dogma and who cherry-pick their evidence to “prove” their righteousness.

And there are the thinkers closer to the center: the believers and the skeptics who focus mainly on the actual data and what it tells us, and who are more willing to adjust theory to fit the facts.

But the conversation is so easily co-opted and shouted down by the ideologues that it becomes impossible to have any kind of reasoned debate in any setting. At least with OA, millions probably will not wind up floating around on rafts if the believers win out in the end.

It shouldn’t be surprising. Harnad posted no less than 5 rapid responses to our article when it was published back in 2008 and I responded to his concerns/criticisms back then.

The latest rant of his seems to be an attempt to divert attention from the criticism his coauthor received on a recent manuscript, see:

In fact, I have to date not seen any adequate response from David in this matter (Harnad’s specific questions he refers to could not have been answered “back then” because they were formulated later and in light of new evidence.) Instead he simply refuses to answer.

I cannot see that there is any resemblance at all between the two debates. I doubt that Philip is a ‘denier’ of the numerous advantages of OA and, as the editor of an OA journal, I am certainly an advocate. One can doubt the scientific value of a study without denying the value of the thing studied – in the climate debate, the deniers deny the possibility that human use of carbon-based resources is a cause of climate change. Quite a different kind of position. I’m also puzzled as to where one finds these ‘ideologues’ – I cannot see that ideology of any kind is raised in the discussion.

I would categorize Phil more in the skeptic camp, not a denier. Someone who wants to see the data demonstrate convincingly that the citation effect is real, and so far has not seen it. I’ve never thought he is opposed to OA in principle.

Ideologues are those who support or oppose OA regardless of evidence one way or another, and who won’t be convinced by any reasonable arguments to the contrary. All the evidence in the world isn’t going to move them in the other direction. I’m not implying political Left or Right in this matter; it’s merely an analogy.

Far more issues about OA and meta analysis have been raised in this thread than I can comment on. But having dedicated 35 years of my efforts to meta analysis and 20 to OA, I can’t resist a couple of quick observations.
Holding up one set of methods (be they RCT or whatever) as the gold standard is inconsistent with decades of empirical work in meta analysis that shows that “perfect studies” and “less than perfect studies” seldom show important differences in results. If the question at hand concerns experimental intervention, then random assignment to groups may well be inferior as a matching technique to even an ex post facto matching of groups. Randomization is not the royal road to equivalence of groups; it’s the road to probability statements about differences.
Claims about the superiority of certain methods are empirical claims. They are not a priori dicta about what evidence can and can not be looked at.

I happen to have admin access to an open access journal. We had over 16,000 article views in the first 15 months of existence. If I magically turned the journal into a subscription journal for a year (Don’t worry, I’m not going to do it), you can bet that our article views would go down significantly. It would then follow that our citation rate would also go down.

For all of the society presses out there that provide free backfile access to their content (1 year embargo or whatever), they know that they have a HUGE number of article downloads, and hence higher citation rates for the older articles, too. Why don’t you see the connection? It has been shown that articles/journals that have more downloads also have more use and more citations.

In consumer publishing the number of content views is the gold standard because it is indicative of the content’s monetary potential. In scholarly publishing the citation count is the (current) gold standard because it indicates scientific “importance” independent of usage – a valuable social function. It is therefore alarming to learn that citation counts may be inflated simply because of the author’s chosen publication format. If true, this phenomenon dilutes the value of citation counts as a reliable indicator of societal value because it suggests that the count is (at least partly) the result of publication format rather than content quality. If accurate, the research does not make the case for or against OA, but simply suggests that in future citation counts (and impact factors etc.) should be normalized to reflect the “format effect”.

Isn’t this backwards? Isn’t the ‘social function’ of citation skewed by differential visibility of papers? Imagine a world in which all papers are OA, equally visible to search engines? In such a world, it does not matter if a paper was published in a GlamourMag or a tiny society journal, or arXiv – the good papers would be noticed, downloaded and yes, cited more. Thus the ‘social function’ of citation counting would finally approach something like real reflection of the quality of the work (let’s now put aside all the problems with citations in general: citing bad work to criticize it, copy+pasting classics in the field without reading, mis-citation of papers for saying something they don’t say, self-citation, preferentially citing friends, etc.)

Yes, in a very literal sense a document with valuable information that is read by more people provides greater benefit to society. But equally (more?) important is the “feedback signal” from the scholarly publishing process that informs future research investment decisions. This can’t just be a readership popularity contest. Hundreds of billions are spent each year on research, so it’s critical for the citation “signal” to be clear and unbiased by that author’s choice of publication format. I agree the problem of measuring relative impact in different publication models goes away if all publication is in the OA model.

How do we know what “early expectations” of open access were, before people started looking at the effects it had in practice on citation counts? Were there studies published? If so, reference to those would be very helpful.
