Do institutional mandates requiring authors to self-archive their papers lead to higher citation rates? A new analysis argues that they do, yet closer inspection may give one pause.

The article, **“Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research,”** by Yassine Gargouri and others at the University of Quebec at Montreal, was deposited into the arXiv on January 3rd by self-archiving advocate Stevan Harnad, who is the last author of this paper. [Postnote: the article was published on Oct 18 in *PLoS ONE*]

Comparing 6,000 mandated self-archived papers deposited in four institutional repositories (Southampton University, CERN, Queensland University of Technology, and Minho University in Portugal) with 21,000 control articles selected by title-word similarity, the researchers were interested in isolating and measuring the citation effect when authors willingly deposit articles of their own accord (self-selection) versus when their institution mandates deposit.

Discovering an independent citation advantage for mandated self-archived articles would suggest that open access (OA) is likely a real cause of increased citations; several articles in the past have argued that higher-quality articles are more likely to be self-archived and that the relationship between open access and increased citations is merely a spurious association (e.g. Kurtz (2005, 2007), Moed (2007), Davis (2007, 2008)).

Gargouri reports that institutionally mandated OA papers received about a 15% citation advantage over self-selected OA papers, which seems somewhat counter-intuitive. If better articles tend to be self-archived, their reasoning goes, we should expect papers deposited under institution-wide mandates to *under-perform* those where the authors select which articles to archive. The authors of this paper deal, rather unscientifically, with this inconvenient result by a quick statistical dismissal: that their finding “might be due to chance or sampling error.” But even this explanation doesn’t hold water: their data set is large enough to detect even very small effects, and they report a statistically significant effect (p=0.048) in their appendix. This fact seems to be conveniently ignored.

Similar inconsistencies are present throughout. For instance, an independent effect due to institutional self-archiving mandates is present for articles that receive medium numbers of citations (Figure 4), but not for articles that fall in the lowest- and highest-citation groups.

The main weakness in this study stems from how the researchers deal with their data, forming ratios that compare the logarithm of citation performance of different kinds of articles (e.g. OA, mandated versus OA, self-selected). Why is this so problematic? Consider the following citation ratio scenarios:

- log 3/log 2 = 1.6
- log 30/log 20 = 1.14
- log 1/log 2 = 0
- log 2/log 1 = undefined (division by zero, since log 1 = 0)
- log 2/log 0 = undefined (log 0 does not exist)

In comparing scenario #1 with #2, both show the same raw citation ratio (3/2 = 30/20) and yet when you take the log of these numbers before dividing you get very different answers: a 60% citation differential for #1 but only a 14% differential for #2. In scenario #3, a ratio that shows a 50% citation difference before log transformation becomes 0% after transformation. Furthermore, any article in the denominator that receives one or zero citations makes the entire ratio unusable (e.g., #4 and #5). Henk Moed criticized this approach (used by Harnad and Brody in an earlier paper) as methodologically problematic, since it can result in extremely high ratio values for very small citation differences. You can see the effect of this ratio approach in Figure 2, where the most recent year of papers (2006) shows extraordinarily high impact.
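The distortion is easy to verify numerically. Here is a minimal Python sketch of the five scenarios (the citation counts are illustrative, not taken from the paper):

```python
import math

def ratio_of_logs(a, b):
    """Ratio of log-transformed citation counts, as in the scenarios above.

    Returns None when the ratio is undefined: log 0 does not exist,
    and log 1 = 0 puts a zero in the denominator.
    """
    if a <= 0 or b <= 1:
        return None
    return math.log10(a) / math.log10(b)

# Identical raw ratios (3/2 == 30/20 == 1.5) give different transformed values:
print(ratio_of_logs(3, 2))    # ~1.58
print(ratio_of_logs(30, 20))  # ~1.14
print(ratio_of_logs(1, 2))    # 0.0  (a 50% raw difference vanishes)
print(ratio_of_logs(2, 1))    # None (denominator log 1 = 0)
print(ratio_of_logs(2, 0))    # None (log 0 undefined)
```

The same raw 1.5x citation advantage thus yields anything from 1.58 down to 1.14 depending only on the magnitude of the counts, which is the crux of the objection.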

Given the passionate language used in this article, a reader may come to the conclusion that its authors are not being driven by the data, but are bent on selectively reporting and interpreting their results while ignoring those inconvenient truths that do not conform with their preconceived mission — that all institutions should establish mandatory self-archiving policies. They write:

Overall, only about 15% of articles are being spontaneously self-archived today, self-selectively. To reach 100% OA globally, researchers’ institutions and funders need to mandate self-archiving, as they are now increasingly beginning to do. We hope that this demonstration that the OA Impact Advantage is real and causal will give further incentive and impetus for the adoption of OA mandates worldwide in order to allow research to achieve its full impact potential, no longer constrained by limits on accessibility.

Someone with a limited statistical background will find themselves overwhelmed with complicated bar charts and may find it preferable to cite from the abstract. Those with the ability to plow through the analysis may find the authors’ approach rather blunt and repetitive, when simpler and more elegant approaches are available.

Still, the researchers report what others have established before them:

- that characteristics of the article other than access status (e.g. type of article, number of co-authors, length of article, journal impact factor, field and country of authorship) are predictive of future citations,
- that citations are concentrated among a small group of papers, and
- that initial citation differences tend to be amplified over time.

What is conspicuously missing from the discussion is that the citation advantage attributed to free access (Figure 2, OA/not-OA) is much smaller — by a factor of 10 (20% versus 200+%) — than was previously stated, and echoed relentlessly and unconditionally, by these same researchers. This much smaller effect size, if truly attributable to access, is more in line with other rigorous studies.

In sum, this paper tests an interesting hypothesis: whether mandatory self-archiving policies are beneficial to authors in terms of citations. The authors’ unorthodox methodology, however, results in some inconsistent and counter-intuitive findings that are not properly addressed in their narrative.

Given that one of its authors is bent on “ram[ming] open access down everybody’s throats,” I think you’ll hear a lot more on this article.

Thanks for critiquing these findings for those of us not as fluent in statistical sleight of hand.

I believe this paper starts from two very good ideas: doing comparisons within the same journal for articles similar in terms of keywords, and looking at differences induced by institutional mandates.

The statistical methodology is somewhat obscure to me and I would be much more comfortable with a negative binomial regression with journal fixed effects.
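The motivation for a negative binomial model is that citation counts are typically overdispersed: their variance far exceeds their mean, violating the Poisson assumption. A minimal sketch with simulated, citation-like counts (the distribution is a stand-in, not the paper's data):

```python
import random
import statistics

random.seed(42)

# Hypothetical citation counts: heavily skewed with a long tail, as
# citation data typically are. A floored exponential draw mimics this.
citations = [int(random.expovariate(0.2)) for _ in range(5000)]

mean = statistics.mean(citations)
var = statistics.pvariance(citations)

# Poisson regression assumes variance == mean; for overdispersed counts
# (variance >> mean) a negative binomial model, which adds a dispersion
# parameter, is the usual remedy.
print(f"mean = {mean:.2f}, variance = {var:.2f}")
assert var > mean  # overdispersion
```

In practice one would fit such a model with a statistics package (e.g. a GLM with a negative binomial family plus journal dummies for the fixed effects); the sketch above only shows why the Poisson alternative is inadequate.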

On a more substantive level, the paper does not appear to include controls for the institutions of the authors in the control sample. This is particularly worrisome when comparing papers originating from CERN, which arguably does cutting-edge physics, to the control papers.

The key issue in this paper seems to be interpreting the mandated open access versus self-selected open access comparison. The authors find and point out that the mandates actually result in compliance of around 60%. However, they have little to say on what is going on here and why papers end up in the compliant group or not. I am not sure what conclusions can be inferred from this comparison of two types of self-selection, at least one of which is not well understood.

Patrick Gaule

Thanks for your comments.

(1) NEGATIVE BINOMIAL ANALYSIS: If the referees call for a negative binomial analysis in addition to our logistic regressions, we will be happy to analyze the data that way too.

(2) THE SPECIAL CASE OF CERN: With so few institutional mandates, it’s not yet possible to control for institutional quality. But CERN is indeed a special case; when it is removed, however, it does not alter the pattern of our results. This point is discussed in the paper:

– “CERN articles have higher citation counts in the lowest and especially the highest citation range. However, when all CERN articles are excluded from our sample, there is no significant change in the other variables… “Nor are the main effects the result of institutional citation advantages, as the institutions were among the independent predictor variables partialled out in the logistic regression; the outcome pattern and significance is also unaltered by removing CERN, the only one of the four institutions that might conceivably have biased the outcome because its papers were all in one field and tended to be of higher quality, hence higher citeability overall.”

(3) SELECTIVE COMPLIANCE? Mandate compliance is not yet 100%, so some form of self-selection still remains a logical possibility, but we think this is made extremely improbable when there is no decline in the OA Advantage even when mandates quadruple the OA rate from the spontaneous self-selective baseline of 15% to the current mandated average of 60%. This point too is discussed in the paper:

– “The OA advantage is just as great when the OA is mandated (with mandate compliance rate ~60%) as when it is self-selective (self-selection rate ~15%). That makes it highly unlikely that the OA advantage is either entirely or mostly the result of an author bias toward selectively self-archiving higher quality – hence higher citeability – articles… This mandated deposit rate of 60% is substantially higher than the self-selected deposit rate of 15%. Although with anything short of 100% compliance it is always logically possible to hold onto the hypothesis that the OA citation advantage could be solely a self-selection bias (arguing that, with a mandate, the former bias in favor of self-selectively self-archiving one’s more citeable articles takes the form of a selective bias against compliance with the mandate for one’s less citeable articles), but a reasonable expectation would be at least a diminution in the size of the OA impact advantage with a mandated deposit rate four times as high as the spontaneous rate, if it is indeed true that the OA advantage is solely or largely due to self-selection bias.”

I am still worried about the effect of institutional differences on the analysis. The authors have indeed performed a separate analysis in which CERN is left out, but this doesn’t solve the problem. The three other institutes may also produce papers whose quality is on average above the quality level of similar papers (published in the same journal), and this may explain any differences in citation counts. Hence, differences in citation counts may be due to at least two reasons: (1) quality differences among institutes and (2) the presence or absence of a mandatory open access policy. Using the current design, it is not possible to distinguish between these two possibilities.

Suppose that we take my own institute, Leiden university, and that for each paper from my institute we find 10 similar papers from other institutes (published in the same journal). Suppose next that we compare the citation counts of the two sets of papers. My guess is that the papers from my institute will on average receive more citations than the papers from the other institutes. However, there is no mandatory open access policy at my institute. So where do these above average citations come from? A possible explanation is that these citations are due to quality differences between my institute and other institutes doing similar research. This possibility is ignored in the analysis of Gargouri et al.

(a) The logistic regressions in Figures 4 and 7-11 test whether the four mandated institutions make a significant independent contribution to the citation counts in 5 journal impact-factor ranges and 4 citation quartiles for each. Most of the OA citation advantage is concentrated in the top of the four citation ranges in each figure. Of the 4 mandated institutions, Minho and QUT never have an independent citation advantage in those top citation ranges; CERN does twice (top range of Figures 4 & 10), but removing CERN does not alter the pattern; and Southampton does once (top range of Figure 9), but removing Southampton too does not alter the results. The OA advantage is present and strong in all 5 of the top ranges.

See: SUPPLEMENTARY FIGURE S1:

http://eprints.ecs.soton.ac.uk/18346/7/Supp1_CERN%2DSOTON.pdf

(b) Each of the separate paired-sample t-tests for each of the four mandated institutions individually (Figures 12-16) likewise shows the same overall pattern: a citation advantage of the same size as for self-selected OA (as in Figure 2, which shows the joint effect for all four mandated institutions).

(c) If U. Leiden had been in our sample, it would have been computed as an unmandated institution. Its OA articles would have shown the usual OA advantage. In addition, if Leiden is indeed an above-average university, it would have shown an independent citation advantage, as CERN did, especially in the high journal impact-factor ranges and citation quartiles.

Thanks for the feedback. We reply to the three points of substance, in order of importance:

(1) LOG RATIOS: We analyzed log citation ratios to adjust for departures from normality. Logs were used to normalize the citations and attenuate distortion from high values. This approach loses some values when the log transformation makes the denominator zero, but despite these lost data, the t-test results were significant, and were further confirmed by our second, logistic regression analysis. Moed’s (2007) point was about (non-log) ratios that were not used in this study. We used the ratio of log citations and not the log of citation ratios. When we compare log3/log2 with log30/log20, we don’t compare percentages with percentages (60% with 14%) because the citation values are transformed or normalized: the higher the citations, the stronger the normalisation. It is highly unlikely that any of this would introduce a systematic bias in favor of OA, but if the referees of the paper should call for a “simpler and more elegant” analysis to make sure, we will be glad to perform it.

(2) Effect Size: The size of the OA Advantage varies greatly from year to year and field to field. We reported this in Hajjem et al (2005), stressing that the important point is that there is virtually always a positive OA Advantage, absent only when the sample is too small or the effect is measured too early (as in Davis et al’s 2008 study). The consistently bigger OA Advantage in physics (Brody & Harnad 2004) is almost certainly an effect of the Early Access factor, because in physics, unlike in most other disciplines (apart from computer science and economics), authors tend to make their unrefereed preprints OA well before publication. (This too might be a good practice to emulate, for authors desirous of greater research impact.)

(3) Mandated OA Advantage? Yes, the fact that the citation advantage of mandated OA was slightly greater than that of self-selected OA is surprising, and if it proves reliable, it is interesting and worthy of interpretation. We did not interpret it in our paper, because it was the smallest effect, and our focus was on testing the Self-Selection/Quality-Bias hypothesis, according to which mandated OA should have little or no citation advantage at all, if self-selection is a major contributor to the OA citation advantage.

Our sample was 2002-2006. We are now analyzing 2007-2008. If there is still a statistically significant OA advantage for mandated OA over self-selected OA in this more recent sample too, a potential explanation is the inverse of the Self-Selection/Quality-Bias hypothesis (which, by the way, we do think is one of the several factors that contribute to the OA Advantage, alongside the other contributors: Early Advantage, Quality Advantage, Competitive Advantage, Download Advantage, Arxiv Advantage, and probably others). http://openaccess.eprints.org/index.php?/archives/29-guid.html

The Self-Selection/Quality-Bias (SSQB) consists of better authors being more likely to make their papers OA, and/or authors being more likely to make their better papers OA, because they are better, hence more citeable. The hypothesis we tested was that all or most of the widely reported OA Advantage across all fields and years is just due to SSQB. Our data show that it is not, because the OA Advantage is no smaller when it is mandated. If it turns out to be reliably bigger, the most likely explanation is a variant of the “Sitting Pretty” (SP) effect, whereby some of the more comfortable authors have said that the reason they do not make their articles OA is that they think they have enough access and impact already. Such authors do not self-archive spontaneously. But when OA is mandated, their papers reap the extra benefit of OA, with its Quality Advantage (for the better, more citeable papers). In other words, if SSQB is a bias in favor of OA on the part of some of the better authors, mandates reverse an SP bias against OA on the part of others of the better authors. Spontaneous, unmandated OA would be missing the papers of these SP authors. http://www.eprints.org/openaccess/self-faq/#29.Sitting

There may be other explanations too. But we think any explanation at all is premature until it is confirmed that this new mandated OA advantage is indeed reliable and replicable. Phil further singles out the fact that the mandate advantage is present in the middle citation ranges and not the top and bottom. Again, it seems premature to interpret these minor effects whose unreliability is unknown, but if forced to pick an interpretation now, we would say it was because the “Sitting Pretty” authors may be the middle-range authors rather than the top ones…

Yassine Gargouri, Chawki Hajjem, Vincent Lariviere, Yves Gingras, Les Carr, Tim Brody, Stevan Harnad

Brody, T. and Harnad, S. (2004) Comparing the Impact of Open Access (OA) vs. Non-OA Articles in the Same Journals. D-Lib Magazine 10(6). http://eprints.ecs.soton.ac.uk/10207/

Davis, P.M., Lewenstein, B.V., Simon, D.H., Booth, J.G., Connolly, M.J.L. (2008) Open access publishing, article downloads, and citations: randomised controlled trial. British Medical Journal 337:a568. http://www.bmj.com/cgi/reprint/337/jul31_1/a568

Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year Cross-Disciplinary Comparison of the Growth of Open Access and How it Increases Research Citation Impact. IEEE Data Engineering Bulletin 28(4) 39-47. http://eprints.ecs.soton.ac.uk/11688/

Moed, H. F. (2007) The effect of ‘Open Access’ upon citation impact: An analysis of ArXiv’s Condensed Matter Section. Journal of the American Society for Information Science and Technology 58(13): 2145-2156. http://arxiv.org/abs/cs/0611060

As far as I can see, what the paper by Gargouri et al. seems to show basically is that publications of the four institutes being studied on average receive more citations than other similar publications (in the same journal). What is not clear to me is why this difference should be due to an open access effect. Another explanation could be that the publications of the four institutes are on average of higher quality than other similar publications and that, as a consequence, the publications of the four institutes on average receive more citations. As is also remarked by Patrick Gaule, the analysis of Gargouri et al. does not seem to control for differences among institutes (such as one institute on average doing better research than another).

Ludo Waltman

Thanks for the feedback. Please see reply to Gaule, above.

Stevan,

Granted, you may be more interested in what the referees of the paper have to say than my comments; I’m interested in whether this paper is good science, whether the methodology is sound and whether you interpret your results properly.

For instance, it is not clear whether your Odds Ratios are interpreted correctly. Based on Figure 4, OA articles are MORE LIKELY to receive zero citations than 1-5 citations (or conversely, LESS LIKELY to receive 1-5 citations than zero citations).

You write:

“For example, we can say for the first model that for a one unit increase in OA, the odds of receiving 1-5 citations (versus zero citations) increased by a factor of 0.957.” (p. 9)

Similarly, in Figure 4 (if I understand the axes correctly), CERN articles are more than twice as likely to fall in the 20+ citation category as in the 1-5 citation category, a fact that may distort further interpretation of your data, as institutional effects could explain your Mandated OA effect. See comments by Patrick Gaule (#2) and Ludo Waltman (#4) above.

Thank you for your comments.

As noted on page 9 of our draft, in the first model, for a one-unit increase in OA, the odds of receiving 1-5 citations (versus zero citations) increased by a factor of 0.957. The dependent variables are:

Cit_a_0&1-5 = 1 (and not 0, as Davis seems to interpret it) if the citation count (minus self-citations) is between 1 and 5

and

Cit_a_0&1-5 = 0 if the citation count (minus self-citations) = 0.

As noted in the paper, we re-analyzed the results with and without CERN, and the result pattern was the same. If the referees request it, we will include both analyses.

Thanks Yassine,

I’ve re-read page 9 and I’m still not sure what Cit_a_0&1-5 means, but an Odds Ratio that is below 1.0 means a DECREASE in odds, not an increase.

Your confusion is our fault. The problem is with our having used the value 0.957 by way of an illustration. We should have chosen a better example, where Exp(β) is clearly greater than 1; the value 0.957 is too close to 1, but below it, to serve as an illustration. We should have said: “For the second model, for a one-unit increase in OA, the odds of receiving 5-10 citations (versus 1-5 citations) increased by a factor of 1.323.” (This clearer example will be used in the revised text of the paper.)
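The direction of an odds ratio is easiest to check numerically. A small sketch, using the two Exp(β) values quoted above and a purely hypothetical baseline probability:

```python
def apply_odds_ratio(p_baseline, odds_ratio):
    """Return the probability after multiplying the baseline odds by odds_ratio."""
    odds = p_baseline / (1 - p_baseline) * odds_ratio
    return odds / (1 + odds)

p0 = 0.50  # illustrative baseline probability (odds = 1)

# Exp(beta) = 0.957 (< 1): a one-unit increase *decreases* the odds,
# so the resulting probability falls below the baseline.
print(apply_odds_ratio(p0, 0.957))  # ~0.489

# Exp(beta) = 1.323 (> 1): the odds *increase*.
print(apply_odds_ratio(p0, 1.323))  # ~0.570
```

This is exactly Phil's point: an odds ratio below 1.0 is a decrease, however close to 1 it sits.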

As indicated in some of the previous comments above, the rhetoric in the paper is rather uncompromising and “direct”; nevertheless, the claim of causality seems well beyond the mark. Neither former research nor the current regression design permits any causal claims.

Further, the paper is flawed in its treatment of statistical significance – like many other papers on this and related citation-advantage topics. I counted the word stem “significan-” 22 times in the actual paper (and 1 in the abstract), and only in two cases in combination with “statistically”. As far as I can see, all 23 mentions refer to “statistical significance”. The problem here is that “statistical significance” becomes a dichotomous decision tool for whether the results are practically, theoretically, or scientifically significant – this of course is a (widespread) misunderstanding. P-values are not indicative of the meaningfulness or importance of a finding. This can only be evaluated in the context of theory or application. In most cases the meaningfulness of a finding is reflected in the effect size or parameter estimate, and these estimates can have large or small p-values, depending on the sample size.

Using “significance tests” to say something about the results being due to chance is also a misunderstanding as a p-value is not indicative of the likelihood that the results were due to chance. This misunderstanding results from identifying p(Data|Ho) with p(Ho|Data). To evaluate the probability that the results were due to chance, one needs to know the a priori probability that the null hypothesis is true, in addition to other probabilities (the probability that the alternative hypothesis is true, the probability of the data under the alternative hypothesis).

So a p-value is the probability of these data (or data more extreme) GIVEN that Ho is true! However, in non-experimental research, the null hypothesis (typically a nil hypothesis of no difference or no correlation) is almost always false. In the social sciences, “everything is correlated with everything else” to some non-zero degree, a phenomenon known as “the crud factor”. These correlations exist for a combination of interesting and trivial reasons. If the null hypothesis is unlikely to be true to begin with, testing the null hypothesis is not especially useful. Moreover, it is unclear how much support a theory should gain if the null hypothesis is rejected, because there are hundreds of reasons why the null hypothesis (a statistical hypothesis, not a theoretical hypothesis) may be false that have nothing to do with the theory of interest.

Finally, it is commonly believed that the p-value corresponds to the reliability or replicability of the result. Assuming the null hypothesis is false, the replicability of a result is a function of statistical power, and power is independent of the p-value of any one study. P-values cannot convey information about reliability.

One final note – the statistician George Box once said that all models are false, but some models are useful. The paradox, of course, is that we continue to make statements about random variables which are conditional on the assumption that the model describing them is true! Most often we discuss whether we have included and ensured control of various independent variables, but we should in fact start by questioning whether the randomization model needed for inferential statistics is true! If it is not, then p-values become meaningless. Where does that leave us and the authors of the paper? Size matters! Whether there is a difference or not is a trivial question; only the size of the difference matters, and no statistical software can tell you whether size matters, only hard thinking! Consequently, we need interpretations of the logged citation data and odds ratios and a discussion of whether their size matters (i.e., effect size), without relying on “statistical significance”.
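The sample-size point is easy to demonstrate. A sketch with a hypothetical two-proportion z-test: the identical one-percentage-point difference is "not significant" at n = 1,000 per group but overwhelmingly "significant" at n = 1,000,000, even though the effect size has not changed at all:

```python
import math

def two_proportion_p(p1, p2, n):
    """Two-sided p-value for a two-proportion z-test, n observations per group."""
    p_pool = (p1 + p2) / 2
    se = math.sqrt(2 * p_pool * (1 - p_pool) / n)
    z = abs(p1 - p2) / se
    # Two-sided p-value from the standard normal tail:
    return math.erfc(z / math.sqrt(2))

# The same trivial 1-point difference in, say, the share of cited papers:
print(two_proportion_p(0.50, 0.51, 1_000))      # ~0.65: "not significant"
print(two_proportion_p(0.50, 0.51, 1_000_000))  # vanishingly small: "significant"
```

The p-value moved from 0.65 to essentially zero solely because n grew; the substantive difference is the same trivial one point in both cases.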

Thanks for the feedback.

(1) EFFECT SIZE. If the referees call for an analysis of effect size, we will be happy to provide one.

(2) CAUSALITY: We agree that causality is difficult to demonstrate with correlational statistics. However, we note that the hypothesis that (2a) making articles open access causes them to be more citeable and the hypothesis that (2b) being more citeable causes articles to be made open access are both causal hypotheses.

Jesper Schneider’s comment is useful in bringing attention to the causality claims.

In economics we would say that a causal effect is identified only if the researcher is either directly manipulating the treatment through random assignment or otherwise exploiting a source of exogenous variation whereby the treatment can be treated as if randomly assigned.

Phil Davis and coauthors (BMJ 2008) were precisely able to manipulate open access through random assignment and found no causal effect of open access on citations. Furthermore, their confidence intervals were small enough to rule out a causal effect of more than 10%.

I believe that remains by far the best available evidence of the effect of open access on citations.

Patrick Gaule

On causality, please see reply to Schneider.

On Davis et al (2008) Please see “Davis et al’s 1-year Study of Self-Selection Bias: No Self-Archiving Control, No OA Effect, No Conclusion” http://openaccess.eprints.org/index.php?/archives/441-guid.html

To demonstrate that an observed effect E (higher citations) is in reality an artifact of factor X (self-selective open access) one first has to demonstrate effect E, and then show that it is eliminated by eliminating factor X. Davis et al. did not demonstrate E (higher citation with self-selective open access) in their small, one-year sample. Hence all they showed was an absence of effect E; not that eliminating factor X eliminates effect E.

Thank you for your comments.

The formula on page 6 should read:

OM/OS = (1/n) × Σ log(OM/OS), summed over the n journals

There was an inadvertent error in how we described (not how we computed) this formula in the text (and we are grateful for this open feedback which allowed us to detect and correct it!).

There is an advantage in favor of OM when the mean log ratio is greater than 0, and in favor of OS otherwise.

The log transformation was used to normalize the data and attenuate the effect of articles with relatively high citation counts, compared to the whole sample. For example, to compare mandated OA (OM) with self-selected OA (OS), we computed the log of the ratio OM/OS for each journal and then we computed the arithmetic mean of all the logs of those ratios for each journal.
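The computation described above can be sketched in a few lines of Python (the journal names and citation means are made up for illustration; pairs where the log is undefined are dropped, which is the data loss Phil asks about below):

```python
import math

# Hypothetical per-journal mean citation counts for mandated OA (OM)
# and self-selected OA (OS) articles:
journals = {
    "J1": (8.0, 5.0),
    "J2": (3.0, 4.0),
    "J3": (6.0, 0.0),  # OS = 0: log of the ratio undefined, pair is dropped
}

def mean_log_ratio(pairs):
    """Arithmetic mean of log(OM/OS) over journals where the ratio is defined."""
    logs = [math.log10(om / os) for om, os in pairs if om > 0 and os > 0]
    return sum(logs) / len(logs)

result = mean_log_ratio(journals.values())
print(result)  # > 0 indicates an advantage for OM, < 0 for OS
```

Note that J3 contributes nothing to the mean; with real data, every such dropped journal is a potential source of the systematic bias raised in the follow-up questions.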

Thank you Yassine.

This change in how you calculate your ratios makes a VERY BIG difference to your paper.

Still, any ratio that has zero in the denominator is necessarily tossed from your dataset.

How much of your data did you have to toss for these calculations? And would tossing these ratios lead to any systematic bias in your dataset?

Would adding +1 to every citation count before transformation change your results? (This is a common technique used when there are zeros in a dataset.)

Lastly, have you attempted to perform a regression analysis that treats your citation data as a count variable (e.g. 0, 1, 2, 3…) rather than as categorical data (e.g. 0, 1-5, 10-20, 20+)?
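For what it's worth, the +1 adjustment asked about above can be sketched like this (the counts are illustrative):

```python
import math

# Raw per-article citation counts, including a zero that a plain log drops:
counts = [0, 1, 2, 5, 30]

# The suggested adjustment: add 1 before the log transform, so that
# zero-citation articles stay in the dataset (log10(0 + 1) = 0).
log_plus_one = [math.log10(c + 1) for c in counts]
print(log_plus_one)  # every article retains a defined value

# Without the adjustment, zero-count articles must be tossed:
plain_log = [math.log10(c) for c in counts if c > 0]
print(len(counts) - len(plain_log))  # number of articles lost
```

Whether the +1 shift changes the substantive conclusions is exactly the empirical question posed here; the sketch only shows that it eliminates the data loss.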