Earlier this year, I reported on a study by sociologist James Evans suggesting that online access to scientific journals is leading to more recent citations and a narrowing of the diversity of those articles which are cited.
This study was not taken at face value, and three information scientists (Vincent Larivière, Yves Gingras, and Éric Archambault), all at the University of Quebec in Montreal, have released a new analysis taking aim at the diversity claim.
Their manuscript, “The decline in the concentration of citations, 1900-2007,” deposited in the arXiv on September 30th, uses a simpler methodology. They report the percentage of papers that received at least one citation, the percentage of papers needed to account for 20%, 50%, and 80% of total citations, and the Herfindahl-Hirschman index, a measure used to estimate market concentration. The authors divide their articles into four broad categories (natural sciences and engineering, medical fields, social sciences, and the humanities) and report their data in 2-year, 5-year, and 10-year citation windows. The figures are pretty unambiguous: with the exception of the humanities, all measures of diversity appear to be increasing, not decreasing.
Although the distributions of citations received are still highly concentrated and a minority of papers still account for a majority of the citations, this level of concentration has been decreasing over time.
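For readers who want to see how simple these measures are, here is a minimal Python sketch of the three statistics described above. It is not the authors' code: the function names and the toy citation counts are my own illustration, and it assumes a plain list of per-paper citation counts for one field and one citation window, with at least one citation in total.

```python
from typing import Sequence


def share_cited(citations: Sequence[int]) -> float:
    """Fraction of papers that received at least one citation."""
    return sum(1 for c in citations if c > 0) / len(citations)


def share_of_papers_for(citations: Sequence[int], target: float) -> float:
    """Smallest fraction of papers, taking the most cited first, whose
    citations add up to at least `target` (e.g. 0.8) of all citations."""
    total = sum(citations)
    running = 0
    for rank, c in enumerate(sorted(citations, reverse=True), start=1):
        running += c
        if running >= target * total:
            return rank / len(citations)
    return 1.0


def herfindahl(citations: Sequence[int]) -> float:
    """Herfindahl-Hirschman index: the sum of squared citation shares.
    Here it is left on a 0-1 scale; values near 1 mean a few papers
    hold almost all citations."""
    total = sum(citations)
    return sum((c / total) ** 2 for c in citations)


# Hypothetical toy data: ten papers, six of them never cited.
toy = [10, 5, 3, 2, 0, 0, 0, 0, 0, 0]
print("cited at least once:", share_cited(toy))
print("papers needed for 80% of citations:", share_of_papers_for(toy, 0.8))
print("Herfindahl-Hirschman index:", herfindahl(toy))
```

A rising share of cited papers, a rising share of papers needed to reach 80% of citations, and a falling Herfindahl-Hirschman index would all point toward less concentration, which is the pattern the authors report.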
Unlike Evans’s article, this paper does not require knowledge of negative binomial regression, or any advanced statistics for that matter, and the simple, descriptive approach of the analysis makes it very convincing. Granted, Evans uses a different approach, looking at how the date a journal became available online affects citation behavior, and whether commercial or free access changes the outcome. For that reason, we should not discount the merit of Evans’s paper.
What makes this controversy interesting is that both studies make theoretical sense. A narrowing of science conforms to attention economics and preferential attachment (why the cited get more citations and the rest get ignored); a broadening of science conforms to information foraging theory, the principle of least effort, and the increasing ease of retrieving relevant articles. The results of both studies imply something different about the state of science, whether scientific information is being disseminated efficiently, and whether the literature is reflecting more diversity of opinion or more conformity. Larivière et al. write:
[O]ne can therefore argue that the scientific system is increasingly efficient at using published knowledge. Moreover, what our data shows is not a tendency towards an increasingly exclusive and elitist scientific system, but rather one that is increasingly democratic.
Like many scientific controversies, the argument over citation diversity will eventually move toward consensus and closure. For the time being, the debate remains open.
Postscript (January 6, 2009)
In the January 2nd issue of the journal Science, Larivière et al. published a letter entitled “Literature Citations in the Internet Era,” in which they detail their own findings. They write:
Although many factors affect citation practices, two things are clear: Researchers are increasingly relying on older science, and citations are increasingly dispersed across a larger proportion of papers and journals.
Postscript (April 3, 2009)
The article, “The decline in the concentration of citations, 1900-2007,” appears in the April issue of the Journal of the American Society for Information Science & Technology (2009, vol. 60, pp. 858-862).