Which journal performed better in 2015: eLife or PLOS Biology? Most scientists and publishers would answer this question by looking up their reported Journal Impact Factor (JIF) scores. Others staunchly disagree.
In a recent paper posted to bioRxiv, Vincent Larivière, Associate Professor of Information Science at the University of Montreal, along with a group of distinguished science editors and publishers, argues that we are putting too much emphasis on this single summary statistic. Instead, we should focus our attention on the distribution of article citations within a journal, reminding ourselves that citation distributions are skewed, that every journal contains a small percentage of highly performing articles, and that these outliers can distort the average performance of a journal.
Larivière concedes that while these characteristics of citation distributions “are well known to bibliometricians and journal editors, they are not widely appreciated in the research community.” His solution to widespread ignorance is a histogram.
Are scientists, librarians, and funders really ignorant about citation counts? Is there a widespread misconception that every paper published in Nature (JIF=38.138) received exactly 38.138 citations? Are the notions of uncited papers and highly cited papers completely foreign to everyone but bibliometricians and journal editors? If you accept this tenet, then perhaps we all need a little more education.
Larivière and others are proposing that each journal calculate and display a histogram of its citation distribution to accompany the JIF. Most of his paper is devoted to screenshots of how to download citation data from various sources and create histograms using Excel.
Methods aside (most stand-alone and web-based statistics programs do a far better job of creating histograms than Excel), Larivière argues that publishing the citation distribution “provides a healthy check on the misuse of JIFs by focusing attention on their spread and variation, rather than on single numbers that conceal these universal features and assume for themselves unwarranted precision and significance.”
Does a histogram really provide such a healthy check on JIF misuse?
Consider the following two histograms: the top (blue) reports the citation distribution for eLife, and the bottom (green) for PLOS Biology. Which one is better? Squint your eyes, adjust your glasses. They look pretty similar, right? Or do they?
Their distributions are similar but not identical. You’ll note that eLife published many more uncited and low-cited papers than PLOS Biology, but it also published many more highly cited papers. This is to be expected, as eLife published twice as many papers in 2013 and 2014.
I’ve included summary statistics for these distributions. The mean (average) citation score for eLife was 8.147, compared to 7.856 for PLOS Biology. If our primary indicator is average article performance, eLife scored higher. Now, let’s compare other summary statistics:
The median citation score for each journal was 6 citations/paper, with the identical Interquartile Range (3 to 10 citations). By these measures, the journals perform the same.
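To make the comparison concrete, here is a minimal sketch of how these three summary statistics can be computed with Python's standard library. The citation counts are hypothetical, invented only for illustration; they are not the eLife or PLOS Biology data. The two lists are built so that a single highly cited paper pulls the means apart while the median and Interquartile Range stay identical.

```python
import statistics

def summarize(citations):
    """Return the mean, median, and interquartile range (Q1, Q3) of citation counts."""
    # quantiles() with n=4 returns the three quartile cut points: Q1, median, Q3.
    q1, _, q3 = statistics.quantiles(citations, n=4, method="inclusive")
    return {
        "mean": statistics.mean(citations),
        "median": statistics.median(citations),
        "iqr": (q1, q3),
    }

# Hypothetical citation counts per paper (not real journal data).
# journal_a contains one outlier with 60 citations.
journal_a = [0, 1, 3, 3, 5, 6, 6, 8, 10, 10, 12, 60]
journal_b = [1, 2, 3, 3, 5, 6, 6, 8, 10, 10, 12, 20]

for name, data in (("A", journal_a), ("B", journal_b)):
    stats = summarize(data)
    print(f"Journal {name}: mean={stats['mean']:.2f}, "
          f"median={stats['median']}, IQR={stats['iqr']}")
```

In this toy example the means differ by about three citations per paper, yet the medians and quartiles match, which is the same pattern described above on a much smaller scale.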
If you do more elaborate statistical analysis and control for the date of publication of each paper (remember that we are measuring the performance of two full years of papers at a single point in time), the differences between these journals are insignificant. Statistically speaking, research and review papers published in these two journals performed about the same, which is the same information we got from their JIF with far less effort. Eyeball the other distributions that Larivière plots in his paper and you arrive at the same conclusion: Nature and Science have very different JIFs and citation distributions than PLOS ONE and Scientific Reports.
Descriptive statistics (mean, median, percentiles, Interquartile Ranges, among others) were invented for the very purpose of summarizing data and allowing for their comparison, which is exactly what the JIF was designed to do. While it is clear that Larivière and his coauthors are not fond of the JIF, their alternative, showing all the data, does not adequately address their litany of criticisms.
If their primary complaint is about skewed distributions, then advocate for reporting median citation scores (understanding that it will result in a lot of ties, especially at the low end). If their issue is about the range of citation scores, report the Interquartile Range (IQR). If their issue is about precision, then advocate for the reporting of the journal’s quartile or percentile. Readers who routinely use citation indexes will understand that these metrics are already available, although almost universally ignored.
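As a rough illustration of the last suggestion, the sketch below ranks a journal within a field by percentile and quartile. Everything here is an assumption made for the example: the field scores are invented, the function names are hypothetical, and labeling the top 25% as "Q1" is a convention chosen for this sketch rather than a description of how any particular citation index computes its rankings.

```python
def percentile_rank(score, field_scores):
    """Percentage of journals in the field whose score is at or below `score`."""
    at_or_below = sum(1 for s in field_scores if s <= score)
    return 100.0 * at_or_below / len(field_scores)

def quartile(score, field_scores):
    """Quartile label for a score, with Q1 as the top 25% (assumed convention)."""
    p = percentile_rank(score, field_scores)
    if p > 75:
        return "Q1"
    if p > 50:
        return "Q2"
    if p > 25:
        return "Q3"
    return "Q4"

# Hypothetical indicator scores for twelve journals in one field.
field = [0.8, 1.2, 2.5, 3.1, 4.0, 4.4, 5.6, 6.3, 7.0, 7.9, 8.1, 9.4]

for score in (8.1, 7.9, 1.2):
    print(f"score={score}: percentile={percentile_rank(score, field):.1f}, "
          f"quartile={quartile(score, field)}")
```

Two journals scoring 8.1 and 7.9 land in the same quartile here, so reporting the quartile drops the three-decimal precision without hiding which journals lead the field.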
This is not a problem about transparency; this is a problem about collectively agreeing upon an indicator that provides a fair comparison among groups of competing journals.
Simply adding a histogram isn’t going to reduce the misuse of the JIF. If anything, it will have us all squinting harder at our computer screens.