There are some things in this world that follow a normal (bell-shaped) distribution: the height of an adult moose, human blood pressure, and the number of Cheerios in a box of breakfast cereal. For these kinds of things, it makes perfect sense to talk about average height, average pressure, and average number of O’s per box.
However, there are many more things in this world that follow a highly skewed (or long-tail) distribution: family income, concert ticket sales, article citations. For these measurements, it makes a lot more sense to talk about medians — the midpoint in the distribution. If we treat these distributions like moose, we would end up with a very distorted view of the central tendency of the distribution. The Rolling Stones and Taylor Swift sell a lot of concert tickets, while most musicians barely eke out a living playing Tuesday evening open mic.
At a recent publishing meeting, two senior biomedical scientists both lashed out at the Thomson Reuters representative sitting in the audience about how the Impact Factor was calculated and why the company couldn’t simply report median article citations. The TR representative talked about how averages (taken to three decimals) prevent ties and how her company provides many other metrics beyond the Impact Factor. The audience just about exploded, as it often does, and turned the discussion session into a series of pronouncements, counterpoints, and opinions. I’ve attended this circus before: the strong man flexes his muscles, the human cannonball is shot above the crowd, the trapeze artist slips and the audience gasps. At conferences, the penultimate event before the ringmaster ends the show is the old professor standing up and lecturing the crowd about how, in his day, scientists used to read the paper!
The purpose of this post is not to further the antics and showmanship that accompany any discussion about metrics, but to describe why we are stuck with the Impact Factor — as it is currently calculated — for the foreseeable future.
In the following figure, I plot total citations for 228 eLife articles published in 2013. You’ll first note that the distribution of citations to eLife papers is highly skewed. While the mean (average) citation performance was 22.1, the median (the point at which half of eLife papers do better and half do worse) was just 15.5. The difference between mean and median is the result of some very highly cited eLife papers, one having received 321 citations to date.
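To see how a single blockbuster paper drags the mean away from the median, here is a minimal Python sketch. The ten citation counts are invented for illustration; they are not the eLife data plotted above.

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers (NOT the eLife data).
# Nine ordinary papers and one runaway hit.
citations = [2, 4, 5, 7, 9, 11, 14, 18, 25, 300]

print(mean(citations))    # 39.5  (pulled upward by the one hit)
print(median(citations))  # 10.0  (the midpoint of the ten papers)

# Remove the hit and the two measures land close together.
without_outlier = citations[:-1]
print(f"{mean(without_outlier):.1f}")  # 10.6
print(median(without_outlier))         # 9
```

One paper out of ten nearly quadruples the mean, while the median moves by a single citation.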
So, if I can calculate a median citation performance for eLife, why can’t Thomson Reuters? Here’s why:
Sometime in the middle-to-end of March, Thomson Reuters takes a snapshot of its Web of Science Core Collection. For each journal, they will count all of the references made in a single year to papers published in the two previous years. This number, reported annually in their Journal Citation Reports (JCR), is different from the number you will get by summing all article citations from the Web of Science. Why?
The reason for this disparity is that the JCR is trying to account for citation errors: instances of references that cite the old name of a journal or the wrong page number, misspell the author’s name, or omit an important detail like the year of publication. Authors, as any copy editor can attest, can be very bad citers. The result of citation errors is that references don’t always link up to their intended target paper, which is why the Web of Science often undercounts the true number of cited references.
As an illustration, last year, eLife received 2377 citations to 255 citable items, for an Impact Factor of 9.322. If you went into the Web of Science and counted all of the citations made in 2014 to eLife papers published in 2012 and 2013, you would get only 2213 citations, or 7% fewer citations than were reported in the JCR. Put another way, if you relied on reference-target matching from the Web of Science, you would miss 164 intentional citations to eLife papers. For many journals that I study, the differences between citations reported in the JCR and Web of Science are on the order of a few percent. For others, the difference can be huge. The Web of Science appears to do a much worse job on e-only journals that report just article numbers than on those that provide redundant information like volume, issue, and page number.
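The arithmetic behind those eLife figures can be checked in a few lines of Python. The three input numbers come straight from the paragraph above; the variable names are mine.

```python
# Figures quoted in the text for eLife.
jcr_citations = 2377   # citations counted by the JCR
citable_items = 255    # papers in the denominator
wos_citations = 2213   # citations found by matching in the Web of Science

# Impact Factor: total citations divided by total citable items.
impact_factor = jcr_citations / citable_items
print(f"{impact_factor:.3f}")  # 9.322

# Citations the Web of Science fails to link to a specific paper.
missed = jcr_citations - wos_citations
shortfall_pct = 100 * missed / jcr_citations
print(missed)                   # 164
print(f"{shortfall_pct:.1f}%")  # 6.9%
```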
The counting method used in the JCR is much less exacting than the one used in the Web of Science, and relies just on the name of the journal (and its variants) and the year of citation. The JCR doesn’t attempt to match a specific source document with a specific target document, as the Web of Science does. It just adds up all of the times a journal receives citations in a given year.
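As a rough illustration of the difference between the two counting approaches, consider the following Python sketch. Everything in it is hypothetical: the journal, its name variants, the record fields, and the two function names are made up for this example, and the real matching in either product is surely more involved.

```python
from collections import Counter

# Hypothetical journal and name variants (assumptions for illustration).
JOURNAL_VARIANTS = {"journal of moose studies", "j moose stud"}
YEARS = {2012, 2013}

# Hypothetical reference records. "target" is None when an error in the
# reference (wrong page, misspelled author) prevents linking to a paper.
references = [
    {"journal": "Journal of Moose Studies", "year": 2013, "target": "paper-A"},
    {"journal": "J Moose Stud", "year": 2013, "target": "paper-A"},
    {"journal": "Journal of Moose Studies", "year": 2012, "target": "paper-B"},
    {"journal": "J Moose Stud", "year": 2012, "target": None},
    {"journal": "Journal of Moose Studies", "year": 2013, "target": None},
    {"journal": "Another Journal", "year": 2013, "target": "paper-Z"},
]

def cites_journal(ref):
    """True if the reference names the journal (any variant) and year."""
    return ref["journal"].lower() in JOURNAL_VARIANTS and ref["year"] in YEARS

def journal_level_count(refs):
    """JCR-style: one tally per journal; no paper needs to be identified."""
    return sum(1 for r in refs if cites_journal(r))

def paper_level_counts(refs):
    """Web of Science-style: only references linked to a paper count."""
    return Counter(r["target"] for r in refs
                   if cites_journal(r) and r["target"] is not None)

journal_total = journal_level_count(references)
per_paper = paper_level_counts(references)
print(journal_total)            # 5
print(sum(per_paper.values()))  # 3
```

The journal-level tally catches the two sloppy references, but it is a single number. The paper-level tally keeps a count for each paper, but loses the references it could not link.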
So, what does this have to do with medians? In the process of counting the total number of citations to a journal, the JCR loses all of the information that would allow it to calculate a median. While you can calculate an average by just knowing two numbers (total citations on the top, total citable items on the bottom), calculating a median requires you to know the performance of each paper.
Over beer with Janne Seppänen, co-founder of the Peerage of Science and an avid hunter, I explained this in terms of moose. (Janne knows a lot about moose!) If Janne were interested in calculating the average weight of a herd of moose, all he would need to know was their combined weight and the number in the herd. In order to calculate the median weight of a herd of moose, Janne would need to know the weight of each moose.
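Janne’s problem can be sketched in Python. Both herds below are invented, with weights in kilograms chosen so that the combined weight and head count are identical; the function name is mine.

```python
from statistics import median

# Two hypothetical herds of five moose each (weights in kg, made up).
herd_a = [380, 400, 420, 440, 460]
herd_b = [300, 310, 320, 330, 840]  # four small moose and one giant

def mean_from_totals(combined_weight, head_count):
    """The average needs only two numbers, like the Impact Factor."""
    return combined_weight / head_count

# Same totals, so the averages are indistinguishable.
print(mean_from_totals(sum(herd_a), len(herd_a)))  # 420.0
print(mean_from_totals(sum(herd_b), len(herd_b)))  # 420.0

# The medians differ, and only the individual weights reveal that.
print(median(herd_a))  # 420
print(median(herd_b))  # 320
```

Given only the combined weight (2100 kg) and the head count (5), there is no way to tell which herd you are looking at, and so no way to recover the median.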
Thomson Reuters could, in theory, calculate a median citation metric, but it would need to abandon its two-product solution (Web of Science as a search and discovery tool; JCR as a reporting tool) for just a single product, something I explored in a 2014 post. With its Intellectual Property and Science Business up for sale, it is highly unlikely that Thomson Reuters will invest anything in these products, let alone consider combining them. In contrast, Scopus, a competing index and metrics tool owned by Elsevier, is based on a single database model.
Companies that are up for sale don’t invest in new products or attempt to reinvent themselves; they are singularly focused on cutting costs. It is just as unlikely that the new owners of the Web of Science and the JCR would invest anything (except rebranding), as their new parent company will ultimately look at how much profit it can squeeze out of its new purchase over the next five to ten years. For a company that truly wishes to invent a better product, it would be easier to develop one from scratch than to invest in a legacy system that dates back to the 1970s.
Last year, Thomson Reuters unveiled two new metrics in its 2014 JCR: A Normalized Eigenfactor score and a Journal Impact Factor Percentile score. There are now 13 metrics provided in their panel of journal scores, but don’t expect median citation scores to come any time soon. Thomson Reuters simply cannot calculate them using their current production model. Should a competitor be able to provide such a median performance metric, I fully expect those advocating for medians to readily adopt it, even if it means that their journal will appear far less competitive.