Is the influence of a journal best measured by the number of citations it attracts or by the citations it attracts from other influential journals?
The purpose of this post is to describe, in plain English, two network-based citation metrics, Eigenfactor and SCImago Journal Rank (SJR), compare them, and evaluate what they add to our understanding of the scientific literature.
Both Eigenfactor and SJR are based on the number of citations a journal receives from other journals, weighted by the importance of the citing journal, such that citations from important journals like Nature are given more weight than citations from less important titles. Later in this post, I’ll describe exactly how a journal derives its importance from the network.
In contrast, metrics like the Impact Factor do not weight citations: one citation is worth one citation, whatever the source. In this sense, the Eigenfactor and SJR get closer to measuring importance as a social phenomenon, where influential people hold more sway over the course of business, politics, entertainment and the arts. For the Impact Factor, importance is equated with popularity.
Eigenfactor and SJR are both based on calculating something called eigenvector centrality, a mathematical concept that was developed to understand social networks and first applied to measuring journal influence in the mid-seventies. Google’s PageRank is based on the same concept.
Eigenvector centrality is calculated recursively, such that values are transferred from one journal to another in the network until a steady-state solution (also known as an equilibrium) is reached. Often 100 or so iterations are used before values become stable. Like a hermetically sealed ecosystem, value is neither created nor destroyed, just moved around.
There are two metaphors used to describe this process: The first conceives of the system as a fluid network, where water drains from one pond (journal) to the next along citation tributaries. Over time, water starts accumulating in journals of high influence while others begin to drain. The other metaphor conceives of a researcher taking a random walk from one journal to the next by way of citations. Journals visited more frequently by the wandering researcher are considered more influential.
However, both of these models break down (mathematically and figuratively) in real life. Using the fluid analogy, some ponds may be disconnected from most of the network of ponds; if there is just one stream feeding this largely-disconnected network, water will flow in, but will not drain out. After time, these ponds may swell to immense lakes, whose size is staggeringly disproportionate to their starting values. Using the random walk analogy, a researcher may be trapped wandering among a highly specialized collection of journals that frequently cite each other but rarely cite journals outside of their clique.
The eigenvector centrality algorithm can adjust for this problem by “evaporating” some of the water in each iteration and redistributing these values back to the network as rain. Similarly, the random walk analogy uses a “teleport” concept, where the researcher may be transported randomly to another journal in the system (think of Scotty transporting Captain Kirk back to the Enterprise before immediately beaming him down to another planet).
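To make the pond and teleport metaphors concrete, here is a minimal sketch in Python of this kind of iterative calculation on a toy network. It is not the Eigenfactor or SJR algorithm: the four journals, their citation counts, the function name `influence_scores`, and the value 0.85 for the share of influence that follows citations (`alpha`) are all assumptions made for illustration.

```python
def influence_scores(citations, alpha=0.85, iterations=100):
    """Iteratively pass influence along citations, with teleportation.

    citations[i][j] is the number of citations from journal i to journal j.
    alpha is the share of influence that follows citations in each round;
    the remaining 1 - alpha "evaporates" and falls evenly as rain.
    """
    n = len(citations)
    scores = [1.0 / n] * n  # start with influence spread evenly
    for _ in range(iterations):
        new_scores = [(1.0 - alpha) / n] * n  # rain received by every journal
        for i, row in enumerate(citations):
            outgoing = sum(row)
            for j in range(n):
                if outgoing == 0:
                    # A journal that cites nothing spreads its score evenly.
                    share = 1.0 / n
                else:
                    share = row[j] / outgoing
                new_scores[j] += alpha * scores[i] * share
        scores = new_scores
    return scores


# Hypothetical network: journals 0 and 1 cite each other, journal 1 cites
# journal 2 once, and journals 2 and 3 only ever cite each other (a clique).
citations = [
    [0, 5, 0, 0],
    [5, 0, 1, 0],
    [0, 0, 0, 4],
    [0, 0, 4, 0],
]

trapped = influence_scores(citations, alpha=1.0)   # no evaporation
damped = influence_scores(citations, alpha=0.85)   # with evaporation

print("no teleport:  ", [round(s, 3) for s in trapped])
print("with teleport:", [round(s, 3) for s in damped])
```

With `alpha=1.0`, nearly all of the influence ends up pooled in the two clique journals, which is the swelling-lake problem described above. With some evaporation, every journal keeps at least a small share, and the total still adds up to one because value is only moved around.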
Before I continue into details and differences, let me summarize thus far: Eigenfactor and SJR are both metrics that rely on computing, through iterative weighting, the influence of a journal based on the entire citation network. They differ from traditional metrics, like the Impact Factor, that simply compute a raw citation score.
In practice, eigenvector centrality is calculated upon an adjacency matrix listing all of the journals in the network and the number of citations that took place between them. Most of the values in this very large table are zero, but some will contain very high values, representing large flows of citations between some journals, for instance, between the NEJM, JAMA, The Lancet, and BMJ.
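As a rough illustration of how such a table might be assembled, the Python sketch below tallies a handful of citation records into a journal-by-journal matrix. The records are invented for the example and the variable names are my own; a real index would hold tens of millions of records and would store only the nonzero cells.

```python
from collections import Counter

# Hypothetical citation records: (citing journal, cited journal).
records = [
    ("NEJM", "JAMA"), ("NEJM", "The Lancet"), ("JAMA", "NEJM"),
    ("JAMA", "NEJM"), ("The Lancet", "BMJ"), ("BMJ", "NEJM"),
]

# Sparse form: only journal pairs that actually occur are stored.
counts = Counter(records)

# Dense form: one row per citing journal, one column per cited journal.
journals = sorted({name for pair in records for name in pair})
matrix = [[counts[(src, dst)] for dst in journals] for src in journals]

for name, row in zip(journals, matrix):
    print(f"{name:12s}", row)
```

Even in this tiny example most cells are zero, which is why the sparse `Counter` form is the more practical starting point at scale.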
The result of the computation, a transfer of weighted values from one journal to the next over one hundred or so iterations, represents the influence of a journal, which is often expressed as a percentage of the total influence in the network. For example, Nature’s 2014 Eigenfactor was 1.50, meaning that this one journal represented 1.5% of the total influence of the entire citation network. In comparison, a smaller, specialized journal, AJP-Renal Physiology, received an Eigenfactor of 0.028. PLOS ONE’s Eigenfactor (1.53), by contrast, was larger than Nature’s as a result of its immense size. Remember that Eigenfactor measures total influence in the citation network, so big often translates to big influence.
When the Eigenfactor is adjusted for the number of papers published in each journal, it is called the Article Influence Score. This is similar to SCImago’s SJR. So, while PLOS ONE had an immense Eigenfactor, its Article Influence Score was just 1.2 (close to average performance), compared to 21.9 for Nature and 1.1 for AJP-Renal Physiology.
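One way to picture a per-article adjustment is to divide a journal's share of total influence by its share of all articles, so that a journal performing exactly at the average scores one. The Python sketch below uses that assumption; the function name and the article counts are hypothetical, and the published formula may differ in its details.

```python
def per_article_influence(influence_percent, articles, total_articles):
    """Share of network influence divided by share of published articles."""
    influence_share = influence_percent / 100.0
    article_share = articles / total_articles
    return influence_share / article_share


TOTAL_ARTICLES = 1_000_000  # hypothetical size of the whole literature

# Two hypothetical journals holding the same 1.5% of total influence.
large_journal = per_article_influence(1.5, 12_500, TOTAL_ARTICLES)
selective_journal = per_article_influence(1.5, 750, TOTAL_ARTICLES)

print(round(large_journal, 1), round(selective_journal, 1))
```

The two journals hold the same total influence, but the one that spreads it over far more papers ends up close to average on a per-article basis.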
In 2015, Thomson Reuters began publishing a Normalized Eigenfactor, which expresses the Eigenfactor as a multiplicative value rather than a percent. A journal with a value of 2 has twice as much influence as the average journal in the network, whose value would be one. Nature’s Normalized Eigenfactor was 167, PLOS ONE’s was 171, and AJP-Renal Physiology’s was 3.
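If the average journal is defined as one, the conversion amounts to dividing a journal's percentage by the average journal's percentage, which is 100 divided by the number of journals. The Python sketch below works through that arithmetic. The journal count of 11,133 is an assumption chosen to be consistent with the figures quoted here, not a published number.

```python
def normalized_eigenfactor(eigenfactor_percent, journal_count):
    """Rescale a percentage so that the average journal scores 1."""
    average_percent = 100.0 / journal_count
    return eigenfactor_percent / average_percent


JOURNAL_COUNT = 11_133  # assumption: "just over 11,000" journals

for name, eigenfactor in [
    ("Nature", 1.50),
    ("PLOS ONE", 1.53),
    ("AJP-Renal Physiology", 0.028),
]:
    print(name, round(normalized_eigenfactor(eigenfactor, JOURNAL_COUNT), 1))
```

The results land close to the published values; small gaps are to be expected because the Eigenfactors quoted in this post are themselves rounded.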
There are several differences in how the Eigenfactor and SJR are calculated, which means the two cannot be used interchangeably:
- Size of the network. Eigenfactor is based on the citation network of just over 11,000 journals indexed by Thomson Reuters, whereas the SJR is based on over 21,000 journals indexed in the Scopus database. Different citation networks will result in different centrality scores.
- Citation window. Eigenfactor is based on citations made in a given year to papers published in the prior five years, while the SJR uses a three-year window. The creators of Eigenfactor argue that five years of data reduces the volatility of their metric from year to year, while the creators of the SJR argue that a three-year window captures peak citation for most fields and is more sensitive to the changing nature of the literature.
- Self-citation. Eigenfactor eliminates self-citation, while SJR allows self-citation but limits it to no more than one-third of all incoming citations. The creators of Eigenfactor argue that eliminating self-citation disincentivizes bad referencing behavior, while the creators of the SJR argue that self-citation is part of normal citation behavior and wish to capture it.
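The two self-citation policies can be sketched as operations on the diagonal of the citation matrix, where each journal's citations to itself are stored. The Python below is an illustration under stated assumptions, not either group's implementation: the function names are hypothetical, and the cap is interpreted as "self-citations may make up at most one-third of a journal's incoming citations after capping."

```python
def drop_self_citations(matrix):
    """Eigenfactor-style: ignore every journal's citations to itself."""
    return [
        [0 if i == j else count for j, count in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


def cap_self_citations(matrix, max_share=1 / 3):
    """SJR-style (as interpreted here): limit the self-citation share."""
    n = len(matrix)
    capped = [row[:] for row in matrix]
    for j in range(n):
        external = sum(matrix[i][j] for i in range(n) if i != j)
        # self <= max_share * (self + external), solved for self
        limit = external * max_share / (1 - max_share)
        capped[j][j] = min(matrix[j][j], limit)
    return capped


# Hypothetical two-journal network; matrix[i][j] = citations from i to j.
matrix = [
    [30, 10],  # journal 0 cites itself 30 times and journal 1 ten times
    [10, 2],   # journal 1 cites journal 0 ten times and itself twice
]

print(drop_self_citations(matrix))
print(cap_self_citations(matrix))
```

In this toy case the heavy self-citer has its 30 self-citations cut down to about 5, while the journal with only two self-citations is left untouched under the cap and loses both under the stricter policy.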
There are other small differences, such as the scaling factor (a constant that defines how much “evaporation” or “teleporting” takes place in each iteration). While both groups provide a full description of their algorithm (Eigenfactor here; SJR here), it is pretty clear that few of us (publishers, editors, authors) are going to replicate their work. Indeed, these protocols assume that you’ve already indexed tens of thousands of journals publishing several million papers listing tens of millions of citations before you even begin to assemble your adjacency matrix. And no, Excel doesn’t have a simple macro for calculating eigenvalues. So while each group is fully transparent in its methods, the sheer enormity and complexity of the task prevents all but the two largest indexers from replicating their results. A journal editor really has no recourse but to accept the numbers provided.
If you scan the performance of journals, you’ll notice that journals with the highest Impact Factor also have the highest Article Influence and SJR scores, leaving one to question whether popularity in science really measures the same underlying construct as influence. Writing in the Journal of Informetrics, Massimo Franceschet reports that for the biomedical sciences, social sciences, and geosciences, 5-year Impact Factors correlate strongly with Article Influence Scores, but diverge more for physics, material sciences, computer sciences, and engineering. For these fields, journals may perform well on one metric but poorly on the other. In another paper focusing on the SJR, the authors noted some major changes in the ranking of journals, and reported that influence scores tended to concentrate in fewer (prestigious) journals. Considering how the metric is calculated, this should not be surprising.
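For readers who want to check how closely two metrics agree for their own set of journals, one common tool is a rank correlation. The Python sketch below computes Spearman's rank correlation for five hypothetical journals. The scores are invented, and this is not necessarily the statistic used in the papers mentioned above.

```python
def ranks(values):
    """Rank values from highest (1) to lowest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(xs)
    squared_gaps = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * squared_gaps / (n * (n * n - 1))


# Hypothetical scores for the same five journals on two metrics.
impact_factor = [40.1, 12.3, 5.6, 3.2, 1.1]
article_influence = [21.0, 4.0, 6.5, 1.0, 0.9]

rho = spearman(impact_factor, article_influence)
print(round(rho, 2))
```

A value near 1 means the two metrics rank the journals almost identically; lower values point to the kind of divergence reported for some fields.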
In conclusion, network-based citation analysis can help us more closely measure scientific influence. However, the process is complex, not easily replicable, harder to describe and, for most journals, gives us the same result as much simpler methods. Even if not widely adopted for reporting purposes, the Eigenfactor and SJR may be used for detection purposes, such as identifying citation cartels and other forms of citation collusion that are very difficult to detect using traditional counting methods, but may become highly visible using network-based analysis.
1. Eigenfactor and Article Influence are terms trademarked by the University of Washington. Eigenfactors and Article Influence scores are published in the Journal Citation Reports (Thomson Reuters) each June and are posted freely on the Eigenfactor.org website after a six-month embargo. To date, the University of Washington has not received any licensing revenue from Eigenfactor metrics.
2. The SCImago Journal & Country Rank is based on Scopus data (Elsevier) and made freely available from: http://www.scimagojr.com/