Call in the Citation Police! Image via Daniel Schwen.

Each June, editors, publishers, and authors anxiously await the release of the Journal Citation Reports (JCR)–a dataset that reports, among other things, Journal Impact Factors for approximately 12,000 scholarly publications.

While new titles are added to the JCR each year, several dozen will be suppressed for “anomalous citation patterns.” Stated in more direct language, the JCR will kick out journals that attempt to game the system. Thomson Reuters, the publisher of the JCR, prefers to use language that does not imply intent; however, the result is the same. Last year, 38 titles were delisted from the JCR: 23 for extremely high rates of self-citation and 15 for “citation stacking,” a term that may also be construed as a citation cartel.

As a tool, title suppression has a strong moderating effect on citation behavior. Journals reinstated in the JCR after suppression have self-citation rates that are in line with other journals in their field, and citation cartels (you cite my journal and I’ll cite yours) are stopped abruptly. The damage to the reputation of a journal is apparent for journals that slink back into the report. Even the threat of being suppressed is powerful enough to prevent editors from tempting fate. My clients tell me this all the time. The potential damage to their title (and their own reputations) is just too great to justify the risk.

This year, Scopus, a large, cross-disciplinary literature index product owned by Elsevier, announced that it would begin a similar annual evaluation process, directed to “maintain content quality” in its dataset. Scopus data is used to generate annual citation metrics in its SCImago Journal & Country Rank (SJR) product–a competitor to the JCR.

Like the JCR, Scopus will consider delisting a title that engages in high levels of self-citation, and it provides a benchmark figure: 200% beyond self-citation rates of peer journals. In this way, Scopus provides some transparency in how it will make decisions but avoids establishing arbitrary across-the-board levels. Scopus will let your discipline decide what is appropriate citation behavior. In contrast, the JCR is focused on how self-citation changes a journal’s ranking within a field. When a journal’s rank is radically shifted by self-citation, the JCR may use this as grounds for suppression.
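To make the Scopus benchmark concrete, here is a minimal sketch of a peer-relative self-citation check. It rests on an assumption: “200% beyond” is read here as a rate more than three times the peer average (the peer mean plus 200% of it). Scopus may define the benchmark differently, and the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def self_citation_rate(self_cites: int, total_cites: int) -> float:
    """Share of a journal's received citations that come from the journal itself."""
    if total_cites == 0:
        return 0.0
    return self_cites / total_cites


def exceeds_peer_benchmark(journal_rate: float,
                           peer_rates: Sequence[float],
                           excess: float = 2.0) -> bool:
    """True when the journal's rate is more than `excess` (2.0 = 200%) above the peer mean.

    Assumption: "200% beyond" means journal_rate > peer_mean * (1 + 2.0).
    """
    peer_mean = mean(peer_rates)
    return journal_rate > peer_mean * (1 + excess)


# Hypothetical field in which peer journals self-cite 5% to 15% of the time.
peers = [0.05, 0.10, 0.15]
print(exceeds_peer_benchmark(self_citation_rate(45, 100), peers))  # well above the benchmark
print(exceeds_peer_benchmark(self_citation_rate(25, 100), peers))  # high, but below it
```

Because the threshold is relative to the peer group, the same 25% rate could be flagged in a field where peers self-cite far less, which is the sense in which the discipline decides.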

Editors often ask me how much self-citation is too much, and I respond (honestly) that I don’t know. “Can you give me a ballpark?” (no). “Do you think this will be detected?” (I can’t say). “What are self-citation rates in comparable journals?” (I can provide you with that!). While these answers are unsatisfactory to many editors, there is power in ambiguity. No editor wants to be the one responsible for having his journal suppressed.

When Scopus was first released in 2004, its main feature was scope, hence the name. At that time, the market was saturated with discipline-based indexes. The dominant interdisciplinary index at that time, the Science Citation Index (now part of the Web of Science), was more concerned with quality and selectivity than with comprehensiveness. If you wanted to be part of the Web of Science, you should expect a long and rigorous evaluation process. Not so with Scopus.

In the following years, the Web of Science responded by growing the size of its dataset and therefore the number of journals reported in its annual JCR metrics report. Last year, the JCR included 11,022 unique journal titles, about 50% more than in 2004. Amid this expansion, it did not relax its editorial standards for inclusion. Curating the Web of Science–and the journals included in its JCR–is as important today as it was 20 years ago.

By announcing its intention to police the content indexed in Scopus, and the data it generates for its own annual journal metrics report (SJR), Elsevier has signaled a similar intention toward content curation. At a time when more comprehensive indexes are provided to researchers for free (e.g. Google Scholar), comprehensiveness just doesn’t cut it, especially if you want to derive valid, authoritative metrics from your data.

Google Scholar may counter that it is possible, through their algorithms, to detect “anomalous citation patterns,” but it takes humans to investigate, make decisions, take responsibility, and remain ultimately accountable for their decisions. Computers cannot do that: you need people, people with expertise, and people whose expertise is respected. Not surprisingly, this costs a lot of money, which is why Google Scholar takes an algorithmic approach to data curation.

Personally, I find Elsevier’s SJR a far superior product to Thomson Reuters’ JCR; however, none of my clients have ever asked me to use SJR data. The SJR reports many of the same metrics as the JCR. For example, Cites per Doc (2yr.) measures the average citation performance of papers over a two-year window (example record here), which is exactly the same as the Journal Impact Factor. The SJR also reports self-citation rates among other metrics of interest to editors and publishers. Did I mention that the SJR is free?

What accounts for the continued dominance of the JCR in the citation metrics market, especially when faced with excellent free alternatives? In addition to data quality, historical persistence may play a role.

Established indicators (and the companies that produce them) persist, in part, because they provide longitudinal context. Many economists and financial professionals loathe the Dow Jones Industrial Average, a simple leading indicator first calculated by hand in 1896, and yet there is huge value in continuing with the same indicator because it provides an historical perspective on how the economy is performing over time. Likewise, journal editors and publishers want to know how their journal is performing compared to last year and whether it is trending upward or downward. While other citation indicators may prove to be more valid and reliable, the Journal Impact Factor (and its name) provides that anchoring.

It appears that Elsevier understands this issue and is also in the process of indexing and adding pre-1996 material into Scopus. With older citation data, Scopus will also be able to calculate author h-indexes and other indicators that require having an author’s entire publication history.

In sum, the announcement that Scopus has established quality indicators suggests that they are now willing to invest in the most costly resource in building authority–people. In a world of abundant and cheap data, there is a real and growing demand for authority.

Correction: The original version of this post incorrectly stated that Impact Factor was a US trademarked term held by Thomson Reuters. 

Phil Davis


Phil Davis is a publishing consultant specializing in the statistical analysis of citation, readership, publication and survey data. He has a Ph.D. in science communication from Cornell University (2010), extensive experience as a science librarian (1995-2006) and was trained as a life scientist. https://phil-davis.org/



15 Thoughts on "Data Curation–The New Killer App"

Phil, do you have any sense as to how the two impact metrics compare? For example, do they count citable objects the same way? The difference in scope might also make a difference.

I like the SJR Index. There are more journals (and citations) to be counted including more small/niche journals that don’t garner enough citations to get into WOS. On that note, while I totally understand why Scopus is getting more stringent on which journals get to join the club, I do hope that it is more open than WOS. There are some good journals that serve important roles in small communities that get no respect from WOS but are included in Scopus.

Interestingly, niche journals are likely to have high self-citation rates, because they alone publish most of the articles on their niche topic. Nor are there other journals on that topic to compare their self-citation rate with.

Hi Phil

Another cogent post, thanks for putting it together. A footnote update, though: ‘Scopus’ was so-named after the chiffchaff (or ‘Phylloscopus collybita’), which the project development team spotted whilst on a retreat. Your explanation actually makes more sense, but sometimes true is stranger….

Thank you Phil for bringing this issue forward in the Scholarly Kitchen. We, at Thomson Reuters, have long believed that both selectivity and, more importantly, consistency of policy lead to consistency in metrics. Because of our commitment to consistency, the Web of Science is simply more reliable for producing meaningful baselines, trend data, and normalized metrics.

Why is this important?

If you want to look at trends over time, you need to know that the content behind this year’s data points was built with the same methodology as content from 10 years ago and that data is not randomly being added or deleted. This concept applies to both journal level metrics such as the Impact Factor and the article level metrics found in InCites. If you want to compare across disciplines, you need to know that the same selection and indexing policies were applied to engineering and medicine. Thus, data built from Compendex indexing policies and Medline indexing policies (both of which allow for partial coverage of a journal and have different editorial and capture policies) can’t truly be normalized. In other words, it’s not about the number of papers or citations, it’s about the ability to create meaningful trends and comparisons. Partial coverage of journals makes it impossible to create a true expected citation rate per journal.

Quantitative and qualitative de-selection of journals is increasingly important with the proliferation of the open access business model in publishing. It is increasingly necessary to study the publishing quality of journals as they move to an open access model, as their profit is no longer dictated by the circulation of the finished product but by the number of papers that are submitted and selected. An increase in output from a journal or a decrease in citations are not in and of themselves issues, but is the journal maintaining the same number of quality articles?

While metrics such as citations per paper and the percent of papers cited may be symptoms of journal quality, it is reading the content and studying the journal’s overall value to the subject area that must come before a decision to de-select can be made. Scopus’s recent announcement that it will start de-selecting is testament that our selective approach is the best one. However, their criteria are very internally based, not transparent to the outside, and are not likely to identify predatory behavior. Thomson Reuters’ approach to selection and de-selection is time-tested, consistent, and trusted throughout the library, academic, publishing, and funding communities. We rely on human assessment of big data computer analysis.

The value of having our own internal quality control team, our team of expert editors, is what has made achieving an Impact Factor a real accomplishment. Our process requires an investment of time and money. It is more than just a strategic initiative to us. It is a core belief and competency that follows policies and practices honed over decades. Web of Science is not just a multi-disciplinary database, it is a true citation index, and that is why it is the most subscribed to and used citation tool in the world.

Question related to the correction about “impact factor” (IF) not being a trademark of Thomson Reuters (TR). How generic do you consider the expressions IF and “journal impact factor” (JIF) to be? Can we call Scimago’s metric a “Scopus-based JIF”? Or should we reserve IF for Thomson Reuters’ product? How about giving preference to “journal impact metric” or simply “journal metrics” as more independent terms? When declarations are issued against over reliance on IF, is it against quantification in general or a single metric specifically? Thanks for your thoughts.

“Thomson Reuters, the publishers of the JCR, prefers to use language that does not infer intent.” I think you mean “imply”?

D’oh! As the editor of the site I’ll take the blame for this one. Fixed, thanks.

Thank you Phil for this interesting post and thanks to those who have commented so far.

Content and data curation were built into Scopus from the start. Selectivity for quality is essential and established across many high-value indexes, but a key goal of Scopus coverage was meeting the needs of the diverse and global scholarly community. To Angela’s point above, we began with a focus on the important contribution of regional, local and niche journals that serve smaller, often under-represented communities. All journals, general or specialized, are equally integrated in the Scopus database and fully benefit from the quality standards, functionality and tools available via Scopus.com.

To ensure that we identify the journals most relevant to the community, we involve the community directly through our Content Selection & Advisory Board (CSAB) [http://www.elsevier.com/solutions/scopus/content/scopus-content-selection-and-advisory-board]. This independent group of experts was installed shortly after the start of Scopus and comprises subject-specific experts from around the world, many of whom have direct experience as journal editors. The board members are not employees of Elsevier and have agreed to share their expertise with us to ensure that researchers and scholars are not just observers of our process; they are also stakeholders and contributors.

The CSAB plays an important role in the development and refinement of a transparent and robust title evaluation process, as well as maintaining clear criteria for content selection. Journal suggestions are submitted through an online form [http://suggestor.step.scopus.com/suggestTitle/step1.cfm], and the reviewers of the CSAB assess the titles using the same online system, allowing for a continuous and transparent review process. No journal is accepted based on publisher or other coverage, but is reviewed for its contribution to the literature using a combination of quantitative metrics, and the crucial expertise of the researchers to maintain a process that considers the widest possible array of information. To date, the CSAB accepts 46% of the journals suggested, building a collection that is relevant, needed, and respected. Complete information on the Scopus review process and selection criteria is available on our info site [http://www.elsevier.com/solutions/scopus/content/content-policy-and-selection].

It would be naïve to think that the performance of journals will stay the same over time, so we consider curation of the data already covered in the database to be no less important than the selection of content. Many journals benefit from the increased visibility and accessibility of their content through the global reach and use of Scopus, and we have observed this through an increase in output, citation and usage. However, there are also journals that alter their editorial policies and publishing standards and underperform over time.

Publication malpractice is a clear reason for discontinuing the journal in Scopus, but this should be surfaced based on objective measures that serve as common warning signs. We realize that discontinuing coverage of a journal has a huge impact on the research community reading and writing for that journal. Therefore as described in this Scopus blog post [http://blog.scopus.com/posts/titles-indexed-in-scopus-check-before-you-publish], we encourage authors who are uncertain to verify Scopus coverage of a title through the publicly-available Scopus title list [http://www.elsevier.com/__data/assets/excel_doc/0015/91122/title_list.xlsx] and, at need, by contacting Scopus Helpdesk directly.

We are committed to detecting poorly performing journals so that their relevance in the Scopus corpus can be re-evaluated. Our re-evaluation metrics and benchmarks were a response to the need for a formal and transparent process that will flag such poorly performing journals. Once again, we relied on our CSAB to develop and validate the process. Knowing the stake that researchers have in our coverage of a title, we want to explain and expand our decision-making and minimize surprises. We have posted the results of our work with CSAB in a blog post entitled “Scopus launches annual journal re-evaluation process to maintain content quality” [http://blog.scopus.com/posts/scopus-launches-annual-journal-re-evaluation-process-to-maintain-content-quality] and we will update the blog as we make progress and/or as questions come in from the community. The metrics and benchmarks indicated in the re-evaluation process are only used to flag particular journals that are underperforming. Journals that are flagged in two consecutive measurements will be forwarded to the CSAB for assessment using the same selection criteria used for new coverage. The decision to discontinue a journal will rest with the CSAB.
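The “flagged in two consecutive measurements” rule can be sketched in a few lines. This is an illustration only: it assumes one measurement per calendar year, and the function name and data layout are hypothetical rather than anything Scopus has published.

```python
from typing import Dict


def needs_board_review(flags_by_year: Dict[int, bool]) -> bool:
    """True if the journal was flagged in two consecutive yearly measurements.

    Assumption: one measurement per calendar year; a missing year breaks the streak.
    """
    years = sorted(flags_by_year)
    for earlier, later in zip(years, years[1:]):
        consecutive = later - earlier == 1
        if consecutive and flags_by_year[earlier] and flags_by_year[later]:
            return True
    return False


# Flagged, then clean, then flagged again: no referral under this reading.
print(needs_board_review({2013: True, 2014: False, 2015: True}))
# Flagged two years running: forwarded for assessment.
print(needs_board_review({2014: True, 2015: True}))
```

Under this reading the metrics only trigger a referral; the decision itself still rests with the reviewers.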

Scopus is the chosen source of bibliometric data for many rankings and national research assessment exercises such as Times Higher Education [http://blog.scopus.com/posts/times-higher-education-choose-scopus-data-for-its-world-university-ranking], QS, Research Excellence Framework (REF, UK) [http://www.elsevier.com/__data/assets/pdf_file/0003/81633/embracing_the_ref_2014_Online.pdf], and Excellence in Research for Australia (ERA). To Lisa’s point above, these partnerships were a result, not a cause, of our content strategy and policy. They acknowledge our continued commitment to data quality and curation.

Providing the right sets of diverse metrics and being able to provide longitudinal context is important indeed. Scopus currently includes full citation linking back to 1996, but source abstracts and bibliometric data go back as far as 1823 for some publisher archives. In addition, last year we announced the cited references expansion program which will add citation information and enable linking back to 1970 [http://blog.scopus.com/posts/scopus-continues-to-add-pre-1996-citations]. This project is about 25% complete as of this date, with a projection of 80% completion by end of year 2015, and final completion by 2016. This will widen the citations available for established researchers’ h-indexes, rich and broad citation networks on foundational works, and long-term trending of the full selection of citation metrics in Scopus and in Elsevier’s SciVal.

The positive response to SCImago Journal Rank (SJR) is great to see. SJR and SNIP were developed in partnership with bibliometrics researchers. They represent important and different aspects of journal performance and consider more nuanced features of real-world scholarly citation exchanges that go beyond counts. We do count citations as well, if that’s the metric you need, but we prefer a broader definition of performance than can be achieved by one metric in isolation. This “basket of metrics” approach should include more than citations, and more than downloads, and more than Tweets.

All journals included in Scopus receive journal metric values and the extensive, global coverage of Scopus allows many journals that do not have an impact factor to view and understand their role in the citation network. All Scopus-based journal metric values are available for free through http://www.journalmetrics.com. If researchers must consider citation metrics, they should be able to see them.

Finally, thank you Fiona for bringing up the naming of Scopus. Phil’s explanation of the wide scope of the database is terrific, but a fortunate accident. Scopus was named after the ChiffChaff (Phylloscopus collybita), a small bird with excellent navigation [http://en.wikipedia.org/wiki/Common_chiffchaff].

Thanks Wim, as a birder I love the name origin story. Warblers are cool. Perhaps you can answer the question that I asked Phil above. That is, do you have any sense as to how the two impact metrics compare, or differ? For example, do they count citable objects the same way? How does Scopus identify and count them?

Dear David,

I am sorry that I did not address your question in my first comment. My response was already on the long side and I had to keep it quite general. However, I am glad you asked and I will go into a bit more detail on the journal metrics here.

At Scopus we currently have three journal metrics: SCImago Journal Rank (SJR), Impact per Publication (IPP) and Source Normalized Impact per Paper (SNIP). The algorithms of these metrics are different but they all use the number of papers and the number of citations to these papers as input. Besides the algorithm itself, there are two main differences from the approach of the IF: first, we use the document types Article, Review and Conference Paper both as the source of citations and as citable items (article type consistency), and second, we use a citation window of three years instead of the two years used in the IF. I will come back to the issue of article type consistency later. We find that the three-year citation window is the fairest compromise for catching the citation peak in all different subject fields represented in Scopus.

IPP is the simplest journal metric: it uses a citation window of three years and counts the number of citations from all Articles, Reviews and Conference Papers to the same document types published in a particular journal, divided by the number of Articles, Reviews and Conference Papers published in that journal. IPP is most analogous to the IF in depending on unweighted citation counting, but differs because it is a mathematical average rather than a ratio. SNIP is the field-normalized version of the IPP, in which the IPP is divided by the citation potential of that journal in its field. The citation potential and subject field are determined by the citation behavior of the journal itself. SJR is a prestige metric based on the idea that not all citations are created equal, meaning that the weight of the citation received by the journal is dependent on where the citation is coming from. In effect, it says that a citation from an influential journal confers more influence on the cited journal. The result is an iterative, convergent calculation that is a distant relative of Google Page Rank. More details about these metrics and how these were developed can be found on http://www.journalmetrics.com. Each has its advantages, of course, and that is why we use all of them and more to talk about the role a journal plays in the literature (the “basket of metrics” I mentioned in my previous comment).
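For readers who prefer to see the arithmetic, here is a minimal sketch of IPP and SNIP as described in this comment. The function names and the data layout are assumptions made for illustration. The citation potential is taken as a given number because its derivation is not covered here, and the published methodology has details this sketch leaves out.

```python
from typing import Iterable, Tuple

# Document types counted both as citation sources and as citable items.
CITABLE_TYPES = {"Article", "Review", "Conference Paper"}


def impact_per_publication(papers: Iterable[str],
                           citations: Iterable[Tuple[str, str]]) -> float:
    """IPP sketch.

    papers: document type of each item the journal published in the three-year window.
    citations: (citing_type, cited_type) pairs for citations to those items.
    """
    citable = sum(1 for doc_type in papers if doc_type in CITABLE_TYPES)
    if citable == 0:
        return 0.0
    counted = sum(
        1 for citing, cited in citations
        if citing in CITABLE_TYPES and cited in CITABLE_TYPES
    )
    return counted / citable


def snip(ipp: float, citation_potential: float) -> float:
    """SNIP sketch: IPP divided by the journal's citation potential (supplied, not derived)."""
    return ipp / citation_potential


# Hypothetical journal: 10 citable items and 3 editorials.
papers = ["Article"] * 8 + ["Review"] * 2 + ["Editorial"] * 3
citations = (
    [("Article", "Article")] * 24
    + [("Review", "Review")] * 6
    + [("Editorial", "Article")] * 4   # ignored: citing item is not a citable type
    + [("Article", "Editorial")] * 5   # ignored: cited item is not a citable type
)
ipp = impact_per_publication(papers, citations)
print(ipp)             # 3.0
print(snip(ipp, 1.5))  # 2.0
```

Only 30 of the 39 citations count here, because both ends of a citation must be one of the three document types.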

That leads to the rather complicated question of “what is an ‘article’ really?” since “article” is in all the denominators and can make or break an Impact Factor [http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0030291]. My understanding is that Thomson Reuters uses an artisanal process based on expert review of the journal content and category-level assignment of document types (see: http://jama.jamanetwork.com/article.aspx?articleid=184527). We chose a fundamentally different approach that was available to us precisely because we were using the same content in the numerator as in that all-critical denominator (article type consistency). For the Scopus journal metrics “article” means the content classified in Scopus as Article, Review or Conference Paper. We use the classification by the original publisher for this in combination with general document type rules [http://www.elsevier.com/__data/assets/pdf_file/0007/69451/sc_content-coverage-guide_july-2014.pdf]. While it could be argued there is some incentive for the publisher to falsely deflate that count by tagging inaccurately, should they attempt to re-direct some content away from the denominator, the numerator also loses the citations that content gets and the average remains focused.
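The point about re-tagging can be illustrated with a toy comparison. This is a simplified sketch with invented numbers: it ignores the type of the citing document, and `mixed_ratio` is only a stand-in for any metric that counts every citation in the numerator but only “citable” items in the denominator. It is not a reproduction of either vendor's calculation.

```python
from typing import List, Tuple

CITABLE_TYPES = {"Article", "Review", "Conference Paper"}

Item = Tuple[str, int]  # (document type, citations received)


def consistent_average(items: List[Item]) -> float:
    """Citations to citable items divided by the number of citable items."""
    cited = [cites for doc_type, cites in items if doc_type in CITABLE_TYPES]
    return sum(cited) / len(cited) if cited else 0.0


def mixed_ratio(items: List[Item]) -> float:
    """Citations to ALL items divided by the number of citable items only."""
    citable = sum(1 for doc_type, _ in items if doc_type in CITABLE_TYPES)
    total = sum(cites for _, cites in items)
    return total / citable if citable else 0.0


before = [("Article", 10), ("Article", 8), ("Review", 6), ("Editorial", 6)]
# The same journal after one article is re-tagged as an editorial.
after = [("Article", 10), ("Editorial", 8), ("Review", 6), ("Editorial", 6)]

print(consistent_average(before), consistent_average(after))  # 8.0 8.0
print(mixed_ratio(before), mixed_ratio(after))                # 10.0 15.0
```

The re-tagged item was chosen to sit exactly at the journal's average, so the consistent figure does not move at all. In general it shifts only by as much as the removed item differed from the average, whereas the mixed ratio keeps the citations while shrinking the denominator.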

We also find that the metrics differ depending on which data source you use. When applying the IF algorithm to the Scopus database (restricting to sources only covered in JCR) the actual values of the metric are much different from the IFs published by Thomson Reuters. We believe that the main reason for this is the question of what is considered an article or citable document by each of the databases. The two different approaches are not necessarily right or wrong, it just reconfirms the fact that these databases are constructed differently and therefore will result in different outcomes.

It is not necessary to resurface the entrenched arguments over each approach. They are analogous, but not alike. Users are free to choose the metrics that provide the insight they need for the task at hand.

Fascinating, Wim, and thank you. That the results differ greatly suggests that the IF is more of an artifact than a measurement. Perhaps we could call the two versions the TRIF and the SCIF.

“Do you have any sense as to how the two impact metrics compare, or differ? For example, do they count citable objects the same way? How does Scopus identify and count them?”

Thank you for this question. Though you directed it at Wim, I hope you don’t mind me helping to differentiate between the two approaches. I have already spoken to the importance of consistency and selectivity in building all metrics in my earlier response. Consistency is not only how you select the content included in your calculations but, just as importantly, how you capture the content that arrives in different formats, with different coding, and different naming conventions. We begin the indexing of any new journal by creating an authority file, our own internal translation of what is coded by the publisher, which then dictates how we will tag, capture, and display the content in the product. This process relies on a human expert to identify each section of a journal, create the appropriate document types for that section and make sure that we follow that convention consistently. In other words, we do not just rely on the publisher’s metadata to index the Web of Science or to identify document types used in the JCR calculations. This continues to be a major differentiator in the two approaches and is a major factor in the reliability of the JCR and InCites metrics. This is why the Impact Factor can’t be replicated with Scopus data.

The actual creation of JCR goes a few steps further. Before any calculations are conducted, we go through several processes to make sure all the data has been tagged correctly, all journal title variants are identified and unified, and we study closely what will be counted as a citable item. As you can see, many of our processes involve human mastery of “big data” computations. Another difference alluded to by Wim is that JCR is more inclusive: it gathers all citations to a journal regardless of citing or cited document type, whereas Elsevier’s product is restricted to citations from and to articles, reviews, and conference papers. JCR also includes citations that are clearly directed at a particular title, but which lack metadata to pinpoint what item in the journal is being cited. By not insisting on matching citations to individual items (but to journal title and year), the impact factor is ‘fault tolerant’ in the sense that it bypasses the difficulty of mismatches in volume, page, year, first author, etc., etc., introduced by author, editor, publisher (or even indexer). Every citation is an acknowledgement of influence, and JCR is better equipped to measure that because of its greater inclusivity.
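The “fault tolerant” distinction can be sketched as two ways of matching a reference. This is an illustration under stated assumptions: the `Reference` fields, the function names, and the sample data are all hypothetical, and real matching involves title-variant unification and much more metadata than a first page.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Reference:
    """A cited reference as it appears in a citing paper (hypothetical fields)."""
    journal: str
    year: int
    first_page: Optional[str] = None  # may be missing or mistyped


def item_level_matches(refs: Iterable[Reference],
                       known_items: Set[Tuple[str, int, str]]) -> int:
    """Count references that resolve to one specific indexed item."""
    return sum(1 for r in refs if (r.journal, r.year, r.first_page) in known_items)


def title_year_matches(refs: Iterable[Reference], journal: str, years: Set[int]) -> int:
    """Count references that name the journal and a year in the window, item or not."""
    return sum(1 for r in refs if r.journal == journal and r.year in years)


known_items = {("J. Example", 2013, "101"), ("J. Example", 2014, "55")}
refs = [
    Reference("J. Example", 2013, "101"),  # clean reference
    Reference("J. Example", 2014, "56"),   # page number mistyped
    Reference("J. Example", 2014),         # no page given
    Reference("Other J.", 2014, "55"),     # different journal
]

print(item_level_matches(refs, known_items))                 # 1
print(title_year_matches(refs, "J. Example", {2013, 2014}))  # 3
```

The two imperfect references still count toward the journal under title-and-year matching, which is the trade-off being described: more citations captured, at the cost of not knowing which item earned them.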

I would also point out that while the formula for calculating the impact factor is simple and transparent and gets most of the research community’s attention, the Journal Citation Reports includes more than one metric. We offer short, medium and long-term analysis of journal performance in the JCR. We have the Immediacy Index, the Impact Factor, the Five-year Impact Factor, and the Citing and Cited half-lives. We have the Eigenfactor, which uses citation weighting based on the source of the citation and represents an approach similar to that of the SJR and Google Page Rank. We also offer subject-level normalizations for all of these metrics. I am happy for you to contact me offline if you have any further questions.

Thank you for allowing me this forum to respond. Thank you again Phil for bringing this issue to light.
