Manuscripts are complicated. They start with a summary and continue with a justification for the work and a comprehensive review of other work related to the topic. This introduction is followed by the methodology, arguably one of the most important parts of the paper. Once the “why” and “how” are explained, the paper lays out the results, discussion points, and conclusions.

Each section of the manuscript listed above helps to tell the story of why the author bothered to do the work, how they did it, and what they learned from it. Peer reviewers and editors are tasked with casting a critical eye on these parts:

  • Is the abstract clear?
  • Does the author explain the rationale for undertaking this work (novelty)?
  • Is the literature review complete? Does the literature review show a bias toward a specific conclusion?
  • Is the methodology sound? Can it be replicated?
  • Are the results reasonable given what the reviewer understands from the methods and introduction?
  • Are the conclusions consistent with the stated results?
  • Are the references complete and devoid of unnecessary puffery with citations to reputable works in the field?

Hold on — what’s that last one?

Reference lists. How closely are reviewers casting a critical eye on reference lists? Well, let’s take a step back and look at some opinions on citations in general.

There are some journals that limit the number of references an author can include in a paper. I never quite understood the point of this and assumed that the rationale had something to do with limiting page counts. It also seems like a useful exercise in figuring out that what you leave out is just as important as what you put in (a lesson taught to me by my freshman political science professor).

This practice of limiting citations does not always sit well with authors. Authors are often faced with requests from peer reviewers to add references to their paper. There are many reasons for this:

  • Omissions of important and relevant work from the literature review.
  • The author neglected to mention works that are contrary to what the author is presenting.
  • A peer reviewer takes advantage of their role and requests that the author add references to the reviewer’s works.
  • The reviewer is familiar with recently published and relevant papers that the author did not include.
  • The reviewer or editor suggests papers published in the journal in order to boost its Impact Factor.

What you don’t often see are reviewers asking an author to remove references. I understand from my friends in the humanities that reviewers do often heavily edit and review reference lists but I haven’t seen evidence (anecdote or otherwise) that this is a widespread practice in STM journals.

Arguments have been made regarding what a citation actually means. This discussion is coming to the forefront with the popularity of preprint servers. The question being asked is whether researchers should cite preprints that have not undergone peer review. I am not going to debate that here, but the next leap is whether journals should allow preprints to be cited and if so, should there be a clear indication that the paper has not been reviewed?

Is a citation a vote of approval for the cited work? Is it an acknowledgement of work contributing to the overall body of knowledge?

Did you know that there is an inflated importance of article citations? Of course you did. We talk about it all the time — in posts, in editorial board meetings, on op-ed pages, etc. We know that funding agencies use citations, that tenure committees use citations, and that authors in some countries are paid modest to huge bonuses for publishing papers in journals that have strong citation performances.

Recently, a debate started around the role that so-called “predatory journals” play in this vicious citation game. Bloomberg’s BusinessWeek published an article about pharmaceutical companies using Omics, a journal publisher sued by the Federal Trade Commission for deceiving researchers, to publish shoddy studies. After the article spends lots of inches describing the complaints and charges against Omics, it drops this bomb:

Bloomberg Businessweek found that researchers at major pharmaceutical companies, including AstraZeneca, Bristol-Myers Squibb, Gilead Sciences, and Merck, submit to Omics journals and participate in their conferences. Pfizer, the biggest U.S. drugmaker, has published at least 23 articles since 2011, including two since the FTC’s lawsuit.

The article goes on to posit that pharmas publish papers in these journals that won’t likely withstand the scrutiny of a quality medical journal. The pharmas benefit because the Omics journals are open access and they assume clinicians will have an easy time finding them via a Google search.

But is there something else going on? Again, from the BusinessWeek article:

Jeffrey Curtis, a physician and professor at the University of Alabama at Birmingham, worked on a rheumatoid arthritis study with Bristol-Myers that was published in an Omics journal within two weeks of submission. Companies “are often in more of a hurry and are willing to accept lower-tier journals,” Curtis says. “They want a citation. They want someone to be able to reference it and have it be official.”

The pharmas are counting on other researchers to use the work and cite the work. Those researchers may publish in very reputable medical journals. So back to the question of what a citation actually means. Is it a vote of approval? These citations are often not flagged because predatory publishers, notably Omics, own many journals with titles that are almost identical to those of real journals.

The implications for modern medicine are huge, but we are seeing this on a smaller scale as well.

Researchers that are living off citation metrics — untenured or soon to be unemployed or underfunded post-docs — are using citation metrics to prove their value. We even have an index for it, the h-index.

We see evidence that the h-index is gaining importance, with many scholars including it on their CV and on various applications. The h-index is a particularly nasty implementation as authors can game that system independently — no journal corroboration necessary.

I have recently discovered several papers where the authors are adding self-citations to a paper in review. The paper may have one or two self-citations in the original submission. A revision may come back with 15 more. And a second revision may have another 30 added. During the production process, still more may be added. The final published paper is now an h-index factory.

In these instances, I assume that the authors are counting on the editors and reviewers NOT reviewing the references after the initial review. Why would they? It’s not uncommon for a second review (if there is one) to be nothing more than a verification that requested changes were made. The editor and reviewers trust that the authors are not trying to sneak something past them.

With a recent paper I reviewed from an ASCE journal, many of the added citations came from papers the author published in Omics journals as well as other well-known questionable journals. Even if the references were carefully scrutinized, would the reviewers recognize that Irrigation and Drainage Systems Engineering is not ASCE’s Journal of Irrigation and Drainage Engineering? Would they notice that the Int. J. Hydraul. Eng. is not the same as the J. Hydraul. Eng.? Our incessant need to abbreviate journal titles in the references to within an inch of their lives is certainly not helping.

The gaming of the h-index led one group of researchers to propose a new index: the s-index. The authors point out the following in the abstract:

Incorporating superfluous self-citations in one’s writings requires little effort, receives virtually no penalty, and can boost, albeit artificially, scholarly impact and visibility, which are both necessary for moving up the academic ladder.

I don’t disagree with this argument but the creation of a new index may serve to muddy the waters even more. The lead author, Justin Flatt, had trouble defining “superfluous self citations.” In an interview with Richard Poynder, Flatt agrees that the research community needs to form a consensus around acceptable levels of self-citations.

Phil Davis argues that creating a new index is not the best solution. He recommends that the h-index be coupled with an h-index minus self-citations. Think of it as an h-index and an h – s index. A wide gap between the two numbers would show that a large portion of a researcher’s h-index comes from self-citations. Knowing this may tell a reader whether this author is a bona fide legend or just a legend in their own mind.
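
To make the pairing concrete, here is a minimal Python sketch. It is not anything Davis published: the function names and the five-paper record are hypothetical, and it assumes you already know how many of each paper's citations are self-citations.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def h_with_and_without_self(papers):
    """papers: list of (total_citations, self_citations) tuples, one per paper."""
    h_all = h_index([total for total, _ in papers])
    h_external = h_index([total - self_cites for total, self_cites in papers])
    return h_all, h_external


# Hypothetical researcher: five papers, heavy self-citation on three of them.
papers = [(10, 8), (8, 6), (6, 5), (5, 1), (4, 0)]
h_all, h_external = h_with_and_without_self(papers)
print(h_all, h_external, h_all - h_external)  # prints: 4 2 2
```

In this made-up record, half of the h-index disappears once self-citations are removed, which is exactly the kind of gap the two-number approach is meant to expose.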

Now that we have established the ways that researchers can inflate their h-index and capitalize on self-citations, we should talk about some solutions.

Critical review of the reference section is warranted. I looked at reviewer instructions across multiple publishers. Elsevier lists “references” as something that should be reviewed but it doesn’t say for what. PNAS and Wiley don’t mention references in their instructions. BMJ, Cell Press, and Taylor & Francis all recommend that reviewers ask if there are any glaring omissions of references but do not ask reviewers to review the references for quality or appropriateness.

I suggest that journals and editors consider the following steps in ensuring reference lists are helpful and appropriate:

  • Reviewers should be asked to look for gratuitous self-citations and ask the authors to justify the inclusion of those references in their rebuttals to reviewer comments.
  • Subsequent versions of reviewed articles should be evaluated for inappropriate references being added. This could be a reviewer task, editor task, or staff task.
  • References should be scanned for citations to known “predatory” journals. Once identified, an editor or reviewer can make a determination whether it’s appropriate to include. Conducting this scan would be labor intensive. It has been rumored that Cabell’s, who launched the journal blacklist earlier this year, is working on a tool for scanning reference lists for this purpose.
  • Let’s agree to stop abbreviating journal titles in references. The predatory publishers are taking advantage of journal brands and launching titles with small tweaks in the title. Abbreviating journal titles in the references abets that confusion and does a number on a journal’s search engine optimization.
  • Gratuitously citing your own work in inappropriate contexts should be considered an ethical issue and dealt with as such. COPE currently has no guidelines on this issue, but journals can take a leadership role in curbing this behavior.
  • Journals that discover gratuitous and inappropriate self citations after a paper is published should publish a correction noting that the identified references should not have been included as they are not relevant to the paper.
  • There should be zero tolerance for journal editors to insist on citations to the journal that are superfluous. Likewise, reviewers should not be permitted to provide a laundry list of their own works unless they are absolutely necessary for the paper.
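
Two of the suggestions above, checking revisions for added references and scanning for journals of concern, lend themselves to a little automation. Below is a rough Python sketch under some loud assumptions: the references are already parsed into dictionaries with `title`, `authors`, and `journal` keys, author names are formatted consistently, and the editor maintains a `watchlist` of journal titles. Every name and all sample data are hypothetical, and this is not the Cabell’s tool mentioned above.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting changes are ignored."""
    return " ".join(text.lower().split())


def added_references(original, revised):
    """Return references present in the revised list but not in the original."""
    seen = {normalize(ref["title"]) for ref in original}
    return [ref for ref in revised if normalize(ref["title"]) not in seen]


def flag_reference(ref, manuscript_authors, watchlist):
    """Return reasons an editor might want a closer look at this reference."""
    reasons = []
    cited_authors = {normalize(name) for name in ref["authors"]}
    if cited_authors & {normalize(name) for name in manuscript_authors}:
        reasons.append("self-citation")
    if normalize(ref["journal"]) in watchlist:
        reasons.append("journal on watchlist")
    return reasons


# Hypothetical manuscript, watchlist, and reference lists.
manuscript_authors = ["A. Author"]
watchlist = {normalize("Hypothetical Journal of Examples")}
original = [
    {"title": "Paper One", "authors": ["B. Someone"], "journal": "Journal X"},
]
revised = original + [
    {"title": "Paper Two", "authors": ["A. Author", "C. Other"],
     "journal": "Hypothetical Journal of Examples"},
    {"title": "Paper Three", "authors": ["D. Else"], "journal": "Journal X"},
]

report = {
    ref["title"]: flag_reference(ref, manuscript_authors, watchlist)
    for ref in added_references(original, revised)
}
for title, reasons in report.items():
    print(title, reasons)
```

A flag here is only a prompt for a human to look. Deciding whether a self-citation is gratuitous, or whether a watchlisted source is appropriate to cite, still takes a subject matter expert.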

If citations matter, then they matter. It seems that we, as a community of researchers and publishers, have determined that they do matter. Support for the Initiative for Open Citations seems to prove that point. If this content is valuable and the metrics around it are used to make massive decisions about funding science and who gets promoted, etc., then we need to stop ignoring reference lists and start casting a more critical eye on what’s going on there.

Discussion

10 Thoughts on "Turning a Critical Eye on Reference Lists"

An interesting read – I certainly hadn’t heard that some authors are using revisions as an excuse to sneak in self-citations. I have to say that I don’t think the sanctions you suggest are strong enough – for instance, if a journal discovers this only post-publication, then the paper should be withdrawn. As you say, it’s an ethical issue, and I suspect most other ethical issues wouldn’t be treated so lightly.

Apart from that, I wonder how many other poor citation practices have been enabled by reference software? One particular bugbear for me is that authors often cite an inappropriate secondary source (rather than either the original study, or an appropriate review), yet this is often hard to pick up at review.

Jake, I pondered retraction when the first instance came to light. I chickened out because there appeared to be zero precedent for that. There are guidelines about authors having incomplete references. Certainly leaving off a reference on purpose because it blows holes in your case is questionable. There are no COPE guidelines or other ethical guidelines that really address gratuitous self-citation. There is also a question of whether the unnecessary citations negatively impact the science presented in the paper. This gets stickier when predatory journals are being cited. Anyhow, a case could be made to retract the paper but at a minimum, corrections should be done. Sadly, whether retracted or corrected, those citations will still count toward an author’s h-index.

We did recently retract a paper with gratuitous self-citations but there was a far more serious reason for the retraction. As far as I can tell, this may be the first retraction that mentions self-citation abuse.

I am not familiar with reference software–I mean, I know what they are but I don’t know what the functionality is around importing reference lists. That may be something to explore.

Thanks for an interesting and timely analysis. At ScienceOpen we support researchers in exploring reference lists by opening them up to a variety of sorting and searching criteria: sort by citation number, Altmetric score, date, usage or filter by Open Access status, author, affiliation, and more. In a paper with a long reference list this can help users to get a quick overview, because, as you note, it can be quite tedious to critically tackle such a list. Here is an example from BioMed Central with 100 references:
https://www.scienceopen.com/document?vid=dc8c80f8-202b-44fe-899a-9218267d4ccf . Researchers can also export citations and of course we welcome critical analysis on the reference list via our post-publication peer review where we ask the questions: Do the authors reference the appropriate scholarly context? Do the authors provide or cite all information to follow their findings or argumentation? Do they cite all the relevant publications in the field?

I see the reference list you are referring to but I don’t see the functionality you mention. Exporting references is one of the issues raised here. Do those tools make it easier for an author to basically swipe a reference list (or in the case here with sorting, swipe the highly cited references) and use it in a new paper without actually reading those papers themselves? Kind of like how cut-and-paste technology makes plagiarism so much easier.

Your post-publication peer review platform is really way too late in the process for this critical review. Once the references are in the published paper, they are doing the job as intended. Having a review on your platform calling out gratuitous references does not really solve the problem, but it may bring awareness to the issue.

Just click on the “100 references” and it opens a search interface where the users can drill down and explore the list: http://bit.ly/2yK0DRefs .
We do try to make it easy for authors to add citation data into their reference manager, but of course we support good science. When our database picks up the same title with the same typo multiple times it is clear that a lot of copy and paste happens regardless.
And I do hope that the peer pressure of having colleagues easily notice that you have a long list of self-citations may be a factor in slowing down this process. Even if post-pub peer review is late in the process, it can still be a powerful motivator.

As when Phil Davis recently wrote about self-citations, I wonder what people’s thoughts are on how much self-citation is too much? I’ve thought that more than about 10-15% is likely self-serving. However, in “A macro study of self-citation” Dag Aksnes reported average self-citation rates of 17-31% across disciplines, with chemistry and astrophysics being most enthralled with their own works.

There’s self-citation and then just plain old sloppy citation. The writings of the commenter above, Ole Bjørn Rekdal, are a great read on this, such as Academic Citation Practice: A Sinking Sheep?  In Academic Urban Legends (paywalled), Rekdal mentions “‘Lazy author syndrome’: throwing a few keywords into a database to come up with an impressive list of references which at first glance cannot easily be exposed as secondary, irrelevant, unreliable, or sources not even read by the author.”

Excellent post. Your main point that reviewers, editors, and publishers need to take a critical eye to reference lists is well taken.

One of my favorite kinds of sessions at publishing conferences is when early career researchers are on a panel. It is one of the single greatest mutual learning experiences you can get. Time management is always a challenge for the ECRs. They complain about not being able to keep up with the new research and describe how they try to manage this with curation tools. There are always conflicting looks of horror when someone from the audience inevitably asks whether they read all the papers they cite. It’s almost like they are saying yes, while nodding no. I get it.

Regarding acceptable self-citation rates, that’s really very difficult to nail down. If someone spent their career researching a fairly specific topic, they may very well be one of the authorities and if that topic comes from a small community of researchers with loads of collaboration, then it seems reasonable that said author would appear on many bylines. But, do all the papers need to be cited in each and every subsequent paper? Unlikely. I do think that a subject matter expert needs to be the one to determine whether there are unnecessary citations.

Among the multitude of new citation metrics researchers have proposed over the last few years, I recall there was at least one where citations were weighted based on whether the actual cited article also had a large number of citations or not. Computationally this is more difficult, but coupled with removal of self-citations it might be much harder to game.

Such an approach would also provide a better idea of “impact” than a raw citation count. Compare article X, cited 20 times, where each of those citing articles was on average cited 10 times, with article Y, cited 20 times, but those citing articles on average cited only once. A multi-tiered metric like this indicates article X is more “impactful” than article Y, and would make it harder for citations in Omics journals and the like to bolster individual article metrics. It also takes longer to accumulate though, as you need to wait for not one, but two rounds of publication/citation to get useful numbers (the need for immediacy is why journal-level impacts are used as a more immediate proxy, but of course a more complicated metric like that described in this comment would work just as well at the journal level as at the article level).
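
Here is one way to read the two-tier idea, sketched in Python. The comment does not give a formula, so reporting the raw citation count next to the mean citation count of the citing articles is an assumption, as are the function name and the data for X and Y, which simply mirror the example above.

```python
from statistics import mean


def two_tier_summary(citing_article_counts):
    """Summarize an article by its raw citation count and by how well cited
    its citing articles are on average.

    citing_article_counts holds one number per citing article: how many
    citations that citing article has itself received.
    """
    raw_count = len(citing_article_counts)
    second_tier = mean(citing_article_counts) if citing_article_counts else 0
    return raw_count, second_tier


# Article X: 20 citations, each citing article itself cited 10 times.
article_x = [10] * 20
# Article Y: 20 citations, each citing article itself cited once.
article_y = [1] * 20

print(two_tier_summary(article_x))
print(two_tier_summary(article_y))
```

Both articles have the same raw count of 20, but the second number separates them: 10 for X and 1 for Y.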

In case someone is tempted to comment that the whole focus on “metric-isation” is unhelpful and a different approach is needed, I mostly agree; though metrics have their uses, and if they are being used already, it certainly doesn’t hurt to try to use better ones.
