[Image: Toll House cookies cooling on a baking sheet, via Wikipedia]

What does it mean when you claim a journal is peer reviewed? What does it mean when you say an article was peer reviewed?

Peer review is a major signal of quality, yet even its major brokers hesitate to describe it beyond the fact that it is achieved by sending out manuscripts for review by peers.

Peer review is a tool, not a standard. And not all tools are made of the same stuff.

But the label of “peer review” can be used in ways that sometimes seem downright misleading — one version is not like another, and some forms strike me as so cursory or ill-managed that they hardly qualify for the description. Yet, we never get beyond those magical two words “peer review,” so we never know what it really consists of.

One very down-to-earth editor compared peer review to chocolate chip cookie recipes — everyone has one, but some are better than others, and some bakers make great cookies simply using the recipe on the bag.

It’s partially the process, partially the ingredients.

I think it’s time we stop allowing “peer review” to be used as if it’s an immutable standard, as if it’s uniformly deployed and consists of the same ingredients every time.

We could improve peer review immensely simply by describing it in the same factual way we describe studies — with qualifiers like double-blind, randomized, placebo-controlled — to differentiate how various practitioners accomplish it. Then, we can better assess how well it’s being done, once we know which aspects of the process are being used.

After all, each journal has its own version of peer review, and there can be versions of peer review within journals (editorials, commentaries, and review articles are reviewed using a different process than research papers, for instance).

Innovations in content forms can create the need for new peer review approaches — for instance, editorial staffs have to invent ways to peer review videos, interactive educational exercises, or animations. Yet, if we don’t reveal these new processes somehow, users and readers are left with just “white label” peer review.

By merely slapping “peer reviewed” on our journals, we’re stating a fact but obscuring some important and potentially helpful information. How many people participated? How long did it take? It’s all good information.

And it’s not that hard to describe.

Here are some potential categories I’d like to see:

  • Number of outside reviewers
  • Degree of blinding (names and institutions eliminated from the manuscript, for instance)
  • Number of review cycles needed before publication
  • Duration of the peer review portion of editorial review
  • Other review elements included (technical reviews, patent reviews, etc.)
  • Editorial board review
  • Editorial advisers review
  • Statistical review
  • Safety review
  • Ethics and informed consent review

We have disclosure statements to deal with commercial influence. Why not a simple statement of peer review like:

This paper was peer reviewed by 3 outside reviewers blinded to the authors’ names but not institutions; it required 2 iterations of review prior to acceptance, and peer review required 14 weeks from start to finish; in addition to outside peer review, the paper was reviewed by a panel of editorial advisers, two statisticians, and a patient safety expert.

Simple statements like this would force shops that peddle sloppy peer review to state how they’re accomplishing it, how long it took, and a number of other helpful facts. And it would make them state their practices publicly and risk being discovered if they fudge the truth. It could end the abuse of peer review as a label some journals hide behind.

Imagine reading a paper that stated something like this:

This paper was peer reviewed by 1 outside reviewer not blinded to the authors’ names or institutions; it required 1 iteration of review prior to acceptance, and peer review required 1 week from start to finish.

Compare that to a grueling process for an ambitious study with surprising findings:

This paper was peer reviewed by 13 outside reviewers blinded to the authors’ names and institutions; it required 6 iterations of review prior to acceptance, and peer review lasted 43 weeks from start to finish; in addition to outside peer review, the paper was reviewed by this journal’s editorial board, a panel of expert advisers, technical reviewers, a patent reviewer, and a legal expert familiar with law in this area.

While you could infer from the second example that the paper in question was dicier in some manner, science is supposed to be dicey sometimes, and clearing a high bar shows the authors’ commitment, the journal’s high standards, and the dimensions that were probed. Readers might be curious to know how hard everyone worked to get the paper right, and it might provide the authors, reviewers, editors, and publishers with recognition for the amount of labor that can go into a difficult but worthwhile paper.

Right now, we rely on journal brands to imply levels of care, but these brands can change their processes or assert processes they don’t follow. As readers, we never know. And we certainly don’t know at the article level. Opaque labels like “peer reviewed” or a journal’s brand don’t tell us what we expect to know about other important aspects of a particular paper, like study design, author disclosures, materials and methods, or affiliations. So why do we accept such substitutions for peer review descriptions?

Peer review shouldn’t be sold without an ingredients list.

It’s simply too easy to claim something is “peer reviewed,” and there can be downsides to the credibility of scholarly publishing overall if we continue to allow this label to mask differences in approach, capabilities, and process.

Unlike some, I don’t think “open” peer review is the answer. There are many good and compelling reasons to keep peer review a private editorial process — not only does it allow for the necessary negotiations between journals, authors, and reviewers to be conducted in a safe environment, it also lets rejected authors save face, learn from reviews, and resubmit elsewhere without a blood trail.

At the same time, there’s no reason not to boast about an excellent approach to peer review. Top-tier journals sweat peer review, and review in general. But because the term is opaque and undifferentiated, they’re being lumped together with lesser peer review approaches. That doesn’t seem fair, and it certainly doesn’t help them in the micro sense, or science in the macro sense.

I say, describe it, be proud of it, and let the others try to match you.

Let’s describe peer review just as we describe funding, study designs, and methods. It’s integral to scientific integrity, misleading on many levels merely as a label, and a differentiator that should be unpacked.

At least readers would know which recipe was used and what the ingredients were.

Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.

Discussion

31 Thoughts on "Improving Peer Review: Let’s Provide an Ingredients List for Our Readers"

Before I criticize, I would like to point out that the thrust of your argument makes sense, but to continue the metaphor, I think the details remain somewhat ‘half-baked’.

I agree that not all publishers conduct the same level of peer review for papers, and you quite rightly point that out. But I think you neglect to acknowledge that not all published research is the same either. I can think of plenty of peer reviewed scholarly communication forms where

3 outside reviewers blinded to the authors’ names but not institutions; it required 2 iterations of review prior to acceptance, and peer review required 14 weeks from start to finish; in addition to outside peer review, the paper was reviewed by a panel of editorial advisers, two statisticians, and a patient safety expert.

would be far from the gold standard for peer review. Setting out a schema for ranking peer review without this distinction drags us down into a contest where certain types of paper are given a significant (and perhaps unfair) advantage over others.

Your criticism makes no sense to me, frankly. How can you know what the comparisons will be like if no information is being disclosed? Describing peer review would allow others to track, compare, and ultimately determine a norm for journals by field. Internally, journals know how many reviewers it normally takes, what the average turnaround time is, and the range of effort for papers that go through without a problem vs. those that take extra effort. Why not know these things as a reader? Across journals?

Nobody knows what a “gold standard” for peer review is because nobody can see it across journals. Each paper is different, but how different is the peer review? Does some aspect of it, measurable from the statements I’ve dreamed up, correlate to impact factor? To general prestige? Or not?

Right now, we’re working on assumptions and trusting that various black boxes are functioning more or less the same. Why not get some measure of the machinery inside to discern more?

Perhaps I was too harsh in my initial comment. All I am putting forward is the suggestion that publishing raw numbers cannot capture the subjective nature of peer review. Some papers will need to be reviewed more quickly, and publishing the numbers without the context may lead to unfair comparisons.

Understood. However, in my experience, good peer review journals accomplish robust peer review on short schedules when needed. Speed doesn’t necessarily mean fewer reviewers. In fact, imagine if you had a statement like, “This paper was reviewed by 5 outside reviewers blinded to the authors’ names and institutions; it required 2 iterations, and peer review required 3 days from start to finish.”

I think that would be pretty impressive.

A relevant blog post: EMBO journal introduces transparent peer-review, Maxine Clarke, Peer-to-Peer, January 5, 2009 (http://bit.ly/EZE2k). Excerpt: “Beginning with manuscripts submitted in 2009, a supplementary process file will be included with the online publication of papers. This file will show all dates relevant to manuscript processing and communications between the author, editors, referees and comments to the decision letter.”

Kent,
You make a strong argument for disclosing the details of the process – the “how” of peer-review.

Equally important to the peer-review process is the “who,” although it is difficult to reconcile reviewer disclosure in a blinded process.

Peer-review is ultimately limited by the knowledge and commitment of one’s peers to improving a piece of research. For that reason, there is only so far an ingredients list can go.

Would it be possible to *describe* the reviewers’ qualifications without disclosing them? After all, Entertainment Weekly manages to do this without disclosing identities (“…a long-time producer of serious dramas…”) when asking movie industry personnel to handicap the Oscars.

Would it help you to know, say, that the immunology paper on T-cells that you just read was peer reviewed by 3 scientists who are immunologists who work on T-cells?

Perhaps publishing the journal’s averages would be more relevant and helpful. Our journal already publishes, in our instructions for authors, the average time it takes to review a manuscript and the average number of reviewers involved in evaluating each one; putting that information in the actual manuscript might be a good way to make it more available to readers. However, I agree with Mr. Anderson that neither the quantity of reviewers nor the time it takes to complete reviews equals a quality review process, and providing such info implies that there is value in it. Who reviews is sometimes more important than time or quantity. Yet I do not think it is helpful to reveal who, because authors may seek out reviewers who have recommended acceptance for other manuscripts related to the type of treatment or school of thought presented in their own manuscripts, at least for any journal that allows authors to suggest reviewers.

Is there a specific problem you’re trying to solve here Kent? Are readers being fooled by unscrupulous journals that practice poor quality peer review? Wouldn’t something like that be evident in the quality of the articles published? Wouldn’t a poor review process result in poor quality papers being published, rife with errors and unproven conclusions? Readers are pretty good at figuring out which journals publish quality work and which journals are churning out garbage.

That said, I do think that detailing the peer review process in the author instructions might be helpful for authors, letting them understand what they’re going to be going through.

I’m not sure many of your suggestions would be particularly meaningful though. What does the duration of the review process say about the quality of the process or the paper? It often takes me a month to find a reviewer willing to take on a particular paper. Does that mean the paper is better, or worse than a paper where I find a reviewer right away? Some reviewers look at the paper immediately, others wait until a few days after their deadline. Again, does this say anything about the quality of the paper or the review?

The same goes for the number of iterations of the paper. A simple paper that’s clear and well-written may have one iteration, while a paper that’s much more interesting and groundbreaking may require several iterations to iron out the kinks. Does this say anything to the reader about the quality of the work in either? Or the review process?

As for double-blinded reviews, the idea is often raised, but realistically, it’s pretty easy to tell who has written a paper if you know your field well, and if you look at the paper’s citations (most authors are going to cite their own previous work). If the editor does a good job and gets the right reviewer for a paper, someone who knows the field well, then the identity of the author is going to be obvious to them.

The problem I’m trying to solve has to do with journals that claim peer review but do it poorly or thinly. The goal is to force disclosure of what is a crucial differentiator and to clearly delineate important information for readers.

This comment took 32 seconds to write and I only read 63% of David’s comment before responding.

As I asked, is it really that hard to tell when a journal is doing a bad job of quality control? I am a big supporter of transparency, but I also think that disclosing lots of meaningless information just creates more noise, rather than honing the signal.

And what do your 32 seconds and 63% figures say about the quality of your response? If you are inherently brilliant, that seems like more than enough time to respond. If you’re a dope, a few hours contemplating might have been advisable.

Perhaps we’re better off judging the quality of an article (or a response) by reading and assessing the actual quality of that article/response, rather than trying to do so from a bunch of random statistics about the production process.

You could ask the same thing about conflict of interest. Is it really that hard for insiders to judge when an author has a conflict? Probably not, and probably never has been. But we need disclosure to reveal something about the relative degrees of conflict and to create pressure on authors. I think a similar thing can be said for peer review, but it applies to journals. Insiders can probably tell when quality control is good, but they may not know what occurred to reach that point, or what steps were used relative to other similar studies they might find elsewhere.

For instance, assume you had two studies with similar designs from relatively prestigious journals that somehow reached divergent conclusions. Right now, you’d say, “Well, they were both peer reviewed, so that’s equivalent.” With descriptions, you might look at one and note that it included 2 outside reviewers, 2 statistical reviewers, and an editorial panel discussion and required 3 rounds of review while the other only had 2 outside reviewers and was accepted after the first pass. You might think the first paper’s results to be a little more robust since the version of peer review they cleared was more robust. It might actually help you lean in one direction or another.

Or, what if you see a study with results you think are highly questionable and wonder exactly how it got into the journal you’re looking at (don’t tell me this has never happened). You might read the note about how the peer review occurred with great interest, and use that information in a letter to the editor. Right now, you don’t have that option.

Or, you might want to compare studies across journals to see if studies that used statistical reviewers were more highly cited than those that didn’t.

Or you might want to see whether studies that originated from non-English-speaking countries required more iterations of peer review than those that didn’t.

I think there could be specific instances in which the information is useful, interesting, or relevant, as well as general trends that publication of such descriptions could help us identify.

Kent’s post is on a favorite topic of mine. I think the “problem he is trying to solve” is that researchers have less time than ever to read more information than ever. Some method by which they could weed out the “peer review lite” material from the higher quality peer reviewed material would be incredibly valuable. I think a method, such as David Smith mentions, that would “codify” this is very interesting and something I personally have given a lot of thought to.

Do the things Kent has mentioned here (time of review, number of reviewers, etc.) really guarantee a quality review? As someone noted below, the peer review process for each paper is a singular, unique event. The numbers tell you nothing about how rigorously and accurately the review was performed. One rigorous expert reviewer who spends one day on a paper can be vastly more valuable than 100 shoddy reviewers who spend weeks glossing over errors.

To stretch a metaphor a bit, I think it’s like the difference between the weather and the climate… On any given article, the metrics Kent talks about may not be particularly useful, but show that same data in context (whatever that may be) and suddenly interesting things may be seen. What extra or different things would you like to see measured?

I’m not sure it’s an easily quantifiable process. Yes, it would be nice to see each journal’s process explicitly stated (though even this can vary paper to paper). That might help judging a journal as a whole, but I’m not sure if it tells you anything about a given article.

The only thing I can think of that would be of value would be knowing the identity of the reviewers, seeing their comments to the author and the editor and seeing the author’s responses to those comments. But that kind of destroys the whole concept of anonymous peer review.

Kent–starting a new thread to avoid things getting squashed down…

I think the questions you’re asking are valid and interesting. But they’re the sorts of questions that interest someone in publishing, or someone studying the publishing of science/the culture of science, not an actual scientist/reader. For the scientist, the key question is, and must be, is this paper accurate, do the data support the conclusions?

I was trained to read scientific papers skeptically. I don’t believe a single word any author says unless the data presented backs it up. This has to be the approach taken for a reader, regardless of the journal or the peer review process involved. Any scientist who believes that an article in journal X has to be accurate because it’s in the high quality journal X should probably consider a different career. The question of “how did such a crap article end up in a good journal” is always good water cooler fodder, but has little impact beyond that. What matters is carefully reading it and realizing that it is crap.

Because peer review is qualitative, the quantitative data you’re asking for here is not informative. Let me give you two scenarios where two papers are published looking at the same phenomenon but draw different conclusions:

Scenario 1
Paper 1 was peer reviewed by 5 experts in the field, no revisions were required, and the whole process took 2 weeks. Paper 2 was peer reviewed by 1 researcher who demanded several series of additional experiments; it went through 6 iterations, and the process took 9 months. In this case, paper 1 was a work of genius: the experiments were perfect, the writing was superb, and the conclusions were supported 100% by the data. Paper 2 was written by a non-expert in the field; the reviewer took the authors to task, made them run actual controls, and was still never satisfied with the conclusions, but got tired of going through the paper over and over again and just gave in.

Scenario 2
Paper 1 was peer reviewed by 5 experts in the field, no revisions were required, and the whole process took 2 weeks. Paper 2 was peer reviewed by 1 researcher who demanded several series of additional experiments; it went through 6 iterations, and the process took 9 months. In this case, paper 1 was written by a big name in the field and was rubber-stamped by 5 of his cronies who didn’t really read the paper. Paper 2 was reviewed by an expert in the field who knew what the author of paper 1 was going to report, so he insisted on an extremely rigorous review process, extra controls, and complete confirmation, as he knew the author was going to have to face a lot of questions from the big-name author who was in the wrong here.

How do you tell the difference between the two scenarios using your suggested reporting data? The exact same data can be indicative of high quality or low quality, and without context, the numbers are meaningless.

I think the biggest effect this could have is on the macro level — demanding that journals describe their process for each paper. It won’t substitute for critical thinking, but could add information to aid it. On a cost:benefit basis, it makes sense — the cost is little (write a description), but the benefits on a paper-by-paper basis and a macro level could be significant. So, we could debate until we arrive at a perfect solution, or we could move a beneficial approach forward and refine it as we go. I feel this is better than what we have now (silence and assumptions about equivalency).

I still don’t see what you would gain from knowing that a paper took 3 weeks to review or 4 weeks. Either the reviewers did a good job or they didn’t. The proof is in the pudding, in the paper itself.

And as I’ve said repeatedly, the criteria you’re using don’t tell you anything positive or negative about the paper. Is a paper that went through 2 revisions better than a paper that went through 5 revisions? Which is more trustworthy, the paper that was accepted more quickly or the one where lots of additional work was done to improve it? How would you ever know?

It won’t matter for every paper, but it would add some information to every paper. What if you thought the statistics were crap, but you saw that the paper was reviewed by 2 statistical reviewers? Good information? Changes your thinking for an instant? Adds to the paper? Again, it’s simple to add, can matter in some cases or to some minds, and doesn’t detract. On a macro level, it could have the salutary effect of making every journal disclose how it accomplished peer review, raising the bar generally. Nothing wrong with that.

I’m not sure it changes anything–either the paper is right or wrong. Sometimes the editor does a bad job of choosing peer reviewers, sometimes the peer reviewers do a bad job of review. Even statisticians can be wrong, and if you’re reading the literature with the proper level of skepticism, the reviewers have the same burden of proof as the author.

I’m not sure how much pressure would be applied if all journals disclosed on a macro level. If my journal uses 3 peer reviewers and a competitor uses 4 reviewers, would I feel obliged to add an extra reviewer?

Peer review also gets waved around as the magical thing that will prevent “bad” or “biased” manuscripts from reaching print — but that’s not always possible: bad science, if it fits into the current theories about how the world works, _can_ get published. While we are discussing making the process and the people more visible, perhaps there can also be a discussion of what peer reviewers themselves believe their roles as readers to be.

This would be a data goldmine if it were moved out beyond a written statement and instead (or as well as) codified in a machine-readable way. Then we could look at peer review vs. citation metrics, or look at which areas of a particular branch of research are struggling to get through peer review. You could look at author dynamics at the article level (a double-edged sword, especially if tied to funding…). I wonder what the climate debate would look like if all the climate papers had this metadata exposed this way for visualisation — the poor papers in the IPCC report would stand out clearly, to the benefit of all, I think. Think of the value of this data out beyond our direct communities. I would LOVE to see what people could do with this sort of information, and I think it would benefit the perception of the scholarly process for the world at large.
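For illustration, here is a minimal sketch of what such a machine-readable record might look like, assuming an invented schema built from the categories proposed in the post (the field names are hypothetical, not drawn from any real standard):

```python
# A purely hypothetical peer review metadata record; every field name
# is invented for illustration, based on the categories in the post.
import json
from dataclasses import dataclass, field, asdict

@dataclass
class PeerReviewRecord:
    outside_reviewers: int            # number of external reviewers
    blinding: str                     # "names-only", "names-and-institutions", or "none"
    review_cycles: int                # iterations of review before acceptance
    duration_weeks: int               # peer review time, start to finish
    additional_reviews: list[str] = field(default_factory=list)

# The post's example disclosure statement, expressed as structured data:
record = PeerReviewRecord(
    outside_reviewers=3,
    blinding="names-only",
    review_cycles=2,
    duration_weeks=14,
    additional_reviews=["editorial advisers", "statistical", "statistical",
                        "patient safety"],
)

# Emitting plain JSON lets the record ride along with article metadata
# and be indexed or compared across journals.
print(json.dumps(asdict(record), indent=2))
```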

Beyond studying how science is done, and how science is published, the other great use for this sort of information is internal quality assurance at a journal. Correlate the citation rate, errata issued and papers withdrawn with the details of the peer review process and you might come up with ways to improve what you’re publishing. It’s also perhaps a good way to measure an editor’s performance in the peer review process.
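As a toy sketch of that kind of internal analysis, assuming records like the one above exist and using made-up numbers purely for illustration:

```python
# Toy example: does the number of review cycles track citations?
# The data below are fabricated solely to show the mechanics; a real
# analysis would control for field, article type, paper age, etc.
from statistics import correlation  # Pearson's r; Python 3.10+

# (review_cycles, citations) pairs for hypothetical papers
papers = [(1, 4), (2, 9), (3, 15), (2, 7), (4, 21), (1, 3)]

cycles = [c for c, _ in papers]
citations = [n for _, n in papers]

print(f"review cycles vs. citations: r = {correlation(cycles, citations):.2f}")
```

With such records attached to every paper, the same few lines could run across a journal’s entire archive, or across journals.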

I agree that peer review is not one process and that it could make sense to describe exactly what process has been applied for any article.

The implication of your piece, however, is that more process (more reviewers, more blinding, etc.) is better, and I don’t think that you have any evidence to support that. It fascinates me how most debates about peer review are conducted without any reference to the now-substantial evidence base on peer review. That evidence base finds that peer review is slow, expensive, poor at detecting errors, largely a lottery, prone to bias and abuse, and anti-innovatory. There is very little evidence of an upside, although people continue to believe passionately in peer review, making it, I’d argue, a “faith-based process.”

The “real peer review,” which determines the value of a piece of work, happens in the market of ideas once a piece is published. My position is that the sooner we get to that and the less time we waste on “prepublication peer review” the better.

The implication that “more is better” is not what I got out of Kent’s post or many of the responses. The problem being pointed out is that a lot of published research claiming to be “peer reviewed” has actually, to use your own words, gone through a system that is “…poor at detecting errors, largely a lottery, prone to bias and abuse, and anti-innovatory. There is very little evidence of an upside, although people continue to believe passionately in peer review, making it, I’d argue, a ‘faith-based process.’” BTW, I do not entirely agree with your negative assessment of the peer review process. It is not perfect, but when done right by good editors and good reviewers, it is pretty darn good IMO.

What Kent and others are suggesting (I think?) is some type of tool that will allow potential readers to sift through the enormous number of articles being published.

“More” does not necessarily equal better. A paper reviewed by “1 Kent 2 times” certainly would have undergone higher quality peer review than had it been reviewed by “3 Adams 3 times.” 😉 Some method to quantify that could be very useful.
