Back in late 1998, I thought myself clever by launching “P3R,” or “post-publication peer-review” at Pediatrics. It was our attempt at e-letters, and P3R was the marketing spin I decided we’d try. It worked pretty well, actually, with one memorable statistical correction coming from a grocery store clerk, and more (and better) e-letters coming in than we expected. But over the intervening years, I’ve come to realize that marketing exaggeration and hard reality probably don’t meet on this one.

We had online letters. We didn’t really have post-publication peer-review.

Peer-review is a special animal with an erratic history. First used by the Royal Society of Edinburgh in 1731, where individuals “most versed in . . . matters” were consulted before collections of medical articles were published, it fell out of favor in the 1800s as an era of editor-centric letters journals came to the fore. As scientific publishing grew in size and complexity, peer-review was reintroduced to deal with the onslaught of papers, especially after World War II.

Peer-review has such a key role in the social goals of science that it is governed and protected by laws, some federal, some state. In medicine, peer-review within hospitals is also regulated by the Joint Commission on Accreditation of Healthcare Organizations (JCAHO). According to these bodies, there are several ways peer-review can be validly accomplished, with the best approach being double-blind peer-review (neither the author nor the reviewer learns the other’s identity, as opposed to single-blind peer-review, in which the reviewer knows the identity of the author):

  1. The review must be objective and comprehensive. The people doing the review must be in a position to render a fair and unbiased opinion, and must look at all sides of a case undergoing review.
  2. The reviewer must be a true peer. This can be a case-by-case judgment, but the principle applies.
  3. The review must be uniform in nature. That is, if someone is excluded from a review because of a potential bias, then all others with the same or similar biases must be excluded as well.
  4. The review is usually confidential, which is why it’s afforded protections under US law.
  5. A peer-review committee or group must be defined in some manner.

Now, not all of these rules are applied in the same manner to health organizations, legal peer-review systems, or scientific journals broadly speaking. Health organizations work differently, legal scholars and bodies work differently, and journals work differently still. These differences only underscore how careful we have to be with terms. Even Wikipedia distinguishes its peer-review process from academic peer-review, noting that articles undergoing academic peer-review should be presumed to have greater authority. What JCAHO considers peer-review is slightly different from what journals consider peer-review.

Between journals, there’s a lot of variability when we talk of peer-review. Most journals strive to have at least two independent reviewers evaluate each manuscript chosen for peer-review, but the level of review required, the disclosures required of reviewers, the depth of review, the grading of reviews and reviewers, and the time given for review can vary between and, in some cases, within journals, depending on the speed required and the type of article.

Yet, key constants remain, especially the desire to select true peers to do the review, to have objective reviews (often leading to some form of blinding), and to have a uniformity of reviews and reviewers, usually through a structured system of training, grading, and ranking.

When we move to the system currently being called post-publication peer-review, we leave many aspects of peer-review behind:

  • The authors of the paper are known to the reviewers.
  • The identities of the reviewers are disclosed to the authors and to subsequent reviewers.
  • The reviewers are not pre-qualified as true peers.
  • The reviewers are not identified as part of any formal committee or peer-review group.
  • There is no uniform ranking or grading system.

There is evidence that peer-review systems devoid of these special characteristics fail. The well-known trial of open peer-review, conducted by Nature in 2006, studied the use of an open approach at a large, multi-disciplinary, high-status journal. Over the course of four months, 71 articles were posted for open comment; of these, 33 received no comments at all, while the comments on the rest proved scant and unhelpful. One difficulty was in getting substantive comments from experts in the area (peers). As one paper reviewing the findings put it:

It is simply unrealistic to expect informed, well-argued opinions from those who have not been specifically tasked with the job of supplying them.

With open pre-publication peer-review failing so dramatically, why are we holding out hope for an even more exposed, less incentivized system of post-publication peer-review?

For us to truly enact “post-publication peer-review,” it seems we need to actively build out that exact capability. It’s not impossible, but we can’t lazily pass off comments and letters as the same thing as “post-publication peer-review,” not without devaluing peer-review overall. As I see it, a system that would qualify as post-publication peer-review would have the following features (a rough sketch in code follows the list):

  1. Reviewers would apply to be post-publication peer-reviewers, have their applications approved by the editor, and then view the contents of the site with identifying characteristics (author names, institutional and other sponsors, etc.) removed.
  2. The reviewer could request that his or her identity be suppressed on the review; this would be optional, and could depend on the substance or direction of the review.
  3. Post-publication reviews would be clearly distinguished from reader comments or letters to the editor, both of which have a place, but are not technically post-publication peer-review.
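To make the shape of such a system concrete, here is a minimal sketch in Python of the workflow those three features imply. Every name in it (the classes, fields, and methods) is a hypothetical illustration, not a description of any existing platform:

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Manuscript:
    title: str
    body: str
    author_names: List[str]  # identifying details, never shown to reviewers
    sponsors: List[str]      # institutional and other sponsors, also hidden


@dataclass
class Review:
    reviewer_name: str
    text: str
    suppress_identity: bool = False  # feature 2: suppression is optional

    def display_name(self) -> str:
        return "Reviewer (identity withheld)" if self.suppress_identity else self.reviewer_name


@dataclass
class Comment:
    # Feature 3: reader comments are a distinct type, kept separate from reviews.
    commenter_name: str
    text: str


class PostPublicationReviewSite:
    def __init__(self) -> None:
        self.approved_reviewers: Set[str] = set()
        self.reviews: List[Review] = []
        self.comments: List[Comment] = []

    def approve_reviewer(self, name: str) -> None:
        """Feature 1: the editor approves each applicant before granting access."""
        self.approved_reviewers.add(name)

    def blinded_view(self, ms: Manuscript) -> str:
        """Feature 1: reviewers see content stripped of identifying characteristics."""
        return f"{ms.title}\n\n{ms.body}"  # author names and sponsors omitted

    def submit_review(self, name: str, text: str,
                      suppress_identity: bool = False) -> Review:
        if name not in self.approved_reviewers:
            raise PermissionError("Only editor-approved reviewers may submit reviews.")
        review = Review(name, text, suppress_identity)
        self.reviews.append(review)
        return review
```

The design choice that matters is the gate: reviews pass through editor approval and blinding, while reader comments would flow through a separate, ungated channel.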

Even if we were to create such a system, the issue of incentives rears its big, ugly head — incentives are always present in science and scientific publishing; they’re what make our world go around. What are the incentives for someone to write a post-publication peer-review? As someone noted at a meeting I attended last week, the more effective and incentive-driven response to research you don’t believe to be correct or sufficient is to conduct and publish a better study — you get academic credit for this, which is precisely in keeping with the incentive systems we have in place. But putting your name to a criticism that might tip off another scientist about a great new study to perform? Not a smart move.

It’s interesting to contemplate how our current implementations of what some are calling “post-publication peer-review” can lead to what is known as “sham” peer-review. As defined by Wikipedia, sham peer-review is:

. . . the abuse of a medical peer review process to attack a doctor for personal or other non-medical reasons

Today’s commentators seem to have many axes to grind. Far too often, commentary forums degrade into polemical attacks with win or lose dynamics at their heart. The pursuit of knowledge and science isn’t the goal. Capitulation of one combatant to another is.

Peer-review is at the heart of scientific communication and validation. It’s not perfect, and it has many forms and shadings within those forms. However, the attempt to market comments as post-publication peer-review — as something akin to true, single- or double-blinded peer-review by true peers — seems doomed to failure. In fact, there is every indication that it is failing.

Perhaps it’s failing because it’s a failure.

Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.

Discussion

28 Thoughts on “The Problems With Calling Comments ‘Post-Publication Peer-Review’”

Very interesting. I had no idea that there are actual laws regarding peer-review. It’s hard to see how such a thing could possibly be enforced, especially as nearly all journals use international reviewers.

I certainly agree that blog comments and similar are not at all equivalent to what we usually mean by “peer review”. On the other hand, I’m not convinced that all the aspects of traditional peer-review are particularly positive. The asymmetry of anonymity is particularly troubling: reviewers know who authors are but not vice versa. One fix, as you suggest, is bidirectional anonymity. That might work for some fields, but not for others. For example, every time I get a sauropod paper to review, the limited population of workers in this field makes it pretty clear who the author is even before I see the name. So in those fields, at least, the better option is to go the other way: get everything out in the open. Authors and reviewers know who each other are, and the reviews themselves are published. (I know this is one of several concepts that has been described by the term “open peer-review” — a term that unfortunately may be too ambiguous to use.)

You’re right that incentives are the big issue here. I think that whether peer-review is done before or after publication, it will still need to be explicitly solicited by editors, at least for a while yet — until we get a very different network of incentives. I am another one of those who in theory approves of fully open opt-in review, but who would never get around to actually volunteering.

What is needed here is a clear understanding of the goals of peer review, not just the forms. If the goal is acceptance for publication, versus rejection, as it often is, then post pub peer review seems like an oxymoron. If the goal is simply to gather ideas or spot issues or identify weaknesses then rigor may well be unnecessary. Commenting works well in the consumer product domain.

If a system is not designed to do what you want it to do then it usually won’t do it. Post pub peer review is no exception.

I do not think that the goal of peer-review is to decide acceptance or rejection any more, at least not in the domains I work in or know other people working in. The goal is rather to decide in which journal to publish: most serious papers can make it to publication (by serious I mostly mean rigorous and new), and some non-serious ones make it too. As far as I know, the idea of applying a cruder filter before deciding to publish, and then using post-publication peer review to assess the importance of the paper, was meant to avoid the burden of papers being reviewed several times in journals of decreasing importance.

At the journals I’m most familiar with, there are layers of peer-review. First, a review for appropriateness, interest, and observable quality (really a “pecking order” review) cuts 20-60% of papers at the desk; second, a blinded external review process tests the quality of interesting and relevant papers and culls another 20-60%; finally, statistical and technical review may trim another 5-15%. Ultimately, the only decision that matters is acceptance or rejection.
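To see how those layers compound, here is a back-of-the-envelope calculation in Python. The specific cull rates are assumed midpoints of the ranges above, chosen purely for illustration, not figures from any particular journal:

```python
# Cumulative effect of layered review, using assumed midpoint cull rates.
submissions = 1000

layers = [
    ("desk ('pecking order') review", 0.40),  # midpoint of 20-60%
    ("blinded external review", 0.40),        # midpoint of 20-60%
    ("statistical/technical review", 0.10),   # midpoint of 5-15%
]

remaining = submissions
for name, cull_rate in layers:
    remaining = round(remaining * (1 - cull_rate))
    print(f"after {name}: {remaining} papers remain")

# With these assumptions: 1000 -> 600 -> 360 -> 324, i.e., roughly a
# third of submissions survive all three layers.
```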

The reason all these layers work is incentives. The only incentive in science I see working after publication is further publication. Comments don’t usually count as publication events, and letters to the editor are minor publication events; so the best route, if you think the study you’re reading is flawed or wrong, is to conduct a better study and get it published.

I do not really understand the relevance of this comment as an answer to mine:

1. “Ultimately, the only decision that matters is acceptance or rejection.”
That may be the case for a given journal, but not for an author (who can resubmit) or for the system as a whole (which will see resubmissions). Just because a journal does not see a paper after rejection does not mean its story is over!

2. Concerning incentives: I fail to see the incentive of working on another’s paper instead of one’s own, regardless of whether it is post- or pre-publication peer review, apart from the obvious ones (it’s part of the job, and an influential editor asks you to). What difference does the timeline make in incentives?

You said the goal of peer-review isn’t to determine acceptance or rejection, but for each journal, that’s the decision that matters, and for the ranking through the pecking order, that’s the decision that matters. Of course it’s all about rejection or acceptance.

One of the incentives of working on another’s paper is early knowledge of research outcomes and study endpoints. There’s also the prestige. Both factor in.

Most papers find a home somewhere, but that home makes a difference to the author’s career, funding, and subsequent status. Acceptance or rejection matters.

Interestingly enough, Clay Shirky seems to agree with you and is now putting forth the idea that crowdsourcing results in hijacking and gamed systems:
http://news.change.org/stories/clay-shirky-argues-for-social-innovation-incubators

I think you’re right as far as the terminology goes: the sorts of things that are routinely called “post-publication peer review” should instead find some other handle (“post-publication discussion and metrics”?). I do think there’s still value in these things, but it is a different value than the one provided by pre-publication peer review.

I suppose the real test of post-publication peer review will happen with F1000 Research, where everything submitted will be published (after an initial “sanity check”) and reviewers will then be invited to perform a PLoS ONE level peer review (judging accuracy rather than significance). It will be interesting to see whether they can drive participation, and if so, how the community reacts to a journal that contains a significant number of publications that fail to pass this review.

Yes, it’s a shame that a helpful metaphor (“post-publication peer-review”) seems to have solidified into an unhelpful definition.

BTW, I’d like to avoid the term “PLoS ONE level peer review”, since the difference between peer review at PLoS ONE and elsewhere is not one of degree, but of kind. The quality requirements there are as strong as in a typical journal, but impact requirements are simply not imposed. I know you know this, but for the sake of people who don’t yet get it, I think it’s better to avoid terms like “PLoS ONE-level”.

Mike, can you suggest a better term to use? We seem to have an endless supply of comments that continue to come in on older articles about PLoS ONE from people defending the rigor of PLoS ONE’s review process.

There’s a semantic misunderstanding here. PLoS ONE uses fewer criteria in their review process. Journal X reviews for accuracy and significance. PLoS ONE reviews for accuracy. Both review what they review at a high level of rigor (not surprising since they’re both drawing from the same pool of reviewers). But simply put, 1 + 1 = 2 and 2 is a bigger number than 1. So words that are often used to describe PLoS ONE’s review process are often words that are meant to reflect that smaller set of criteria (“peer review lite”) but unfortunately end up being taken as pejorative, as implying that the rigor of the review that is performed is less.

I used “PLoS ONE level peer review” as I was hoping it was a less judgmental statement, meaning review only for accuracy, not for significance. But apparently that was still too leading a term. Would “PLoS ONE style peer review” have been preferable?

What would you suggest instead? There must be some short and easy way to get across an accurate description of what a journal like PLoS ONE does in its review process. I don’t want to have to write an entire paragraph whenever it’s mentioned and I think we get bogged down in these sorts of semantic issues which distract from the point being made.

How about “PLoS ONE style editorial criteria”? The difference at PLoS ONE is not the peer review itself (which is done by peers who look for rigor & validity, submit their reviews to a volunteer editor who processes them, etc.), but the editorial decision process. Because “PLoS” is a loaded term for those on both ends of the PLoS-Love-o-Meter, something that removes the publisher entirely might be better. “Science-only editorial criteria” (as opposed to “science and perceived future impact editorial criteria”) is one possibility, but other folks may have more succinct terms. In my experience as a volunteer editor for PLoS ONE, the reviews that are submitted for that journal are no different from those submitted to “conventional” journals.

At risk of diverting the main discussion any more, terms like “peer review lite” are often taken as a pejorative because they’re often used by individuals who *mean* it as a pejorative. The cutesy spelling for “lite” certainly doesn’t help!

Better, but still clumsier than I want. There’s something linguistically confusing when describing the peer review process by referring to editorial criteria behind it. It would work for a statement like, “F1000 Research will feature post-publication peer review employing PLoS ONE style editorial criteria,” but might be harder when just describing the process itself. The phrase doesn’t really explain what’s different about PLoS ONE’s criteria, just that it has criteria (“PLoS ONE uses a peer review system based on PLoS ONE style editorial criteria.”).

I do agree that the difference is in the criteria used for the review, but I think that does affect the process (reviewing for A and B is a different process than reviewing for A, B and C). But the rigor applied individually to A, B and C should not vary.

I do take your point that some of the terms used were indeed meant to slight, but I think we’ve reached a point where the sensitivity levels are so high that one can’t even neutrally mention the difference without being accused of attacking the concept.

David, you’re right that it’s not easy to come up with a better term! You are right that PLoS ONE uses fewer criteria in their review process. I suppose the key point here is that it’s “fewer” rather than “less” — i.e., a difference in kind rather than in degree.

“Would ‘PLoS ONE style peer review’ have been preferable?”

I think that is the least-bad term I have heard so far for this, yes. I quite like Andy’s “PLoS”-less attempt, “Science-only editorial criteria”, but it’s hard to imagine that not being misinterpreted, in turn, as a slight on the traditional approach.

I have a hard time with “Science-only” because I think impact and significance are important parts of science. I think we can all agree that some science is better than others, some experiments more important to a field than others. Maybe something more along the lines of “validity-only” or “accuracy-only”. But I suspect those aren’t the right words either.

It just now occurs to me that there is another problem with “science-only”: it wouldn’t apply to arts and humanities journals that used an equivalent model that reviews for soundness and quality but not for likely impact.

Nor would it apply to the social sciences, where Sage Open already has borrowed from the PLoS ONE model. Maybe the term “reliability” would be a reasonable short-hand way of describing the PLoS ONE approach to peer reviewing? It would cover both methodological soundness and accuracy, would it not?

“PLOS ONE style” would not do because it refers to one journal: other journals were operating this kind of peer review before PLOS ONE, and yet others have done so since.

One commonly used phrase to distinguish this type of peer review is “technical peer review” (ie the journal will publish if the paper is judged technically correct by the peer reviewer, with no criterion of “impact”). However, this can allow plagiarism (don’t ask!), so one has to also have the concept of novel.
So the best I can think of is “technical peer review of a novel finding”.

In her book titled “Planned Obsolescence” (NYU, 2011), which she posted in draft form for pre-publication peer review on MediaCommons, Kathleen Fitzpatrick tackles the question of incentivizing post-publication peer review and notes that the success of such a system depends on “prioritizing members’ work on behalf of the community” (p. 43). Reputation in such a system, she notes, depends on reviewing the reviewers, and one’s ability to publish would be based on how “helpful” one is in participating in group discussion. She admits that making such a system work would “require a phenomenal amount of labor,” but if “reviewing were a prerequisite for publishing, we’d likely see more scholars become better reviewers, which would in turn allow for a greater diversity of opinion and a greater distribution of the labor involved” (p. 46). Her book also contains some cogent criticisms of traditional peer review. I recommend it to anyone who thinks that it is time to develop some alternative to traditional peer review that might better match the needs of scholars in a digital age.

Coincidentally I received today a notice of the 7th international congress on peer review and biomedical publication. Anyone interested in these issues should consult the reports from previous congresses which cover in detail the broad range of questions that come up whenever we start thinking about the pros and cons of traditional peer review and how we might develop useful innovations. Links can be found here: http://www.ama-assn.org/public/peer/peerhome.htm

“The reviewers are not pre-qualified as true peers.”

Even ‘qualified’ reviewers may not be true peers of the authors in terms of expertise in ‘classical’ peer review — as shown by the frequency of irrelevant comments reviewers may offer and outright errors they may make. The weakest aspects of ‘classical’ peer review may be the process used to select reviewers, and unfounded assumptions about their ability to provide useful feedback.

“It is simply unrealistic to expect informed, well-argued opinions from those who have not been specifically tasked with the job of supplying them.”

‘Classical’ peer reviewers often fail to provide ‘well-argued opinions’.

It may not be helpful to dichotomize potential reviewers as “pre-qualified” versus “unqualified”. Why be restrictive about who is ‘qualified’ or not to comment on anything in a published article? The more readers and commentators (regardless of their CVs), the better the chances that flaws will be detected. Whether it’s termed ‘post-publication peer review’ or not doesn’t seem too relevant to the ultimate purpose of allowing readers to provide feedback that may identify particular merits or issues.

“Today’s commentators seem to have many axes to grind.”

So do some ‘classical’ peer reviewers! ‘Spontaneous’ post-publication commentators may have information relevant to readers’ understanding of the evidence that was published. Conflicts of interest and potential biases missed during ‘classical’ peer review have been identified this way. If a commentator’s reaction seems to be based on personal issues, readers will notice and disregard the comment if they wish.

The Journal of Participatory Medicine published a debate on peer review and reputation systems in 2010, and the list of Commentaries here http://www.jopm.org/category/opinion/commentary/ provides links to “Peer review and reputation systems: A discussion” as well as other articles about the nature of participatory peer review. (Conflict of interest: I’m the author of one of them.)

To treat reviewers as peers, they must be held accountable for their comments. The authors don’t know the reviewers’ identities, but editors certainly do. In our journal, every review is rated, and the reviewers get a thank-you note explaining our decision, so they can compare what they found to what others found.

Editors must not hesitate to rebuke reviewers who fail in their commitments or who offer inappropriate and unprofessional comments. This is what keeps the system honest.

I disagree.

Yes, you make a good case that online comments are far from perfect as a form of post-publication peer review, and I agree with all of that. But there is an assumption that these problems are absent in traditional peer review. That assumption seems to be based on weak or non-existent evidence.

Peer review is a deeply flawed process. We stick to it, in the same way as we stick to democracy as a way of running a country, because it appears to be better than the alternatives. But let’s not forget that it has a great many problems.

The problems with online comments are doubtless not the same as the problems of traditional peer review. But are they any worse? I don’t see anything to convince me of that.

The point of the post was to analyze why comments aren’t the same as peer review, and shouldn’t be conflated with it. Your observation that each has problems, but different problems, validates the main point — they aren’t the same.
