Can a scientific paper be methodologically sound, but just not report any significant or meaningful results?

The answer to this question seems rather obvious.  But before accepting what would appear to be a truism in science, I’d like to explore what “methodologically sound” science (and its variants) means and what it implies.


In recent years, a number of publishers have launched open access journals designed to reside at the bottom of their peer-review cascade.  These journals will accept papers that may not report novel results, as long as they employ a sound methodology.

Manuscripts considered for acceptance in PLoS ONE, for example, are not required to advance one’s field, but are required to be “technically sound.”  The scope of BMC Research Notes is exceptionally broad, requiring little beyond that a paper is “scientifically sound.”  And BMJ Open’s criterion for acceptance is somewhat more positively worded, although still conspicuously vague, requiring that studies be “well-conducted.”

These acceptance criteria wouldn’t be so contentious if they were viewed only in isolation, as a way to promote the values of the journal.  But they are not.  They are often used to denounce mainstream journals and are clearly dismissive of those who decide the fate of manuscripts.  This perspective is best expressed on the information page for PLoS ONE:

Too often a journal’s decision to publish a paper is dominated by what the Editor/s think is interesting and will gain greater readership — both of which are subjective judgments and lead to decisions which are frustrating and delay the publication of your work. PLoS ONE will rigorously peer-review your submissions and publish all papers that are judged to be technically sound. Judgments about the importance of any particular paper are then made after publication by the readership (who are the most qualified to determine what is of interest to them).

What makes a paper “technically sound” is much more nuanced and much less clear than it appears.  In fact, you will not find a discussion of what makes a methodology sound in any methods textbook.  So I’ve asked several of my colleagues (in the social, biomedical, and information sciences; two of whom teach methods courses) what “sound methodology” means to them.  According to these researchers, a paper may be sound if:

  • it uses techniques that are appropriate to the question asked
  • it does what it purports to do — in other words, if the researchers claimed they ran a Western blot, there must be some evidence that it was conducted, like an image of the gel
  • it treats its data correctly and runs the appropriate analysis
  • its conclusions don’t overstate its results

Three of my colleagues provided much broader, gestalt-like answers:

  • “It’s complicated.”
  • “You have to look at the entire paper.”
  • “It all depends upon the context. You can’t be expected to run real-time PCR in the jungle.”
  • “Appropriate methodology is what your community accepts as appropriate methodology.”

Judging from these answers, evaluating methodology is not a binary decision — right or wrong, sound or unsound — but one that requires context specific to a field.  No method is perfect or ideal, although some are certainly more appropriate than others.  And making that decision requires expertise, which is the very raison d’être of editorial and peer review.

And this is why I have a problem with coupling the word “sound” with methodology, technique, or science.

The word “sound” implies that something is safe, strong, and secure, like the foundation of a building, the very structure upon which a whole edifice is built.  Sound foundations are solid, stand firm, and resist direct attacks, while weak foundations crumble over time or cannot withstand the assault of a competing theory or contradictory piece of evidence.

Presidents make frequent use of the “sound foundation” metaphor when talking about the economy during recessions because it gives people hope that, when the building appears to be crumbling — lost jobs, high unemployment, stagnation or deflation — a new economy can be rebuilt upon a strong foundation.

“Sound” also implies that something is healthy and vibrant — science that spawns new hypotheses and directions for further research.  Unsound research is weak, lacks fitness, and is unable to thrive.

Neither of these interpretations of “sound” can be applied to scientific methodology.  Articles reporting negative, confirmatory, or ambiguous results don’t get challenged.  They sit vacant, left to crumble and decay with the passage of time.  Nor is the sound-as-health interpretation a valid comparison: only articles challenging established dogma or reporting, at minimum, positive results are capable of spawning new hypotheses and advancing science.

In sum, the connection made between “sound” and “methodology” creates mental frames that simply do not coincide with how researchers actually evaluate methodology.

But there is more that is bothersome.  If we accept the “sound methodology” metaphor, the only difference between articles published in top journals and those appearing in archival journals is, to paraphrase PLoS ONE, what an editor thought was interesting and would attract readers.  Or, to quote PLoS ONE’s community organizer, Bora Zivkovic, during one of his regular public rants:

When they say “quality science” they don’t mean “well-done science” like we do, they mean “novel, exciting, sexy, Earth-shaking, mind-boggling, paradigm-shifting science”, i.e., the kind that gets published in GlamourMagz. Then they do a circular logic trick: what is published in GlamourMagz is good science. When they say “peer-review” they don’t mean checking the quality of research, they mean keeping non-sexy research out. When they say “selective” they mean “elitist” and “keeping the non-sexy proletarian science out”

Rationalizations like this may help rally the troops or provide some solace for a rejected author, but they do a disservice to science by promoting an unrealistic view of the scientific method and a corrupted public image of the editorial and peer-review process.

“Sound methodology” suggests an ideal match to a scientific question that never quite exists in empirical science.  For all that the phrase implies, it should be replaced with something more accurate, like “appropriate” or “persuasive” methodology.  Granted, neither term connotes the same trust and confidence as the word “sound,” but both describe the process more accurately and honestly.

Phil Davis

Phil Davis is a publishing consultant specializing in the statistical analysis of citation, readership, publication and survey data. He has a Ph.D. in science communication from Cornell University (2010), extensive experience as a science librarian (1995-2006) and was trained as a life scientist. https://phil-davis.com/

Discussion

34 Thoughts on "Can Open Access Journals Guarantee Sound Methods?"

This speaks to well-intentioned, but ultimately futile, efforts to democratize science. The prevailing Northern California post-hippy philosophical bent of the internet assumes all content and all writers are equally valid. But this is not the case in areas where expertise matters. It’s what I find so frustrating about Wikipedia, as one example of an area where expertise is actively shunned in favor of pedantry.

Science is, by its very nature, a meritocracy. Some science is better than other science, more important, more meaningful. In an age of information overload, it makes no sense to cut out valuable, time-saving filtering mechanisms just because you’re anti-commerce, anti-corporate.

Personally, I think the editors of “GlamourMagz” (many of which have top researchers on their editorial boards) do a very good job of filtering. If everyone is equal on the Internet, why are their opinions any less valuable than those of the masses?

As a regular paper reviewer, I generally did as good (or as bad!) a job of reviewing open access papers, as for those which were not open access. I’ve generally now stopped reviewing for (and publishing in) journals which I can’t read freely.

Science may be meritocratic, but who should judge the merit? Often it requires history to make this judgement, but this can only happen for knowledge that is available. Having a pre-selection based on an editor’s opinions is not sensible, to my mind.

The original article also makes a mistake in conflating open access with “the bottom of the peer review cascade”. It’s just not true; PLoS also publishes journals which do not have the same criteria as PLoS ONE. The article is entertaining, but ultimately it’s attacking a strawman.

Phillip,
If you read the cited post, Cascading Peer-Review, you will note that I do not make a categorical argument, but point out a general trend among new open access journals launched by traditional subscription-access publishers. Indeed, I also point out an exception, mBio, which does not view itself at the end of a peer-review cascade, but at the beginning.

This is hardly a strawman argument.

The issues of selective review and access are entirely separable (see PLoS Biology, PLoS Medicine and many other selective journals that offer authors an open access option).

Who should judge the merit? Everyone. That means that pre-selection by a highly qualified panel of peer reviewers and an experienced editor are certainly valid means of filtering and judging. I’m still not sure why random readers are somehow considered better judges than these qualified experts. Regardless, having one does not eliminate the other. PLoS Biology has editors and peer reviewers who do careful selection based on significance, then they allow readers to weigh in with article level metrics, comments, etc.

Aren’t more filters better than fewer?

It depends on the filters, and how those filters are used.

I don’t think anyone would say NCS don’t publish great research; it’s that they publish a tiny percentage of the great research, and the criteria that determine which tiny percent should not be identified with scientific quality.

Also, the “random readers” you refer to are the people who use the research in their work, and who will ultimately cite (or not) the paper. Essentially, they are a much larger sample of the peer reviewer pool used at the first filtration stage. While it is technically possible for my Aunt Trudy to comment, given the volume of published science, can you really imagine anyone without a fairly deep interest in the subject matter doing so?

Why shouldn’t the criteria used for choosing material to be published in a selective journal be identified with quality? It shouldn’t be the only measure of quality, but I’ve yet to hear an argument against using experts to filter submissions for ones they think are significant. We use these same methods to determine potential and quality in grant applications, in selecting speakers for meetings, in selecting recipients for scientific awards. Should all of the above instead be put to a popular vote? Do popularity contests always select for the highest quality winner?

Are you also suggesting that only researchers who will cite (or decide not to cite) a given paper are qualified to judge its significance? Are scientists really that limited in the scope of their knowledge? Aren’t the peer reviewers used by journals to make these decisions “the people who use the research in their work, and who will ultimately cite (or not) the paper”? How does having a small panel of experts making an initial judgment on a paper prevent the larger pool from making their own judgments further down the line? Why must we eliminate one of these methods, rather than combining the two for even better assessments?

How would you increase the size of the pool making those sorts of judgments but also ensure that you’re ruling out the likes of your Aunt Trudy from participating? You may think it’s unlikely that she’d like to have her voice heard, but let’s imagine she’s a staunch, religious supporter of teaching Creationism in school and the paper in question discusses evolution. Should she get to have her say?

Because those journals have strong considerations that are not related to quality.

Perhaps it depends on how one defines “quality”. And let’s be frank, PLoS likes a flashy paper that brings in publicity as much as anyone else. I don’t see them shying away from issuing press releases.

I think you are off the mark, Phil. There is a clear distinction to be made between importance and simple soundness. We even have a way of measuring importance (after the fact), which is citations. And the top journals clearly compete for the most important papers. There is nothing wrong with also publishing the less important papers, because any may be important to somebody, some day.

Nor is soundness the problem you suggest. The term derives from logic, which you seem to have omitted in your metaphorical excursions. In deductive logic it means the conclusion follows from the premises and the premises are true.

In inductive logic, which is what science mostly uses, soundness is indeed more complex. For example it includes all of statistical reasoning. But the folks you are attacking use peer review to assess soundness. Sounds right to me.

David,
I’m not going to speculate on whether publishers are using “sound” in a deductive or inductive way. Indeed, I argue that there are better ways to express this idea without risking ambiguity. In my last paragraph, I propose “appropriate” and “persuasive” as alternatives.

However, my main point in the post was this:

Adopting the phrase “sound methodology” allows one to go beyond stating the selection process of the journal. It is clear (and I cite both PLoS ONE and Bora Zivkovic as examples) that the phrase is being used for other purposes.

You can’t do that with the phrase “appropriate (or persuasive) methodology.”

Phil, perhaps I have misunderstood your point. I thought you were arguing that there should not be journals that publish papers solely on the grounds that the methods are sound (or appropriate, if you like). Now it sounds like yours is merely a semantic point, that is you prefer the term “appropriate” to “sound” on psychological grounds.

Yes, I was making merely a semantic argument, although “merely” may understate the importance of language in science.

I have no problems with journals that publish articles that are not “interesting.” But I do have a problem with calling these articles “technically sound” because of what that phrase implies.

The latest article by Harnad and others published in PLoS ONE hardly stands up to methodological scrutiny, as I pointed out here. It IS interesting, however, but far from “technically sound.”

The semantic issue is worth pursuing, but your title and other lines raise substantive issues that are very different. You even talk about corruption!

As for Harnad, keep in mind that soundness is itself controversial in specific contexts. Also, “appropriate” can be even more praising than “sound.”

I’m an academic editor at PLoS ONE. When I ask people to review papers, they are the same people I would ask to review them if the paper were going to a more specialized or higher-IF journal. They are leaders or active researchers in the relevant field. Peer reviewers, in my experience and in discussion with colleagues, do not review things differently according to journal UNTIL AND UNLESS they are asked to specifically say whether they think it is “appropriate” for that journal (sometimes this is purely the editor’s call).

In at least three cases, I have asked someone to review a paper that they had already reviewed (and rejected) elsewhere. In none of these cases did they “relax” their criteria because it was now under consideration at PLoS ONE. If anything, they were angry if the authors had not addressed their prior concerns, and became more exacting.

Since the potential reviewer pools for a paper are essentially identical regardless of journal, your scenario assumes a premise in which a given reviewer would evaluate experiments (methods, presentation, interpretation) differently according to which journal asked them to review it. In my experience, this is simply not true (except as noted above); I’d be interested if you have evidence that it is.

Miko,
If you look at PLoS’s guidelines for review, they are clearly different depending on the journal. PLoS Biology selects for originality, importance and general interest. PLoS ONE does not. I’ve copied their guidelines below:

PLoS Biology:

“To be considered for publication in PLoS Biology, any given manuscript must be exceptional in the following ways:

* Originality
* Importance to researchers in its field
* Interest to scientists outside the field
* Rigorous methodology and substantial evidence for its conclusions”

PLoS ONE:

“To be accepted for publication in PLoS ONE, research articles must satisfy the following criteria:

1. The study presents the results of primary scientific research.
2. Results reported have not been published elsewhere.
3. Experiments, statistics, and other analyses are performed to a high technical standard and are described in sufficient detail.
4. Conclusions are presented in an appropriate fashion and are supported by the data.
5. The article is presented in an intelligible fashion and is written in standard English.
6. The research meets all applicable standards for the ethics of experimentation and research integrity.
7. The article adheres to appropriate reporting guidelines (e.g. CONSORT, MIAME, STROBE, EQUATOR) and community standards for data availability.”

Yup, but that doesn’t change what I said. Peer reviewers do not evaluate experimental evidence differently. You notice that some of PLoS Biology’s criteria are not related to experimental “quality,” right?

Miko,
I have provided you with very different reviewing guidelines and selection criteria for PLoS journals. I do not understand what you mean by “experimental ‘quality’”. Could you define it first?

Your question is whether open-access can guarantee “sound methods.” My point is that “sound methods” are guaranteed (or not) by a component of peer review that does not vary according to publishing model.

Editorial criteria such as perceived importance, relevance to the journal’s field, priority, breadth of interest in the conclusions… these are different for different journals, but they have nothing to do with publishing model and nothing to do with how experts in a field will critique experimental results.

I am wondering what relationship you think exists between publishing model and reviewers’ ability to vet experimental data.

Miko,
I think you missed the point of my argument. I argue that the phrase “sound method” is a misnomer that allows one to draw conclusions that are both disingenuous and harmful to science. I do this by starting with a claim, then unpacking that claim to test its validity.

To answer your last question, there is no relationship between publishing models and reviewers’ ability to vet experimental data.

The author/funder-pays model has allowed for the success of journals like PLoS ONE or BMC Research Notes that would not have been possible under a reader/subscriber-pays model, simply because there is a limited (and shrinking) market willing to pay for journal content like this.

Sorry, I don’t think that to working scientists (and peer reviewers) the concept of “technically sound” is particularly confusing or ambiguous. The fact that it is context- or field-dependent doesn’t make it so, nor does your reading more into the word “sound” than any of its practitioners do.

To the extent that there is variation in understanding what this means, it will necessarily affect all journals equally, since all seek to publish only technically sound work (within the scope of their editorial criteria). I’m trying to understand why you think open access or funding structure affects this in any way.

As I read Phil’s post, the reason that OA journals are the subject is perfectly clear — they have acceptance criteria that they proudly state as different from those of traditional journals. So OA journals themselves draw the distinction.

The variation Phil found interviewing practitioners of science indicates that the idea of “soundness” is variable, is linked by many to the importance of the research, and goes beyond “appropriate.” Therefore, to promise “sound” methodology is more loaded than people realize.

Their criteria are different in that they are focused. They are explaining (and listing quite clearly) which criteria they don’t use. I think it is difficult to read into this anything suggesting that the way in which they determine technical soundness differs from any other journal.

I am indeed questioning the assertion that a given reviewer’s determination of technical soundness will vary by publication (I fully agree it varies by subdiscipline and is related to the conclusions the authors draw). Contrary to what you state, none of the anecdotes presented above as evidence for variation in the usage of the term are related to publication venue or perceived importance, and my personal experience as a reviewer and editor suggests to me strongly that the assertion is not true.

Miko may be interested to know that in university press book publishing it does happen that reviewers will recommend manuscripts differently according to the “prestige” of the press for which they are conducting the evaluation. This has more to do with perceived “importance,” however, than it does with anything that might be construed as the work’s methodology. If a work is flawed methodologically, that will count against it when being evaluated for any publisher. But what is considered “sound” varies greatly by academic orientation: what would be “sound” to a postmodernist literary critic would not be so to a more traditional textual critic.

Sandy, that’s my point exactly, and it’s true of journals too in cases where reviewers are asked to use the criterion of perceived importance (e.g., all high-IF journals, even open access ones). My point is that whether or not they are asked to use “impact” as a criterion has no effect on their assessment of methodological soundness. (Although I have a suspicion, but no evidence, that when asked to ignore concerns about “impact,” reviewers are sometimes more thorough in interrogating experimental methods and results, as they cannot rely on subjective criteria to justify their recommendation.)

Furthermore, as you note, “soundness” may vary by disciplinary conventions, but the pool of potential reviewers is determined by discipline, not publication venue.

I am an editor who has worked both for one of the ‘glamourmagz’ and for a more specialist journal that accepts work in a specific subject area. My experience suggests that, while the pool of reviewers for any particular topic may be the same for both types of journals, not all reviewers in a particular field are equal, and the ‘glamourmag’ will often be able to secure the more experienced, higher quality reviewers from a particular field.

The reason for this is simple: the papers have already been considered by an editor, and only those which are considered highly interesting are sent out to review. The reviewers, who are undoubtedly busy, seem more willing to accept a review task for a paper which reports significant results rather than for the larger number of papers that simply report ‘sound’ or ‘appropriate’ methodology.

There are obviously far more papers that report appropriate methodology than there are that report significant results, and it’s clear which ones reviewers would prefer to spend their time on.

That’s a good point, Steve, but I know of no data confirming this. You also assume that “more experienced” makes for better reviews. Contra this, an article in PLoS Medicine showed that academic rank does not predict review quality (in medical research). In my experience, postdocs or junior faculty are the most thorough reviewers (not burned out, fewer axes to grind). I think we can all think of intuitive reasons why being the busy head of a large lab with lots of travel and administrative duties might not correlate with thorough review (though the subjective assessment of impact/importance might be more consistent with journal expectations).

That said, I have never had trouble getting prominent figures in the relevant field to review for me at PLoS ONE (you’ll notice there are many on the editorial board as well). Not sure if this is generalizable for other lower-IF journals or other fields, or maybe PLoS ONE still enjoys special enthusiasm from the community.

We do keep track of who reviews papers, with editor ratings. Someday maybe someone will do an analysis of H-index versus review quality; it would be interesting.

I hadn’t meant to give the impression that by more experienced I necessarily meant prominent ‘big names’ – I agree that postdocs and junior faculty often provide great reviews.

But any editor will quickly realize who are the best reviewers in a field based on the usefulness of the reports they provide. Those reviewers will find themselves in demand from many different journals. And when they have so many requests they will likely agree to review only the papers that they really find interesting.

Phil, I am still wondering what is going on with the title of this post. It implies a connection between editorial criteria (or review quality) and open access. As you pointed out, PLoS Biology and PLoS ONE have very different criteria, as do all the BMC journals. The title of the post is provocative (and aspersive) regarding OA, but then there is nothing about the implied relationship in the text, just a problematization of the wording of one OA journal’s editorial criteria.

Miko, I’m sorry you interpreted the title that way. It was not intended to be “aspersive,” although I would admit to being “provocative.”

My main point is that the phrase “technically sound” (and its variants) is problematic. This phrase, however, has become the acceptance criterion for several new open access journals, so there is a relationship, but not the kind of categorical relationship it may have implied to some readers.

A more technically correct title (e.g., “Polysemy in Manuscript Selection Criteria in Scientific Journals”) would have been systematically ignored.
