Reproducibility of published research results is increasingly coming into question. Efforts from funding agencies and publishers are calling for greater transparency around research data, but so far, little attention seems to have been paid to a crucial aspect of experimental reproducibility: publication of detailed methodologies. A new NIH-led effort offers a first step toward correcting this oversight.
While access to research data is valuable for potential reuse and extension of research results, much of the stated emphasis for data policies has been on improving the reproducibility of the research behind the data. As a recent PLOS blog post puts it:
Availability of the data underlying a published study is probably the most significant way in which journals can, now, ensure reproducibility of the published literature.
I’m not sure I agree. Being able to review the data does indeed allow one to see whether a researcher’s analysis and conclusions are accurate for that dataset. But it does little to validate the quality and accuracy of the dataset itself. I can look at the gene expression data derived from your cell lines and see if it really shows the activity of the gene you claim it shows, but I can’t tell whether you really used the cell lines and conditions you claim you used.
If I have serious doubts about your conclusions, then your data is less valuable to me than your protocols: the detailed methodologies you followed to derive that data. I need to see if your approaches were methodologically sound. If I hope to replicate your results, I need to know exactly what you did.
This is more complicated than you might think. The smallest variations in technique or reagents can lead to major differences in results. The scant information offered by most journals’ Materials and Methods sections makes replication nearly impossible. Often, when describing a technique, an author will merely cite a previous paper where they used that technique, which in turn cites an earlier paper, which cites one earlier still, and the wild goose chase is on. Methodologies evolve over time, and even if you can track down the original source of a technique, it has likely changed a great deal over the years.
In the past, most journals required that data be made available, but only upon request. If funders and journals are now declaring that this is no longer acceptable, and that all data must be made public, then shouldn’t we hold researchers to the same standard when it comes to methodologies?
I will admit to some bias in this area. I spent years creating and acquiring laboratory manuals for a publisher, and then was one of the creators and the Editor-in-Chief of a biology methods journal. The mid-2000s saw a brief boom in methods journals, with CSH Protocols, Nature Protocols, Nature Methods and JOVE all appearing within a few years. Each of those journals appears to be alive and thriving to this day, so it’s perhaps surprising that the trend fizzled out. Sources of trusted protocols are more valuable than ever, particularly with the demise of the print laboratory manual.
I suspect part of the reason is that getting researchers to write up their protocols is a bit like pulling teeth (at worst) or running a review journal (at best). Most researchers don’t think to write up their methodologies. In my experience, this meant that many articles had to be commissioned, which is a lot more work for an editorial office than sitting back and letting the submissions roll in. With little career reward offered for the development and sharing of methods, most researchers weren’t willing to invest the time and effort needed to put together a detailed, well-vetted protocol.
Cut to 10 years later, and we’re now in the midst of a data journal boom, with announcements of new entrants coming regularly. Researchers, however, may be faced with the same dilemma. While it’s repeatedly suggested that credit be given for creating datasets that are reused, until that becomes a solid reality, the career incentives for doing the work to create data publications may not yet be there.
In June of this year, the NIH, along with the Nature Publishing Group and Science, held a workshop on rigor and reproducibility that resulted in a set of guidelines for reporting preclinical research. Many of the principles set forth are meant to improve both experimental design and the reporting of that design. This is tremendously important: as Francis Collins and Lawrence Tabak note, poor training in experimental design is a key factor contributing to problems with reproducibility. The NIH has pledged to develop a training module with an emphasis on experimental design (and for those who can’t wait, I highly recommend this book on the subject, with the disclosure that I was an editor on the first edition).
Other suggestions, such as setting high standards for statistical analysis and requiring careful identification of reagents, cell lines, animal strains, antibodies, and the like, will also go a long way toward enabling replication.
These sorts of efforts are aimed at the pre-publication stage and, if followed, will result in a better-quality literature, though one wonders how they will affect editorial costs and how the additional procedures will fit with journals that have deliberately pared down the peer review process. Once a paper is out, though, two suggestions are offered to help subsequent researchers replicate the results: data/material sharing, and expanding journals’ Materials and Methods sections to ensure accurate reporting of key information.
The Nature family of journals, for example, will be expanding their Methods sections to allow for more detail. While this is undoubtedly an improvement, I’d like to see both funders and journals go even further. If we require release of data, then we must also require full, detailed reports of the methodologies used to derive those data. These could be published separately and cited, deposited like data in a repository and linked, or simply included in supplementary materials.
Preclinical research seems an obvious area to start these sorts of requirements. But why stop there? Detailed methodologies would be of tremendous value across the spectrum of scientific research. The validity of many types of sociological studies, for example, depends greatly on how those studies were conducted. Why not offer all the gory details so readers can better judge whether the experiments were well conducted and whether the data is worth reusing? Beyond reproducibility, increased availability of trusted protocols would be a boon to scientific progress simply because more people would have more access to more techniques.
Lest you doubt the power of new techniques, go back and look at the last decade of Nobel Prizes in Chemistry and in Physiology or Medicine: the development of super-resolved fluorescence microscopy, the development of in vitro fertilization, the discovery and development of green fluorescent protein, the introduction of gene-specific modifications in mice, and RNAi.
That’s a significant number of prizes awarded for methodologies. If the Nobel Committee can retroactively recognize the impact of sharing powerful methodologies, why don’t we offer that same recognition in real time? While it’s great to see the data behind an important experiment, if I’m going to take the next steps (or even check whether that experiment was right), then I need to know how it was done.