The eLife Assessment: Can Significance be Separated from Accuracy?

David Crotty

eLife’s bold metamorphosis from a research journal to a preprint review service has now been going on for half a year, and at least from the data presented in this progress report, it seems to be forging ahead successfully. It’s good to see experimentation and boundary pushing in scholarly communication, though I suspect the real point of challenge will come in future years, as the effects of increasing publication numbers (all papers that are sent out for peer review are to be published in the journal rather than accepting some and rejecting some) come into play for future journal metrics like the Journal Impact Factor (JIF). JIF is a lagging metric, so today’s actions will see consequences in two years. Frankly, I’m a little surprised that there hasn’t been a significant increase in submissions to eLife given its current 7.7 JIF and new editorial processes that ensure publication for any manuscripts that pass initial editorial review.

[Image: Ryan Cat meme with text reading "Big If True". Adapted under CC BY license from Rob Bulmahn.]

Regardless, one of the more interesting aspects of the process has been watching the research community’s commentary as eLife upends what many traditionally expect from a journal. Authors are left in charge of the response to peer review suggestions, even to the point of choosing to ignore peer reviewers altogether and immediately declaring the paper published as a Version of Record (VoR), regardless of any flaws perceived. I was intrigued by a recent discussion on Bluesky (rapidly becoming my Twitter replacement of choice) around a recently published VoR paper. I don’t know the field in question well enough myself to offer judgment, but essentially the Bluesky commenter felt that the authors blithely dismissed peer review requests for basic experimental controls, calling the validity of the work into question.

eLife is not a megajournal that ignores questions of the significance of research results and only looks for accuracy. The eLife assessment summarizes both the significance of the findings and the strength of the evidence. To do so, the assessment uses a controlled vocabulary of terms for each aspect, ranging from “Landmark” to “Useful” for significance and from “Exceptional” to “Inadequate” for the strength of the evidence. Here, the VoR in question was declared by eLife’s editors to be “Fundamental”, the second highest ranking for significance (“findings that substantially advance our understanding of major research questions”), while at the same time the strength of the paper’s evidence was “Solid”, the fourth ranking out of six possible levels (“methods, data and analyses broadly support the claims with only minor weaknesses”).

The separation of the two concepts is where I struggle. To me it’s clear that one can separate accuracy from significance (did the authors do what they said they did, regardless of whether the result is meaningful?). But is it possible to separate significance from accuracy? Under eLife’s system, you could theoretically have a “Big, If True” paper labeled an absolute “Landmark” in the field (“findings with profound implications that are expected to have widespread influence”) that is also labeled as having “Inadequate” evidence (“methods, data and analyses do not support the primary claims”). This seems paradoxical: this theoretical paper would be an incredible foundational work that changes the field, but also a set of results that is probably not trustworthy. Can shoddy, unsupported work really be considered to be of such significance?

In my opinion, significance must encompass believability, or in the words of Carl Sagan, “Extraordinary claims require extraordinary evidence.” If eLife’s editors are going to weigh in on how important a paper is, then its credibility must be a factor they consider.

What Will the Collapse of the Humanities Mean for Scholarly Publishing Writ Large?

Karin Wulf

What will the collapse of humanities mean for scholarly publishing writ large, and more to the point when and how will scholarly publishers intervene to respond to the attacks on the humanities in higher education?

People use social media very differently. Some people want to (and expect others to) comment on the issues of the day, defined variably. I mostly comment about things that are in the 2nd quarter hierarchy of my professional, political (less often), and personal (rarely) concerns. I’m not inclined to hot takes anyway, and I can’t offer much useful perspective on most news. The top level professional issues are ones I deal with elsewhere, and I don’t often share much that’s personal other than pictures of my dog. That middle register is less appropriate though, as the crises in the humanities in the US and the UK in particular intensify and become more obviously and explicitly connected to politics. And it’s incumbent on all of us who participate in scholarly communications and are committed to research in the public interest to not just say more, but do more.

I’ve written a lot over the years here on The Scholarly Kitchen and elsewhere about the particularities of historical scholarship, humanities research infrastructure, and the ways that applying models and requirements affixed to STEM are not only inappropriate but undesirable – even dangerous. In a post in August about the Stanford president’s resignation in the wake of revelations and review about the integrity of his research, I wrote about how important it is that we – and by we I mean this entire community of scholars, librarians, publishers, and key infrastructure organizations, and I speak most knowledgeably about the US – understand our context in the post-WW2 waves of federal investment in science and the consequent expansion of the research university. That expansion has driven extraordinary breakthroughs, developments, and insights across disciplines and, albeit not without bias and some real disasters, has also provided extraordinary social benefits. This is as true for history as it is for biomed. The devaluing of the humanities is part of the same trajectory that has driven the intensive investment first in research more broadly, but now in science in particular. We are neither the same, STEM and the humanities, nor do we exist in wholly distinct environments. Once a flagship public university is cutting language and other core humanities programs, there is no argument for retaining any fields or programs on the basis of their inherent value to humanity.

For the moment, possibly only a brief one, there is a platform available for cross-disciplinary, full scholarly communications community commitment to the integrity of the whole of the research enterprise. We should stand on it.

A New Draft on Recommended Practice for Communication of Retractions, Removals, and Expressions of Concern

Todd Carpenter

This week, NISO released a draft version of the Communication of Retractions, Removals, and Expressions of Concern (CREC) Recommended Practice for public comment. This Alfred P. Sloan Foundation-funded project, led by UIUC professor Jodi Schneider and me, aims to build consensus and common practice within the publishing community to make the marking and signaling of article retraction status consistent. In an effort to reduce the inadvertent spread of retracted or potentially inaccurate papers, the project defines and describes how publishers and aggregators should mark online systems, and how metadata should be updated to ensure the trustworthiness of the scholarly record. A free public webinar on the project and its activities was held last week. The recording of the webinar is now available. The public comment period will run through the end of November, and all are encouraged to review the draft and provide suggestions on its improvement.

David Crotty

David Crotty is a Senior Consultant at Clarke & Esposito, a boutique management consulting firm focused on strategic issues related to professional and academic publishing and information services. Previously, David was the Editorial Director, Journals Policy for Oxford University Press. He oversaw journal policy across OUP’s journals program, drove technological innovation, and served as an information officer. David acquired and managed a suite of research society-owned journals with OUP, and before that was the Executive Editor for Cold Spring Harbor Laboratory Press, where he created and edited new science books and journals, along with serving as a journal Editor-in-Chief. He has served on the Board of Directors for the STM Association, the Society for Scholarly Publishing and CHOR, Inc., as well as The AAP-PSP Executive Council. David received his PhD in Genetics from Columbia University and did developmental neuroscience research at Caltech before moving from the bench to publishing.

Karin Wulf

Karin Wulf is the Beatrice and Julio Mario Santo Domingo Director and Librarian at the John Carter Brown Library and Professor of History, Brown University. She is a historian with a research specialty in family, gender and politics in eighteenth-century British America and has experience in non-profit humanities publishing.

Todd A Carpenter

Todd Carpenter is Executive Director of the National Information Standards Organization (NISO). He additionally serves in leadership roles at a variety of organizations, including as Chair of the ISO Technical Subcommittee on Identification & Description (ISO TC46/SC9), founding partner of the Coalition for Seamless Access, Past President of FORCE11, Treasurer of the Book Industry Study Group (BISG), and a Director of the Foundation of the Baltimore County Public Library. He also previously served as Treasurer of SSP.


1 Thought on "Smorgasbord: eLife and Significance vs. Accuracy, The Collapse of the Humanities, and a new NISO Draft on Retractions Standards"

David Crotty’s point that one cannot wholly separate significance from accuracy is hard to argue with, and I imagine that in practice eLife’s editors and reviewers would acknowledge the relationship and not give a Landmark significance to a paper that had Inadequate strength of support. But why not make that relationship formal? For example, an article rated Fundamental or Landmark could require at least Convincing strength of support.

When designing any system, one has to take into account what kind of behaviors the system will encourage. With eLife’s new editorial process, I guess we’ll find out.
