Editor’s Note: Today’s post is from Micah Altman, MIT & Philip Cohen, U.MD. Micah is a social and information scientist at MIT’s Center for Research in Equitable and Open Scholarship. Philip is Professor of Sociology at the University of Maryland, College Park and the director of SocArXiv.
Under pressure from both researchers and consumers of research, the practice of peer review is changing and new models are proliferating. Recently, the International Association of Scientific, Technical and Medical Publishers (STM), a major academic publishing association (representing academic societies, commercial publishers, and scholarly publishing organizations), has made an effort to categorize these disparate models, which resulted in a draft from their Working Group on Peer Review Taxonomy. This comment emerged as a response to their call for feedback on that proposal.
Policymakers and the public at large have pressing needs for new scientific evidence to support timely decisions. The COVID-19 pandemic has dramatically increased pressure on scientists and scholarly publishers to produce and communicate emerging research even as budgets rapidly constrict. In response to this pressure, many scientific results are being disseminated through accelerated channels as preprints and working papers, or through accelerated review or invited symposia conducted by established scholarly journals. The review processes employed across and within these venues vary substantially, for example in terms of who conducts the review, the time elapsed, what text and materials are reviewed, and the criteria employed.
To make timely evidence-based decisions, scientists and non-scientists alike need to be able to understand how an emerging result has been vetted — whether that result is disseminated in the form of a preprint, working-paper, news article, or journal article. Information about the vetting process for a specific publication or general publishing venue remains difficult to find, daunting to compare at scale, and impossible for anyone (with the possible exception of a journal’s editorial board) to fully assess. Both policymakers and the scientific community need better views into practices for vetting scientific communication, and into how these are evolving. Obtaining a better view will require systematic, comparable, and scalable methods of describing peer review. This need has notably inspired the development of the Transpose database of journal policies on open peer review, co-reviewing, and preprinting; and the Doc Maps framework for representing the review and editorial process underlying research objects.
In this light it is a promising development that STM produced a draft taxonomy for peer review. The research community needs to make peer review — and how the function of peer review is communicated — more systematic, nuanced, and standardized. Formal metadata such as taxonomies can advance the state of research and practice. We offer these comments to help make the proposed taxonomy more useful to stakeholders who do not work directly for publishers by including elements to support scalable journal transparency; article-level evaluation; and strategic investment in scholarly infrastructure.
If the purpose of peer review is to make scholarship better, the purpose of a peer review taxonomy should be to promote better communication of scientific processes. Such communication should aid researcher-authors in selecting publication venues; support consumers of scientific communication in making evidence-based decisions; and guide funders to invest in better science.
During a period of reevaluation and change, the description of peer review processes should include both existing and emerging models. The proposed taxonomy focuses on the most established forms, which may reinforce barriers to innovation and create blind spots for policymakers and researchers. In order to meet today’s pressing need for timely policy decisions that are informed by scientific communication, what a particular system of peer review promises, how it is designed, and how it operates must be understandable and trustworthy.
Comments on the Proposed STM Peer Review Taxonomy
As motivation and background for the proposed taxonomy, the working group page states:
STM recognises a need for identifying and standardising definitions and terminology in (open) peer review practices. A Peer Review Taxonomy that is used across publishers will help make the peer review process for articles more transparentit [sic] will also enable the community to better assess and compare peer review practices between different journals.
The proposal itself, while providing some guidelines for application, does not communicate why a taxonomy is needed, nor how it will add value. Although the draft refers to “transparency,” “comparison,” and “assessment” as motivating goals, the role of a taxonomy in achieving these values is not stated. (The STM working group page also refers to a recorded presentation that provides further background; however, this recording has been deleted. We focus our comments on the text of the proposal itself.)
We suggest considering how the taxonomy would be used by critical stakeholders within scenarios drawn from current consensus reviews of the scholarly knowledge ecosystem, scholarly communication, and open science — as adapted to journal peer review:
- Strategic investments: Monitoring the health of scholarly communication systems. It is widely recognized that the ecosystem of scholarly knowledge is changing and that there is little systematic, reliable, or open data characterizing those changes, especially with respect to peer review. The absence of a common evidence base means new proposals for policies and interventions suffer from prolonged debate, as the costs of scholarly practices, norms of scholarly communities, and attitudes of scholars are often in dispute. Further, there is convincing evidence that scholarly processes and outputs have substantial bias and/or create barriers to inclusion, yet there are no common mechanisms for tracking these over time. Executive leaders in science publishing, funding, and research universities (and researchers in meta-science) have critical needs to assess the overall health of the peer review system, recognize trends over time, and identify emerging practices. An appropriately designed taxonomy (one that is flexible enough to reflect new practices and is used in data collection and summarization) could serve these needs by contributing to a strategic-level view of the functioning of peer review in the scholarly ecosystem.
- Journal transparency: Helping researchers and publishers make better publishing decisions. Researchers aiming to disseminate their work are confronted with an increasingly large and heterogeneous set of publication options that vary greatly in terms of their costs and burdens and in the value added by the review and publication processes. Aside from the ubiquitous Journal Impact Factor or other crude journal rankings, there is no standardized information that researchers can use to guide their decisions. A taxonomy that reflects the elements that most strongly relate to the added value of peer review, if adopted and used transparently by journals, would greatly aid researchers in making publication decisions. This approach could also clarify journal publishers’ missions and strategies.
- Publication evaluation: Improving outside evaluation of published scholarship. In the age of COVID-19, policymakers and other stakeholders outside of the academy are under pressure to incorporate emerging scientific evidence into timely decisions. The scope, extent, and difficulty of the review process are critically important factors in weighing newly published findings. Further, tenure committees and other evaluators of specific researchers need to evaluate their work not only for scientific soundness but also for novelty and potential impact. A taxonomy that reflects the scope, focus, extent, and selectivity of the review process would be of use to policymakers and evaluators of scholarship.
These uses are not exclusive — a taxonomy could satisfy all three, and probably should. Neither are they exhaustive — there may be other stakeholder uses that could be described and documented that would motivate additions to the taxonomy.
The design or evaluation of a taxonomy requires understanding to whom it is intended to be useful, for what purposes, and in what circumstances. After all, taxonomies are simply one kind of descriptive model, and as the old saw goes: all models are wrong, but some are useful.
The proposal draft would be clarified by an explicit, clear, and detailed discussion of how it would add value in these or other scenarios, what outcomes or performance indicators might be used to evaluate its success, and what ultimate goals these outcomes advance.
Aligning scope and goals
The proposed STM taxonomy states several restrictions in scope:
- Exclusion of the “scope” of review (e.g., “novelty”, “impact”, “rigor”), because the working group finds that the extant differences in approach are not sufficiently well-defined.
- Application to journal articles only (as opposed to other forms of publication or research materials) — because it is here the authors identify the greatest need for a taxonomy.
- Exclusion of post-publication review approaches (the “F1000 model”) — because this model is not widely used.
Notwithstanding the stated motivations, we are concerned that these exclusions would render the taxonomy substantially less useful.
- The scope of review is critical to understanding its added value. Excluding different approaches to this question undermines the potential for some strategic comparisons. For example, “non-selective” review models, exemplified by PLOS ONE and others, offer a different value proposition for the review process, and imply a different business model, but are still within the practical range of options considered by authors. Similarly, processes designed to combat publication processes biased toward successful results (the “file-drawer” problem) cannot be adequately described without reference to scope. It would be useful for both authors and publishers to be able to classify peer review processes according to the variety of criteria employed, including validity, importance, presentation, audience, and research transparency. The existing ambiguity in this area underscores the potential clarifying value of a taxonomy.
- Restriction to journal articles. This is relatively unproblematic if the only use of the taxonomy is to promote transparency for publishers of journal articles. However, exclusion of preprints renders the taxonomy much less useful for publication transparency. Many emerging results (i.e., earlier versions of the same work) are published as articles or preprints or both. The review processes applied to these works outside of journals are highly relevant components of the article publishing landscape.
- Post-publication review, as applied by F1000 and others, is a bridge between preprints and traditional peer reviewed journals which, even if holding a limited market share at present, represents the functional range of publishing possibilities under consideration by authors and publishers.
Within the current scope of the taxonomy (and also relevant to the broader scope we favor), the current proposal elides a number of elements of the peer review process that appear highly relevant to potential uses for the taxonomy. Since these areas have been discussed in other taxonomies and analyses of peer review we draw attention to their importance with respect to the goals of a peer review taxonomy without describing each element in detail.
- Reviewer selection is widely recognized as critical to quality and equity in the review process. Important details include who selects reviewers (editors, authors, community or independent organizations); what pool or community the reviewers represent (in terms of discipline or institutional roles); and whether specialized expertise (e.g., in the field of study, methodology used, domain of measurement) is required.
- Even in the case of articles published in journals, the range of evaluation varies. Reviewers may evaluate only the text of the main article (e.g., for significance or errors in the design), or also the supporting supplemental text or materials, replication code and data, and prior review comments if ported from another review process. The review of preregistration plans, whether as precursor to a submitted article or before the research is complete, is also within the range of evaluation considered by journals. Many preregistration interventions commit to publishing results independent of perceived impact (scope), based on the design of research and its successful execution (criteria), as evidenced by the pre-analysis plan, analysis, and (often) replication materials (range). Interventions such as preregistration plans demonstrate how scope, range of evaluation, and decision criteria, although separate facets, can be integrated in a way that substantially affects what one can reliably infer from peer review.
- The decision criteria for accepting or rejecting a publication have a substantial effect on publication outcomes, and have the potential to dominate any other factor. For example, reviews may be dispositive or advisory; articles may be accepted or rejected on the basis of a single review or journals may require a quorum of reviewers to respond; editors may have substantial power to “desk reject” or to invite articles and accept them despite negative reviews; articles may be accepted based on their design (e.g., based on the preregistration plan) or based on the specific results of the experiment.
A taxonomy that does not measure these aspects of the review process would be substantially less useful for the critical goals highlighted above. Authors currently often make decisions about where to publish, and readers determine what research to trust, based on an opaque mix of disciplinary tradition, reputation (which may be self-reinforcing), and inertia. For authors to make fully informed choices, for readers to identify important results, and for funders to decide what research to advance, we need a system for tracking emerging shifts in peer review across the community. In the absence of these components of the taxonomy, evaluation of peer review and publication processes will remain relegated to non-transparent considerations such as informal judgments or insider knowledge.
Practices and applications
The last section of the draft STM taxonomy provides recommendations for the deployment (“use”) of the taxonomy. This section focuses primarily on communication of both taxonomic categorization and additional information of the review process on “the article page.”
The proposal recommends that some types of information be provided with each article, in addition to the taxonomic classifications. Although no specific rationale is provided, we agree that much of the information recommended, such as the number of reviewers and the dates of submission/review/publication, is appropriate because it supports journal transparency, article transparency, or strategic decision making. Further, the counts and rates related to the critical milestones and characteristics of the review process at the journal level substantially support these same goals. We recommend that journal statistics be provided openly as a complement to discrete taxonomic classifications, particularly such information as is routinely collected by journals and demanded by editors, including statistics on journal submissions, acceptance, desk rejections, review dispositions, and information on the size and composition of the reviewer pool.
Moreover, machine-readability is vital to ensure full transparency, support strategic decisions, and enable use of this information for comparison or decision making at any practical scale. In particular, taxonomic classifications and accompanying information should be provided in a way that is findable, accessible, interoperable, and reusable by both humans and machines. The FAIR principles provide widely recognized general design principles that should guide the treatment of this information, or any other information intended to advance open, accessible or reliable science communication.
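To make the idea of machine-readable review information concrete, here is a minimal sketch in Python of what a single article-level record might look like. Everything in it is an assumption made for illustration: the class name, the field names, the vocabulary terms, the placeholder DOI, and the dates are hypothetical and are not drawn from the STM draft or from any existing standard.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class ReviewRecord:
    """Hypothetical article-level peer review record (illustrative only)."""
    article_id: str            # persistent identifier, so the record is findable
    scope: List[str]           # what the review assessed, e.g. validity, importance
    range_reviewed: List[str]  # which materials reviewers actually saw
    reviewer_selection: str    # who chose the reviewers
    decision_rule: str         # how reviews fed into the accept/reject decision
    reviewer_count: int
    submitted: str             # ISO 8601 dates keep the record interoperable
    accepted: str
    published: str


def to_json(record: ReviewRecord) -> str:
    """Serialize the record to JSON with stable key order for easy comparison."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def from_json(text: str) -> ReviewRecord:
    """Rebuild a record from its JSON form."""
    return ReviewRecord(**json.loads(text))


# Illustrative values only; the identifier is a placeholder, not a real DOI.
example = ReviewRecord(
    article_id="10.0000/example.0001",
    scope=["validity", "research transparency"],
    range_reviewed=["main text", "replication code and data"],
    reviewer_selection="editor",
    decision_rule="advisory",
    reviewer_count=2,
    submitted="2020-06-01",
    accepted="2020-08-15",
    published="2020-09-01",
)

if __name__ == "__main__":
    print(to_json(example))
```

A record of this kind could be published alongside each article and harvested in bulk, which is what would allow comparison across journals at scale. A real deployment would need a shared, versioned vocabulary for each field rather than the free-text values used here.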
During a period of reevaluation and change, the description of peer review processes should include both existing and emerging models. Designing or reforming a peer review process involves managing multiple goals such as timeliness, effort, equity and inclusion to achieve increased reliability in assessments of novelty, importance of results, rigor and other factors for targeted components of the scientific process. There is no ideal form of peer review in the abstract — whether particular tradeoffs are justified depends on the broader context in which that review is carried out. The proposed taxonomy focuses on the most established forms, which may reinforce barriers to innovation and create blind spots for policy-makers and researchers.
In order to meet today’s pressing need for timely policy decisions that are informed by scientific communication, what a particular system of peer review promises, how it is designed, and how it operates must be understandable and trustworthy. This requires clearly describing the scope of the vetting process; the range of content to which the process is applied; what reviewers and authors are permitted to know about each other and how they are allowed to communicate; how acceptance decisions are made; how reviewers are selected; and basic statistics about submissions, acceptance, desk rejections, and review dispositions. And it requires making this information transparently available in a FAIR way.