Don’t get me wrong. I do understand the value of peer review in scholarly publishing, but I still think we should end the human-dependent peer-review system and move to a completely AI-based one. The main reasons for this proposal are that human-dependent peer review is inequitable, suffers from injustice, and is potentially unsustainable. In last week’s Ask the Chefs piece on peer review, I shared a few examples; let me give some more below.

1) We say reviewers conduct peer review as an academic responsibility or to fulfil the expectations of the scholarly community. Despite peer reviewers’ devotion to the academy, I wonder how many of the 1,799 universities from 104 countries ranked in the THE World University Rankings 2023 consider voluntary peer-review service as an assessment criterion when appointing, promoting, offering tenure to, or annually evaluating their academic staff. Some universities do (e.g., Technical University of Munich, University of Warwick, and University of Pittsburgh). What about the others?

2) We are told that another motivation for peer-reviewers is to learn about the latest research before it gets published. It seems that millions of openly available preprints as well as research reports, post-graduate theses, posters, presentations, and conference proceedings are not enough to learn about new unpublished research.

3) Like many of you, I also see peer reviewing as a “good karma” activity — you help strangers (even if you know them, or at least their names) in your discipline to communicate their research better, and some other strangers will do the same favor for you when you submit your article to a journal. It has recently been reckoned that, if scientists would conduct at least one peer review per article they publish, that should ensure sustainable functioning of the peer-review system. But it isn’t so simple in reality. In my experience working on the editorial side of journals, recently-published authors regularly decline the same journals’ requests to review a relevant manuscript — reciprocation or good karma doesn’t happen as much as one might hope.

4) Further on reciprocation, if we expand the scope of scientific review by including research project proposal review, we may see another aspect. The logic of good karma won’t work here, since many reviewers are not eligible to apply or won’t apply for certain project funds. A personal story: after reviewing for a reputable research funder for a couple of years, I proposed to them that they involve me as a volunteer in their other strategic activities. I found it disappointing that my enthusiasm for wider collaboration wasn’t reciprocated after repeated requests, and I started finding my six-year engagement with their review-system-without-incentive kind of exploitative. So, when they recently requested that I review a document for them, I had to decline, and removed myself from their reviewer pool.

5) Speaking of exploitation, in 2020, about 22 million peer-review reports were produced, and an unknown portion of these stems from ‘peer review ghost-writing’. Peer review ghost-writing takes place when someone other than the invited reviewer does the actual review without being appropriately acknowledged. As a solution, the concept of the “co-reviewer” has been introduced, where an invited reviewer can bring in others to prepare a joint review. But to me, it is another pathway towards expanding the voluntary reviewer pool, thus expanding the scope of prevailing inequity.

6) Peer reviewers around the world in 2020 spent 130 million hours (equivalent to about 15,000 years) reviewing. The good thing is they are now getting recognized for their contribution through Clarivate’s Web of Science Reviewer Recognition Service, IOP Publishing’s Trusted Reviewer Certification, or Frontiers’ disclosure of the reviewers’ names on published articles, for example. But, again, what material benefits does peer review bring to the reviewers: discounted prices on the publishers’ books, free access to subscription journals, or discounted/waived article processing charges (APCs)? Recently, a blockchain-supported, non-tradable, non-monetizable token recognition system and “Voluntary Contribution Transaction Systems” have been proposed so that researchers can get tangible benefits from voluntary contributions such as peer review. But will such systems ever be seen in the real world?

7) More on recognition, the monetary value of the time spent by China-, UK- and US-based reviewers was around US $2.5 billion in 2020; we can only imagine the total global value. Some journals published by regional learned societies or institutions pay their local reviewers a nominal amount, which is sometimes fully charged to the authors. We may criticize paid peer review on the grounds that it adds undue bureaucracy and exorbitant costs, creates conflicts of interest, and weakens integrity. We may also say paid review, with its additional financial burden, will add further injustice, especially to the existing Global North-Global South divide in the scholarly ecosystem.

But we are not acting enough to reduce the geographical and economic inequity maintained by other financial instruments, such as the article processing charges (APCs) for open access, the use of which has been increasing exponentially. To publish one paper, authors on average pay close to US $2,500, which doesn’t include a reviewers’ honorarium. (A note on APC rates: if an article is, say, 5,000 words long, the authors have to pay on average US $2 to publish its first four words! Just for comparison, in today’s world, a person is called extremely poor if they survive a day on less than US $2.15.)
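The arithmetic behind the note on APC rates, and behind the hours-to-years conversion in point 6, can be checked in a few lines. This is only a rough sketch: the constants are the figures quoted in this post, the variable names are illustrative, and a year is assumed to be 365 days of 24 hours.

```python
# Figures quoted in the post; everything derived from them is a rough estimate.
AVERAGE_APC_USD = 2500          # "close to US $2,500" per paper
ARTICLE_WORDS = 5000            # illustrative article length used in the note
POVERTY_LINE_USD_PER_DAY = 2.15 # extreme-poverty threshold cited in the note
REVIEW_HOURS_2020 = 130_000_000 # hours spent reviewing in 2020 (point 6)

# Assumption: one year = 365 days of 24 hours (no leap years).
HOURS_PER_YEAR = 24 * 365

cost_per_word = AVERAGE_APC_USD / ARTICLE_WORDS
cost_first_four_words = 4 * cost_per_word
review_years = REVIEW_HOURS_2020 / HOURS_PER_YEAR

print(f"APC per word: US ${cost_per_word:.2f}")
print(f"APC for the first four words: US ${cost_first_four_words:.2f}")
print(f"Extreme-poverty line per day: US ${POVERTY_LINE_USD_PER_DAY:.2f}")
print(f"Reviewing time in 2020: about {review_years:,.0f} years")
```

Under these assumptions the first four words cost US $2, just below the daily poverty line, and 130 million hours comes to a little under 15,000 years, which matches the rounded figure in point 6.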

So, we have sufficient logic not to compensate peer reviewers financially (or in any other tradable manner), but at the same time, we seem fine with a system where some scholarly publishers can enjoy staggering profit margins exceeding those of the world’s top e-commerce and tech companies. We are also seemingly okay with the US $10-billion journal publishing industry capitalizing on the altruism of its peer reviewers. It doesn’t sound quite fair to me.

The only way to make the situation fairer is by ending the human-dependent review system. We should invest more in the AI-related components of the journal review system, and gradually move away from the current human-dependent one. The arguments against AI-dependent review are that AI can’t do critical thinking like humans and is based on algorithms that are frequently biased. We need to properly train AI to reduce the prevailing algorithmic limitations. Reviews of the existing AI-run article review software and models show that they achieve certain degrees of effectiveness and efficiency, but are not quite ready to replace human reviewers.

I therefore propose five phases to make the transition from 100 percent human-dependent review to a completely AI-based one. In the First Phase, where almost all journals fall now, we don’t use AI at all in the peer-review process; Elsevier’s AI policy, for example, categorically asks reviewers not to take the help of AI to conduct peer review. To be in the Second Phase, we must improve the performance of AI-based review systems so that they can make efficient initial quality checks, or support potential desk rejections, by quickly assessing whether a submission matches the journal’s scope; whether its overall structure, plagiarism level, language, and coherence among sections are sound; and whether it meets the basics of research ethics, integrity, transparency, and reproducibility. Many tools are already available to check these elements, but each separately. After surviving the initial AI screening, the manuscript then goes to human reviewers, and the editors make decisions based on the human reviewers’ comments and recommendations.

The following three phases require a huge technological leap. But, at the pace AI is currently progressing, there is a good chance that AI-dependent review tools will be available by the time we’re ready for them.

In the Third Phase of the progress, we train AI to assess the quality of human review reports and add complementary notes. Some models are already doing this, with some levels of consistency and accuracy. In this Phase, the authors respond to the human reviewers’ comments. And editors make decisions based on human reviewers’ comments and recommendations, and AI’s notes. In the Fourth Phase, journals engage AI as one of the reviewers of a manuscript, and the authors respond to both human and AI reviewers. Tools, such as ResearchAdvisor and a neural network-based one are currently available to serve as an ‘AI reviewer’, but all show varied levels of limitations, demanding more research in this sector. Nevertheless, once we are in this Phase, at least one human reviewer and the AI reviewer should check a revised manuscript. The editors make decisions based on all reviewers’ comments and recommendations. The Fifth Phase of the evolution happens when AI becomes the sole reviewer of a submission, authors respond to its comments, and AI also comments on the revised manuscript. Editors do the final reading of the revised manuscript considering the AI’s final comments, and make their decisions.
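The five phases can be summarised as a small lookup table of what editors base their decisions on in each phase. The sketch below is illustrative only: the enum names, the input labels, and the helper function are hypothetical shorthand for the descriptions above, not part of any existing tool or journal system.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Hypothetical labels for the five proposed phases."""
    HUMAN_ONLY = 1           # no AI anywhere in the review process
    AI_SCREENING = 2         # AI does the initial quality checks only
    AI_ASSESSES_REPORTS = 3  # AI annotates the human review reports
    AI_CO_REVIEWER = 4       # AI reviews alongside human reviewers
    AI_SOLE_REVIEWER = 5     # AI is the only reviewer


# What the editors rely on when making a decision in each phase.
# In Phase 2 the AI screening happens before review, so the editors'
# decision still rests on the human reviews alone.
EDITOR_INPUTS = {
    Phase.HUMAN_ONLY: ("human reviews",),
    Phase.AI_SCREENING: ("human reviews",),
    Phase.AI_ASSESSES_REPORTS: ("human reviews", "AI notes on human reviews"),
    Phase.AI_CO_REVIEWER: ("human reviews", "AI review"),
    Phase.AI_SOLE_REVIEWER: ("AI review",),
}


def needs_human_reviewer(phase: Phase) -> bool:
    """Return True if the phase still depends on human review reports."""
    return "human reviews" in EDITOR_INPUTS[phase]


for phase in Phase:
    inputs = ", ".join(EDITOR_INPUTS[phase])
    print(f"Phase {phase.value} ({phase.name}): editors decide on {inputs}")
```

In this sketch, editors remain the decision-makers throughout; only the Fifth Phase drops human review reports from what they see.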

The above sequence may seem obvious and straightforward, but I feel there is a need to talk about such obviousness in the publishing ecosystem. I sometimes find that the publishing industry, being led and guided by individuals and entities from a wide range of sectors, focuses too much on hardcore technological innovations. It gives less attention to the scaling up of these technologies in diverse social-cultural-political systems, which are made up of humans with millions of combinations of knowledge and capabilities, preferences and prejudices, perspectives and attitudes, and integrity and morality. We often don’t put in sufficient time and effort to prepare such human systems for technological innovations. That’s why the publishing sky seems to be relentlessly displaying a meteor shower of innovations. The innovation awards given away by publishing societies and entities underscore that too.

We’re in the midst of Peer Review Week 2023, which has the theme ‘Peer Review and the Future of Publishing’. I wonder, in this month, can a few like-minded publishers’ associations pool their resources, expertise, and experiences together, and start a concerted effort to lead a rationally-paced, effective, long-term transition from the human-dependent peer review to a fully AI-dependent system for our journals? At the same time, can they allocate time and resources to support activities so that the publishing sector and its stakeholders and actors are able to get ready for such a transition at an equal pace?

Haseeb Irfanullah

Haseeb Irfanullah is a biologist-turned-development facilitator, and often introduces himself as a research enthusiast. Over the last two decades, Haseeb has worked for different international development organizations, academic institutions, donors, and the Government of Bangladesh in different capacities. Currently, he is an independent consultant on environment, climate change, and research systems. He is also involved with University of Liberal Arts Bangladesh as a visiting research fellow of its Center for Sustainable Development.

Discussion

10 Thoughts on "Ending Human-Dependent Peer Review"

Until we have AI with some form of actual understanding and reasoning capabilities (which we don’t, and won’t have for the foreseeable future), “AI-powered” reviews will just cement current biases and disguise them as “objective”. Just like AI-powered HR recruitment does, or any form of AI-powered human screening.

AI tools for peer review are tech solutionism: instead of addressing the core underlying issues of publishers’ greed & the insane demands of “publish or die” in academia, let’s just slap some AI on it and everything will be fine.

What we need is to reward quality rather than quantity of publications. We need publishers to hire actual humans to do the pre-review screening (they can afford it), and to implement robust procedures to limit biases in this screening. Judging the quality of the paper itself, however, requires actual experts.

AI makes up citations (see the controversy a few months ago about a legal case where an attorney relied on AI for his research). How could we possibly trust it to recognize when a submitted paper has a decent bibliography of relevant sources, or is full of junk and made-up sources itself? Alas, in response to the first commenter, not all publishers can pay reviewers — I run a small society journal in the humanities. There’s no money anywhere in the process — we designed it just to break even on subscriptions, and if we drop print, there won’t be any money involved at all. So that can’t be the solution for all fields.

An interesting vision.

Why stop at peer review? Given that the latter phases of the model rest, apparently, on the expectation of a “huge technological leap,” it seems to me to beg the question of just when someone will propose “five phases to make the transition from a 100% human-dependent *science* to a completely AI-based one.” Human authors and editors, after all, are prisoners of their own biases and selfish motivations as much as reviewers are, and seem potentially as easily replaced in such a framework.

That conjures an interesting scenario for Phase Five in the post’s evolutionary scheme. An AI becomes the sole reviewer of a submission from another AI reporting AI-built science. The AI author responds to the AI reviewer’s comments. The AI reviewer comments on the AI-revised manuscript, and an AI editor makes the final decision on consideration of the AI reviewer’s final comments. What could be more fair? (Or fast — submit-to-accept times measured in seconds …)

While this no doubt sounds a bit shrill and ridiculous, it does seem one reasonable endpoint to be pondered after the predicted “huge technological leap.” It is hard for me to envision particularly good outcomes from such a system, but that is probably just a failure of imagination on my part.

I think you’re exactly right Stewart. It’s just a matter of time (maybe a short time) before we cut AI loose to review 10,000 papers on a particular topic and report back new discovery, new connections, identical phenomena in different fields (that have heretofore been described in different ways), and so on. At this point, humans will peer review the science that AI generates, at least until we trust AI to do this work alone.

What will scientists do then? They will still generate raw data to feed into the AI system, and for now anyway, will still prompt the system to perform certain tasks, and will still double-check (“peer review lite”) the ideas that AI generates, but a lot of the high-level original thinking, synthesis and analysis—with enough training, practice, and preparation (data standards, etc.)–will no doubt be conducted at least in part by machines instead of humans.

So to Dr. Irfanullah’s essay, I would say that it’s quite likely we’ll see AI helping with peer review in the next few years, but more along the lines of editorial-related tasks like improving grammar and checking for plagiarism and fakery. And this will be a good thing. Where AI will really have an impact with regard to peer review, I think, is by turning humans into peer reviewers, maybe not in a few years because there’s still a long way to go in terms of making data usable, but not in the distant future either. At this point, we will need more peer reviewers for both human-generated work and AI-generated work. Oops.

Science and the scholarship it relies on is a fundamentally human endeavour. AI tools, however cleverly designed they might be, do not provide anything more than abstractions (or re-presentations) of experience. An AI tool lacks empathy and sentience–it doesn’t know what it feels like to give birth to a child, face a diagnosis of terminal disease, or simply enjoy nature, all of which are experiences that scientists may have themselves or that inform their work and mission. Fully handing over science to machines (which seems to be the direction implicitly advocated here) would be akin to eating the menu instead of the meal–it would be unhealthy (for science and society at large) and unfulfilling (nihilistic).

In a world where +50% of published “social science” research findings cannot be replicated, it is time to move on from peer review (whether human or machine), and on to objective and active confirmation of findings, claims and assertions. Dithering with AI as an alternative ignores the problem with academic publishing and (to mix metaphors) is merely rearranging the deck chairs on a sinking pile of…junk science.

“The only way to make the situation fairer is by ending the human-dependent review system.”
No, it is to banish for-profit academic publishing, which was historically introduced by a bunch of scammers.

My primary concern is one of financial prioritization. I agree that relying on unpaid labor to acquire peer review reports is unethical, especially given the size and success of the academic publishing industry. However, I disagree that the preferable option is to acquiesce to this injustice and say “Fine, we accept that we won’t pay reviewers. Instead, let’s just devote an equivalent (if not greater) amount of time, energy, resources, and money to developing better AI processes and removing humans from the equation altogether.” Why not fight the (far easier and ultimately cheaper) uphill battle of getting human reviewers paid for their time, rather than the much larger and less intuitive uphill battle of removing human reviewers entirely?

I have been a regular reader of Scholarly Kitchen for a while, and I’m frankly in awe of the depth of the research in this article. I’m currently trying to expand my knowledge around technological solutions for peer-review, and this is exactly what the doctor (pun intended) seems to have ordered.

I have one quick note regarding a potential error — I think you may have meant “ReviewAdvisor” (https://github.com/neulab/ReviewAdvisor#can-we-automate-scientific-reviewing) instead of “Research Advisor” where you mention, “Tools, such as ResearchAdvisor and a neural network-based one….”