This article is the first of a two-part series on AI disclosures by authors in scholarly articles. In this piece, I argue that authors aren’t complying with AI declaration requirements and lay out some of the reasons why that may be the case. In a follow-up piece, I will provide suggestions for how publishers can constructively work with authors to increase transparent AI use.

When the initial hysteria over ensuring AI didn’t become an author died down, publishers quickly understood that they needed to walk a fine line in response to authors’ use of AI tools; they couldn’t reject AI use outright, but they also couldn’t allow it to be used unchecked. Most publishers subsequently put out vague and simplistic ‘declaration requirements’ in the hope that author transparency and declarations would become a panacea for any issues or hallucinations that ultimately made it into print. Some went as far as suggesting that failure to report should be considered research misconduct.

The implicit expectation was that if authors declared to journals how and where they used AI tools as part of the research process, editors and peer reviewers would be able to critically consider those uses and make a determination regarding their efficacy and appropriateness. This was expected to enable innovation while preserving the integrity of the published literature.  

However, journal editors I speak with report a fascinating (albeit somewhat expected) development. They see telltale signs of AI use popping up throughout submissions, including, but not limited to, hallucinated references, tortured words and phrases, near-perfect English that lacks substance, an overabundance of em-dashes, and more. Editors find themselves spending considerable time sifting through and rejecting subpar and AI-generated work, which is not always discernible at first glance.

With 62% of researchers reporting using AI at some point in their research and publication process, we would expect editors to see a plethora of AI declarations and disclosures. Yet only a negligible percentage of authors seem to actually be disclosing their AI use. Emerging research suggests that a large number of published papers involve AI use without disclosure. One study randomly reviewed 200 articles and found only two author declarations. This early research reveals a vast gulf between guidelines and compliance.


Why are researchers not disclosing their AI use?

Over the last three years, I have travelled across Europe, training over 30,000 researchers on the responsible use of AI in research. I often ask them whether they report their AI use when submitting to journals, and if not, why. They often confess that they use AI for a variety of tasks across the research and publication process far more than they would like to admit. Several key reasons help explain authors’ non-compliance and warrant attention from publishers.

#1 – Fear of negative repercussions

The main reason researchers don’t report AI usage is fear that it will be held against them. Given the numerous gatekeepers that any submission passes through (desk editor, chief editor, peer reviewers), all it takes is one individual with a negative view of AI to recommend rejection. Ironically, many researchers seem to have embraced AI for themselves, but still perceive it as a tool for cutting corners, or as a real threat to research integrity, when employed by their peers. Much like deciding whether one is “fit to drive” after having a few drinks, researchers trust their own judgment about their own AI use but not that of others. So long as using AI in research is stigmatized, there will be no breakthrough in transparent disclosure and reporting.

Even when publishers present themselves as neutral, authors may still fear that disclosing AI use could quietly shape editors’ perceptions of rigor, originality, or scholarly competence. The result is an uneven compliance environment, where those with the least influence feel the strongest incentive to withhold disclosure.

#2 – Lack of clarity about how/what/when/where to report

Publishers and journals vary widely in their standards for reporting the use of AI tools, leading to confusion among authors. Many journals require only a general disclosure of use without extensive documentation, while others demand new appendices to accompany the article, including screenshots of any chatbot interactions. The ACM guidelines, for example, call for a comprehensive submission of all AI chats related to the research. One early-career researcher I spoke with described the personal determination he makes about whether to report: if he uses AI to perform a task more efficiently, a task he could have done himself, he doesn’t report it.

#3 – A burden without an incentive

Because reporting standards vary so widely across publishers and journals, authors see new reporting requirements as an undue burden for which they have neither the time nor the energy. Often, it is unclear where and how authors are expected to report AI use, adding to their existing frustration with editorial systems. Proper disclosure under the current guidelines can easily turn into hours of additional work. As we have learned with data sharing, meaningful compliance emerges only if publishers properly incentivize good practice and/or penalize bad practice.

Even when researchers are willing to disclose AI use, they aren’t always certain which use cases require disclosure. Is reporting required if AI is used to help locate relevant literature for the Introduction? Do researchers now need to disclose using Google Scholar Labs but not traditional Google Scholar? What if they used AI to help phrase a section of the Discussion? What if the entire research question was AI’s idea in the first place? Most guidelines don’t go into this level of detail, leaving researchers unclear on expectations.

#4 – Not aware AI is being used

Many major technology providers have incorporated AI into existing products in clever ways that are nearly seamless. Universities I train often license the full Microsoft suite, with Copilot built into default search, email drafting, and document writing. Authors can easily use AI within any of these platforms without being fully aware that they are employing AI. Even if they do know, documenting and recording every step of the creative process in the midst of research is rarely foremost in their minds, and might even inhibit creativity. For early adopters, documenting AI use is starting to feel like attempting to document every Google search made during the research process.

#5 – The confusion between AI and plagiarism

Many researchers I speak with perceive AI use in their writing as analogous to plagiarism. As a result, authors tend to focus on making AI use undetectable (rather than transparent) by paraphrasing and purging their texts of obvious AI signatures. At Academic Language Experts (the author services company I own and manage), we are seeing a surge in requests to ‘de-AI’ texts that were written or translated using AI, either because authors fear they will fail AI detection tests or because they worry that a reasonably discerning editor will recognize that AI was used.

#6 – Policies lack teeth

Some authors know that publishers don’t carefully check for signs of AI use in their research and are willing to take the risk of using AI tools without disclosure because of the benefits these tools provide. This confidence is reinforced when such use goes unchallenged.

Considerations for publishers moving forward

A strong push should still be made for robust AI education and guidelines by publishers (see Wiley’s admirable attempt, for example). Some have argued that, due to the limitations on policy enforcement, disclosure should be voluntary. Regardless of whether publishers believe the brunt of responsibility should lie with them or with the authors, no publisher wants to be in the position of retracting batches of articles regularly due to integrity issues. At the very least, publishers can help authors and reviewers understand the pitfalls that AI presents and ensure the replicability and reliability of the research. STM should be applauded for publishing an initial classification of AI use cases, but it is still unclear how these should be conveyed to authors, what use cases each journal allows with and without disclosure, and whether this classification can keep up with ever-changing models and tools.

There is one solution I encourage publishers not to adopt: investing in AI text detection tools. Not only are these tools notoriously unreliable and slow to adapt as AI models improve, but they also reinforce the idea that using AI to help with writing is forbidden, a position not taken by most publishers. If we believe that AI tools will help ESL scholars level the playing field for publication, why are we so obsessed with trying to detect their use? In commercial publishing, proposals have been made to differentiate between AI-generated and AI-assisted work, but it is very challenging to define where one starts and the other stops.  

Publishers need to put serious thought into how, when, and in what way they want authors to declare their AI use; otherwise, we are likely to continue seeing minimal disclosures. And publishers need to state clearly whether or not disclosed AI use will affect the review of a paper. In a follow-up article, I will lay out my suggestions and a framework for how publishers can be more accepting of AI use while not sacrificing research integrity.

 

My heartfelt thanks go to David Worlock, Thad McIlroy, Chhavi Chauhan, Aaron Tay, Susan Doron, Chris Leonard, Nikesh Gosalia, Sara Falcon, and Chirag Patel for their extremely constructive and productive comments on previous drafts. If you learned something new reading this, they deserve most of the credit. 

Avi Staiman


Avi Staiman is the founder and CEO of Academic Language Experts, a company dedicated to empowering English as an Additional Language authors to elevate their research for publication and bring it to the world. Avi is a core member of CANGARU, where he represents EASE in creating legislation and policy for the responsible use of AI in research. He is also the co-host of the New Books Network 'Scholarly Communication' Podcast.

Discussion

8 Thoughts on "Why Authors Aren’t Disclosing AI Use and What Publishers Should (Not) Do About It"

In my opinion, publishers should stop imposing these shackles and judge the submission by its quality, independent of AI use. If it’s a good and original submission, who cares about AI use? It’s a lost cause anyway. However, there are 2 related consequences, and I don’t have an answer on how to deal with them.
1. There’s an overflow of low-quality submissions – full of what this article calls “near-perfect English that lacks substance”. I myself, as a reviewer, recently rejected 3 manuscripts, submitted to top Q1 journals; until recently, I’d request a major review, but it seems that the quality is dropping. This advantage of leveling the playing field in terms of the English language opens the “gates of hell” for mounds of bla-bla-bla submissions.
2. Which brings us to the 2nd, related, issue: peer review. I still do it the “old way”, reading, analyzing and checking the references, writing feedback on specific lines, formulas, variables and parameters, figures, and so on. (This way I “caught” many issues in the papers I rejected.) I don’t see another way – even if the publisher does some AI pre-processing, human review is essential to maintain the level and integrity of research (IMHO). And reviewing is an unpaid task, requiring a lot of time if it is to be done properly. Concerning one of the papers I rejected, I received a feedback letter from the publisher informing me that the paper was rejected. It included feedback from 3 reviewers. I was #3 and rejected the paper – the paper was really in my specialty field, and I knew what I was talking about. #2 did a reasonable job and recommended a major review. #1 just copy-pasted a general paragraph from a chatbot – it could have been used for any topic – and accepted the paper outright! I have never seen such a lousy job. How can publishers find good reviewers (and they’ll need lots of them)? I don’t know.

In principle, I agree that we should ignore the tools used and focus only on outputs. However, I think there are AI tool uses that even good human reviewers like yourself won’t be able to detect. These can be intentional cases of manipulation or unintentional errors. If there were real consequences for problematic data (AI-generated or otherwise), researchers would think twice before relying on tools they don’t fully understand or can’t quality-control.

This is exactly what I was thinking – the author is responsible for the accuracy of every word in the manuscript they submit so it doesn’t matter if they used AI or not because they can’t “blame” AI for mistakes. I am in the “AI output cannot trigger plagiarism charges” school on this.

I am starting to wonder, however, if GenAI is exposing a fundamental weakness in the scholarly publishing industry. That weakness is a lack of thorough editorial checking. That there might be an assumption of “competence” for things like correct references so the industry doesn’t bother to check carefully enough. But really they ought to have been doing that all along. I’m thinking of how a journalist who works for a major paper expects every fact they claim in their article to be “fact-checked” by their editors before publication. A lot of us, at least when we were more naive, assumed a similar level of scrutiny was being applied to peer-reviewed manuscripts, and genAI is exposing the reality that it is not at all. That will definitely raise more questions about why the heck we’re (libraries) paying so much for these subscriptions if the publishers aren’t doing a decent level of reference-fact checking.

Since I’m not going to persuade the industry to spend more of their profits on quality control, I have an easier idea to offer.
How about creating an industry standard for submitting references in an easily checked format, so that publishers can simply run them through an automated checker? I’m thinking something like the RIS format. The program to check them against reality would be trivial – I could probably write such a program in a few hours (with Gemini’s help, ironically) with close to 99% accuracy in terms of both false positives and false negatives. I can’t imagine that would be too much of an extra burden for authors, who are likely already using something like Zotero, Mendeley, RefWorks, etc. to manage their reference list, and it would be trivial to export the list to RIS.
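For illustration, here is a minimal sketch of the kind of checker I have in mind, assuming the references arrive as a RIS export with DOIs in the DO tag, and using the public Crossref REST API (api.crossref.org/works/{doi}) to confirm that each DOI resolves. The function names are just placeholders; entries without a DOI are only flagged for manual review, and fuzzier title/author matching, rate limiting, and retraction checks are left out.

```python
import re
import sys
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen


def parse_ris(text):
    """Split a RIS export into a list of {tag: [values]} records."""
    records, current = [], {}
    for line in text.splitlines():
        match = re.match(r"^([A-Z][A-Z0-9])  - ?(.*)$", line)
        if not match:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":                 # end-of-record marker
            if current:
                records.append(current)
            current = {}
        elif value:
            current.setdefault(tag, []).append(value)
    return records


def doi_resolves(doi):
    """Return True if Crossref knows the DOI, False if it returns 404."""
    url = "https://api.crossref.org/works/" + quote(doi)
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except HTTPError as err:
        if err.code == 404:
            return False
        raise                            # other errors need a human look


if __name__ == "__main__":
    with open(sys.argv[1], encoding="utf-8") as fh:
        references = parse_ris(fh.read())
    for ref in references:
        title = ref.get("TI", ref.get("T1", ["<untitled>"]))[0]
        doi = ref.get("DO", [None])[0]
        if doi is None:
            print(f"NO DOI, check manually: {title}")
        elif doi_resolves(doi):
            print(f"OK: {title}")
        else:
            print(f"NOT FOUND in Crossref: {title}")
```

Run against an exported file (e.g. `python check_refs.py references.ris`) and you get a per-reference OK / NOT FOUND / NO DOI report; anything beyond that (title matching, retraction databases) is where the real effort would go.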

I agree that our issue here is, in some ways, the revelation that publisher QC is a lot less rigorous than we would have originally assumed.

I understand your idea about checking references, but my fear is that the real issues lie in the underlying data and results, not in the supporting literature. Therefore, I’m not sure the reference list is necessarily where time and effort should be focused.

I can’t help thinking this is a bit like the British in India trying to get rid of cobras by paying locals per snake. They got a snake-farming industry instead! We can play cat-and-mouse with AI detectors and rewriting, but that won’t help AI accelerate research. We can judge papers on their merits, but if you’re facing a 10x increase in submissions, that’s more of a nice idea in theory than a practical plan. Given that more researchers are going to use AI, publishers won’t always be able to tell, and you’re going to see a higher volume of submissions, what do you do? arXiv and sci-fi publisher Clarkesworld have already had to respond, and they’re basically only taking submissions from authors they already know or who can be vouched for by an author they know. This is not the open democratic system we’d like to have – more like the old boys’ club. I’m sure some will go to AI-assisted review. Better get creative if those are not the solutions you want!

It has to be part of the answer on our current trajectory, but I don’t think it will be enough to keep evaluators – funders, publishers, etc. – from increasingly relying on proxies like personal reputation and institutional reputation, thus increasing insularity at a time when we desperately need to decrease it. I wish I had a clever idea. Staffing up 10x to handle more content is a fine idea too, and I think the publisher who does that will gain a lot of goodwill, if they can stay in business long enough.

Maybe we just turn evaluation over to AI too and become essentially just a collective set of arms carrying out the wishes of our AI overlords. That’s the approach the big AI labs like OpenAI and Anthropic have taken (see https://www.anthropic.com/research/how-ai-is-transforming-work-at-anthropic ). Given that human attention is finite and we have yet to find any limit on how far AI agents can scale up, it seems like a denial-of-service attack on everything that requires human oversight is not implausible.
