Editor’s Note: Today’s post is by Avi Staiman. Avi is the founder and CEO of Academic Language Experts, an author services company dedicated to assisting academic scholars to elevate their research for publication.
One of the biggest challenges for researchers is ensuring that their study is written in a clear and compelling manner that successfully manages to turn their research results into compelling narrative and figures into impactful scientific discoveries. While solid research design and execution serve as the backbone of meaningful research, the form and style of communication also play a critical role in the acceptance, dissemination, and adoption of the study within academia and beyond.
EAL (English as an Additional Language) authors face a particularly uphill climb trying to convey their novel findings in English, their second (and sometimes third or fourth) language. Writing tools and author services help these researchers to improve the clarity of their arguments, free up their time to focus more on research, increase speed to publication, and gain confidence in their work. Most importantly, these tools offer EAL researchers (the vast majority of researchers in the world) a more level playing field whereby research is evaluated by the quality of its content and not the English skills of its author.
ChatGPT as Author
From the moment ChatGPT was released in November, researchers began experimenting with how they could use it to their benefit to help write systematic reviews, complete literature searches, summarize articles, and discuss experimental findings. I was therefore surprised to see that when addressing the use of GPT, a number of major publishers ignored the far-reaching implications and plethora of use cases, instead zeroing in on one particularly obscure issue, namely, ‘ChatGPT as Author’.
This response seems to be a knee jerk reaction to a handful of attempts to list ChatGPT as a contributing author, which many publishers wanted to reject before the trend gained traction. At least some of the cases whereby authors submitted articles authored by ChatGPT seemed to be theoretical experiments to test the limits of authorship and make a point. Science, Elsevier and Nature were quick to react, updating their respective editorial and publishing policies, stating unconditionally that ChatGPT can’t be listed as an author on an academic paper. Nature went as far as describing GPT as a ‘threat to transparent science’. GPT not being granted authorship seems banal enough and didn’t generate much pushback. As it is, not too many researchers clamor to share their authorship credit with their colleagues, not to mention a Chatbot.
However, publishers seemed to be answering a question that few were asking while avoiding other (more?) important use cases. Can or should authors use ChatGPT or other AI tools in the development and writing of their own research writing? Can it be used for conducting a literature review? What about analyzing results? Or maybe for drafting an abstract from an existing article? These are the important questions authors are asking where publishers seem to leave (too much?) room for interpretation.
Drawing a Hard Line in the Sand
Those publishers who addressed this ‘gray area’ differ in regard to the question of whether ChatGPT can be used for assistance in the research process and the level of detail and clarity of their policies.
After proclaiming that nonhumans can’t qualify for authorship, JAMA leaves some wiggle room (albeit hesitantly) for using ChatGPT for writing assistance. The guidelines require authors to describe the nature of involvement in detail.
Authors should report the use of artificial intelligence, language models, machine learning, or similar technologies to create content or assist with writing or editing of manuscripts in the Acknowledgement section or the Methods section if this is part of formal research design or methods.
This should include a description of the content that was created or edited and the name of the language model or tool, version and extension numbers, and manufacturer.
Science takes a much more black and white approach, essentially banning the use of text generated by ChatGPT altogether:
…we are now updating our license and Editorial Policies to specify that text generated by ChatGPT (or any other AI tools) cannot be used in the work, nor can figures, images, or graphics be the products of such tools. And an AI program cannot be an author. A violation of these policies will constitute scientific misconduct no different from altered images or plagiarism of existing works.
One notable exception is ACS, who seem to be taking a proactive approach to defining guidelines on the proper use of AI technologies. An ACS Energy ‘Viewpoint’ piece entertains the possibility that ChatGPT could employ an ‘assisted-driving approach promising to free researchers’ time from the burden of scientific writing and get them back to the science’. An editorial in ACS Nano outlines detailed best practices and policies for using AI tools.
What ACS understands is that GPT can be a good tool for ideation and writing while not being a source of reliable information. However, at least for now, they seem to be an outlier in the publishing landscape. It is also worth noting that COPE put out a position statement clarifying that authors can use the tools so long as they are properly credited and attributed. Making these distinctions and codifying them in policy seems to be a critical step moving forward.
A Precautionary Reaction?
Reading between the lines of some of these statements may reveal the fear of an onslaught of paper mill submissions. Traditional plagiarism checkers haven’t yet caught up to AI detection and even those tools that do some level of detection can be fooled by putting the AI generated text into paraphrasing tools. No publisher wants to be the next Hindawi or IOP and face mass retractions.
On the other hand, publishers would be wise to leave the back door open for authors to use AI tools in order to support their research for two reasons. First, strictly policing the use of these tools would not only be an exercise in futility, but enforcement could quickly become a nightmare. Second, an arms race seems to already be underway to build out software to detect AI writing. Publishers will likely spend ungodly sums of money on these tools, only to be set back by even better models that can outsmart the detectors. Whether that should be our focus is an important question to ponder before diving in headfirst.
Appearances Can Be Deceiving
When we dive a bit deeper, we find examples of publishers hard at work on AI-human collaborative projects that they see as the future of publishing. Springer Nature recently announced their first AI-augmented research highlight and there seems to be much more to come. Cureus, (also a Springer Nature publication) is running a contest calling authors to submit GPT-assisted case reports and JMIR Publications recently wrote a call for papers for a special issue using GPT.
In order to make important decisions on the right response to GPT, we need to revisit a number of fundamental questions about how we understand issues in science including the definition of authorship, what we consider plagiarism and the nature of research writing. If we can come to some consensus on the values that drive science publication forward, then we have a shot at developing meaningful, sophisticated policy.
What is Authorship in Research?
In order to answer this question, we need to get at the heart of how we conceive ‘authorship’ in the research context. We only need to look at the contentious topic of authorship inclusion and order on multi-author papers to understand how quickly the question of authorship becomes entangled.
Authorship as accountability
Nature editor-in-chief Magdalena Skipper argues that “an attribution of authorship carries with it accountability for the work, which cannot be effectively applied to LLMs”. If researchers are coming with their own results, ideas and conceptions of their field and get help with putting it all together, does that make their research no longer novel or are they no longer accountable? Does authorship mean that the listed authors have to write every word, format every reference and insert every comma or that they take responsibility for the end result?
Could authors potentially take responsibility for reviewing and vetting ideas produced by LLMs? Even if GPT generates some of the data or narrative for a study, the human ‘prompt engineers’ would still assume the burden of a) the prompting itself and b) ensuring veracity of information through their own critical review and revisions.
WAME recommendations posit that chatbots cannot meet ICMJE authorship criteria, particularly “final approval of the version to be published” and “agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.” However, this definition requires a bit of honest reflection and introspection. After all, there are many author contributors to articles (such as students) who don’t have final approval of the version to be published and aren’t accountable for “all aspects of the work”.
Authorship as substantial contribution
In addition to accountability for the work produced, COPE guidelines call for “substantial contribution to the work”. In the CREDIT taxonomy of author contribution, there are 14 different ways to make a significant contribution that merit authorship in a paper and I would argue that ChatGPT could make a significant contribution in at least 10 of them. Could authors potentially contribute significantly and be held accountable while also heavily relying on GPT as both an initial source of information and authorial personal assistant?
What Is Plagiarism in Research?
One of the most fundamental principles of scientific authorship is that ideas, words and concepts need to be either original or properly attributed. The word plagiarism has its roots in the Latin plagiarius, “kidnapper”. If texts aren’t original, they constitute plagiarism, a cardinal sin in academic publishing. Therefore, it behooves us to ask: should generating and using AI-assisted text constitute plagiarism?
The answer to that question depends on exactly how ChatGPT works and how plagiarism is defined. Regarding the former, it seems that the risk of unknowingly copying large chunks of texts directly from other sources is unlikely, although small fragments of sentences may indeed be found elsewhere.
More importantly, we need to define plagiarism. If we understand plagiarism to be the use of materials that authors don’t write in their own name, then AI-assisted writing should be considered plagiarism. However, if plagiarism requires taking ideas from others and passing them off as your own, then GPT may not be considered plagiarism as these texts are new and not ‘ripped off’ from someone else. Academic writing inherently includes building a scaffolding of previous work around which the researcher can add novelty.
What is Writing in Research?
If we are going to heavily regulate GPT use, then maybe we should clamp down on the use of scientific ghostwriters who do much of the legwork in the writing process (especially in industry) and rarely receive acknowledgement, forget authorship for their contribution. We also need to consider whether we want to ask authors about other forms of AI writing assistance help they use. Tools such as Grammarly, Writeful, and even Microsoft grammar checker are relied upon heavily by authors. If an author is using GPT for language purposes, why would that need to be declared and other tools not?
Researchers across many fields use software and tools to collect, organize and analyze their data without anyone blinking an eye. What makes our visceral response to writing so very different?
Alternatively, what if authors get their ideas for new research from ChatGPT or have GPT analyze their results but write it up in their own words; might that be ok because the author is technically doing the writing?
I believe that self-respecting researchers won’t use GPT as a primary source the same way they don’t use Wikipedia in that manner. However, they can use it in a myriad of other ways including brainstorming, sentence construction, data crunching, and more. The onus of responsibility for the veracity of information still falls on the researcher but that doesn’t mean we should run to ban because some might use it as a way to cut corners. Adopting a hardline policy, such as the one taken by Science, seems to ignore the vast majority of researchers acting in good faith who simply want to use these tools to further their work.
The Cost of our Fear: Missing the Opportunity to Level the Playing Field for EAL Authors
Publishers who ban GPT are missing a unique opportunity to level the playing field for EAL authors. GPT can be used in many ways to help improve scientific writing in meaningful ways and restricting its use leaves those authors at a significant disadvantage. On the flip side, publishers who seize the opportunity to build on these tools to serve their international communities can engage meaningfully with large audiences that were previously inaccessible or largely unengaged due to their linguistic disadvantage, leading to greater diversity of representation.
Not only that, a hardline approach could actually boomerang and lead to a decrease in author transparency. Here are just some of the considerations worth noting:
- A recent Nature poll found that 80% of authors have already ‘played around’ with GPT. Many of those same authors won’t know what the particular publisher’s policy is and may unknowingly “constitute scientific misconduct no different from altered images or plagiarism of existing works” .Do we really want to criminalize researchers who don’t read the fine print terms and conditions?
- ChatGPT has already been integrated into Bing Chat and will soon be integrated into Microsoft Word. Banning GPT may soon mean banning the use of search and word processors. It is also very hard to define exactly how GPT is used in a particular study as some publishers demand, the same way it is near impossible for authors to detail how they used Google as part of their research. WAME’s guideline to “declare the specific query function used with the chatbot” seems to demand an unrealistic burden of documentation.
- Researchers in many fields are already using a myriad of AI tools such as Elicit, Semantic Scholar, and Scite. Do we need to go back and retract those papers because they used AI tools without proper attribution?
Conclusion: What do we really want our scientists doing?
The proliferation of powerful AI tools pushes us to ask fundamental questions about how we perceive the role of scientists in general and the specific role writing plays in their work. In other words, to what degree should we even care that authors write every word of their research in the first place?
I would argue that, at least in the STEM context, the writing process is a means to an end of conveying important findings in a manner that is clear and coherent. If we can do that faster and cheaper, then maybe we should pause to consider the potential benefits. Would we have held up lifesaving Covid research because the PI got help for an AI tool?
There may be room for an important distinction between HSS and STEM work in this context. I can see justification for stricter policies in cases where the act of writing constitutes an essential part of the research or in ethnographic and qualitative studies where the author’s role impacts the nature of the study. Additional thought must be put into how we legislate the use of AI tools and looked at on a more granular level.
As a basic premise, I suggest that publishers encourage researchers to use all tools at their disposal in order to make their work as accessible and impactful as possible, while continuing to educate researchers on how to find, review, and verify information. At the same time, publishers need to fast track peer review reforms to be ready to contend with even murkier lines between novelty and regurgitation — and fact and fiction.
Special thanks to Roger Schonfeld, David Crotty, Amy Beisel, Adrian Stanley, Sally Wilson and Judah Bernstein for their insightful comments on various drafts.
27 Thoughts on "Guest Post — Academic Publishers Are Missing the Point on ChatGPT"
It’s too early to advise how anyone should feel or go about AI. We don’t know enough:
“AI systems with human-competitive intelligence can pose profound risks to society and humanity,” the letter warns. “Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable.” – https://news.sky.com/story/elon-musk-and-others-sign-open-letter-calling-for-pause-on-ai-development-12845039
Thanks Emanuel. Admittedly, I don’t dive into the more general ramifications of AI on society and humanity which is a critical conversation that seems to only be in its infancy. This should definately be addressed in future posts.
Admittedly, I don’t discuss the more general ramifications of AI on society and humanity as a whole which is a critical issue that is just starting to make its way into the general consciousness. Definitely deserves more attention in further posts.
Thanks for the extensive writup Avi! It seems to me the Science one is a bit over the top. There’s a large gray area in text generated by ChatGPT (or any other AI tools) that will only become larger (and increasingly more impossible to track down). Thinking of Writefull helping you create a catchy title for your article or Scholarcy helping you with writing your abstract or any of the new, neither can be traced at the moment, add to all the new to come OpenAI tools in Microsoft… Not to mention all ghostwriters working for pharma companies writing their papers… Lastly, a well written paper has more downloads and citations so won’t a tool that helps authors (in varying degrees, thinking of non-native speakers in particular) write an article not do science a favour rather than a threat?
Thanks for the close read Martijn. I believe the editors at Science have admitted that they are taking an overly-cautious approach for now as they believe that will be easier to roll back than a more lenient policy (and I can respect that even if I disagree). My main issue is that the general responses from publishers have come in the form of one-line policy updates, leaving authors very confused about what they can/can’t, or should/shouldn’t be using AI for. Also, they are generally framed in the negative (what isn’t allowed) and the consequences of infraction.
I agree that the responsible use of AI would be good for scientists and publishers at the same time. Maybe we need to come together as an industry to define what responsible AI use looks like (in detail) instead of obsessing over “GPT authorship” and building detection tools. Instead of trying to figure out what to ban, why don’t we proactively look at some of the great technologies coming from the publishing start-up scene that can be used as tools for responsible AI use in scientific publishing?
Sounds like an excellent topic for an SSP/STM conference session! Happy to co-host 😉 I think the discussion has to take place at one point in the near future before all doors are locked overly-cautious approaches. Surely Adam Day would be in as well as a co-host.
I’m in! Just tell me where, when and what to wear 🙂
Sounds fun! 🙂
I wonder if you have statistics to back up that a well-written paper has more downloads and citations? I want to believe that this can be backed up by data!
On the accountability question — I think there are legal issues here that come into play. As an author, I have to vouch that to my knowledge, this is my work, that it is original, and that I indemnify the journal from any damages the work causes. You can argue that an author may be limited in their ability to vouch for coauthors, but legally, a non-human certainly cannot make such assertions in any legally binding way. That’s a big reason why ChatGPT can’t be an author — who do you go to if questions are raised about the work?
100%, there needs to be accountability and responsibility for the content produced. I think what is unique in this case is that there is a significant contribution by an entity that can’t take responsibility. I would like to see more from publishers describing the quality control checks they expect authors to make when using these tools (being as they will likely be using them regardless) and not leaving it up to author discretion.
As for the other legal ramifications, it seems like we may be on the verge of some of the biggest litigation the world has ever seen: https://www.bbc.com/news/technology-65139406. These will need to be fleshed out in our industry as well.
Looks like we’re all thinking about the same things! https://medium.com/p/797bc64e2632
“if plagiarism requires taking ideas from others and passing them off as your own, then GPT may not be considered plagiarism as these texts are new and not ‘ripped off’ from someone else”
Surely ChatGPT, or any other Language Model, can only encode and repeat what people have written (verbatim or otherwise)? Where do the ideas come from if humans didn’t write them?
Thanks Adam, I wish I had seen your post before publishing 🙂
I guess my question is to what extent is there a difference between an LLM that can ingest information from different sources and produce a compiled output with the human brain that functions in a somewhat similar manner by processing information, experiences, and ideas and then producing novel ideas based on the sum total of information learned (albeit with much more room for creativity).
Yes, I’ve been thinking about that a lot, too. It’s really hard to draw a line between what is and isn’t ok for the machines to be doing. I also wonder how the existence of this technology is going to change people’s values and expectations around re-use of ideas. I guess we will have to wait and see! I thought this was an interesting example: https://www.tomshardware.com/news/google-bard-plagiarizing-article
What I think is crucial is for us to have these conversations in the publishing industry and not assume authors will know what to do or behave in the manner we want them to. Also, they should be involved in these conversations to make sure they are on point. It seems as if content creators and publishers are hunkering down while consumers are thrilled to have such free access (and many of us are both). Should be an interesting showdown.
“Surely ChatGPT, or any other Language Model, can only encode and repeat what people have written (verbatim or otherwise)? Where do the ideas come from if humans didn’t write them?” Good question. This is the difference between Artificial General Intelligence (AGI) and a Large Language Model. It is believed that the first one might lead to a neural network and perhaps even containing something that resembles a conscience. That it will be sentient if you will. In that case it would be possible for the AGI to judge entirely new situations on the basis of the experience it has with ingesting all human knowledge plus the prompts provided by the AGI users, the so-called feedback loop. It would then ‘decide’ (on the basis of the past) what the best next steps are when having to deal with that new challenge. The latter can end up being really good and even innovative advice, or something quite objectionable. So some of these models might become more than merely a tool. They might become sentient entities and find better ways of dealing with things…or worse.
What about AI tools for translation? How might this play out if AI-assisted or AI outright translation becomes reliable and ubiquitous?
I expect this will indeed become the case over the next few years. That being said, I think the level of specialization required and the specific writing style of academic authors (especially in HSS) might make academia one of the last places where the level of quality for such tools is high. Machine translation needs a lot of input to learn and research, by its very nature, attempts to be novel and non-repetitive.
Are there specific concerns you have about AI-assisted translation?
My problem with ChatGPT, in particular, was that I made a simple request, to describe the original research on the description of a protein, and it gave me back a transparently false answer.
When I repeated the request, asking for references to the published literature to support the claims in the answer ChatGPT gave me three “literature references” that did not exist: real journal names but there was no such article when I went to the table of contents of the journal.
So, if one can not trust ChatGPT to not create fictitious “information” in such a simple and easily checked case, then I do not have any confidence that it is not adding fictitious material that can not be so easily verified.
It doesn’t seem like a good idea to have a co-author that does such things, and is therefore inherently untrustworthy.
And I am leery of using such a tool for doing literature research if I then have to check everything it says in detail.
Maybe someday a useful tool but right now, at least for my purposes, it is just a glib teller of untruths.
I think this speaks to the fundamental misunderstanding that so many have about ChatGPT — it is not a discovery tool, nor is it meant to provide information. As I wrote a few weeks ago, it is a word prediction machine:
Or as author Neil Gaiman recently put it, ChatGPT doesn’t give you information, it gives you information shaped sentences. When you ask it to describe original research, it gives you a set of words that look like what people’s answers to that question would be, not an actual answer to the question. The same goes for references — what you’re getting is not a set of references, but a set of the sorts of words that would be in an answer to that question.
Which is why its value is, as Avi points, out, more about writing and shaping information that already exists, rather than discovering that information or synthesizing new information.
Yes, I get that.
I have not seen a way to apply ChatGPT only to information that I provide, though.
Ask it a question, it tells you something, but you can not trust it.
I suppose if you could give it a corpus of knowledge and tell it to write an engaging narrative that encompasses that knowledge, and nothing else, that would be great, but it looks to me like the essence of the thing is the statistical model that arose from its training, and that will always intrude on the content of its output.
It is not a truth-dependent or reliability-checking writing and shaping machine – that isn’t at the core of its functionality.
We had a post recently that looked at the use of ChatGPT for writing research summaries:
But my take on it is that it’s a useful tool for speeding up some processes in areas where you have strong knowledge and can double check its work. Not so great in an area where you might not be able to catch introduced errors.
Yes, I agree with David that GPT should mainly be used in cases whereby authors have the ability to oversee and do quality control on the outputs. My fundamental assumption is that most scientists can make these judgment calls when it comes to their own line of research and should never rely on outputs blindly. One of the reasons I wrote this in the first place is that I didn’t think the publishers had made this distinction clear enough.
Not so great in an area where you might not be able to catch introduced errors.
Like the student assignments many of us are now receiving.
Are you saying that professors are unable to detect falsehoods in the writing they receive from students? Should you be teaching something you don’t know well enough to determine fact from fiction? Yes, there are other issues that teachers have with these technologies writing papers rather than their students doing so, but one would hope that recognizing blatantly false information is not one of them (and is actually an advantage, for now, in catching these cheaters).
Writing is fundamental to all aspects of science and should not be handed over to a machine. Writing and rewriting and editing papers not only allows for mistakes and insights to be found in the work, but it provides needed writing experience for grants and other writings crucial to the scientific process (lab notes, presentations, …). One of the most important ethical considerations with a publishing research–an author must be a significant contributor to the research–is glossed over with the comment that in practice this isn’t always the case. Absolutely true, but we don’t stop striving to keep that rigour because of some bad apples. Also, it is well acknowledged that there are inequities around English as the dominate scientific language, but machine writing does not address the fundamental issues and that’s what we should be spending resources on, not making tech companies richer while they destroy the planet.
I guess it begs the question: should writing be such a critical part of the scientific process? if so, why?