Guest Post – GPT-3 Wrote an Entire Paper on Itself. Should Publishers be Concerned?

Editor’s Note: Today’s post is by Saikiran Chandha. Saikiran is the CEO and founder of SciSpace — the only integrated research platform to discover, read, write, and publish research papers.

Research indicates that businesses have been relying more on artificial intelligence (AI) over the past few years. The average number of AI capabilities that organizations have embedded within at least one function or business unit doubled from 1.9 in 2018 to 3.8 in 2022.

And it is not just businesses; even the general public is taking notice. AI models like GPT-3, DALL-E, ChatGPT, and AlphaCode have been the talk of the town on social media. So, it’s no wonder that advancements in generative AI are now having an effect on science and academia as well. A researcher got GPT-3 to write an entire paper with simple prompts. The paper was initially rejected by one journal after review, but it was subsequently submitted to and accepted by another, with ChatGPT being listed as an author, a trend that’s becoming more common these days.

GPT-3, or Generative Pre-trained Transformer 3, is a Large Language Model that generates output in response to your prompt using patterns learned from the data it was pre-trained on. It has been trained on almost 570 gigabytes of text, mostly made up of internet content from various sources, including web pages, news articles, books, and even Wikipedia pages up until 2021.

So when you enter a prompt in natural language, it uses the training data to spot patterns and then gives you the most appropriate response. You could use it to complete sentences, draft compelling essays, do basic math, or even write computer code.
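
To make the idea of "spotting patterns in training data" concrete, here is a deliberately tiny sketch. It is not how GPT-3 works internally (GPT-3 is a large neural network, not a word counter); it only illustrates, under that simplifying assumption, how a system can pick a likely next word purely from what it has seen before. The function names and the sample corpus are invented for this illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def most_likely_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Toy "training data" (hypothetical, for illustration only).
corpus = "the model reads text . the model predicts text . the model reads prompts"
table = train_bigrams(corpus)
print(most_likely_next(table, "model"))
```

Because "reads" follows "model" more often than "predicts" does in the toy corpus, the sketch picks "reads". It has no notion of whether that continuation is true, only that it is frequent.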

In this article, we will discuss the impact of GPT-3 and related models on research, the potential question marks, and the steps that scholarly publishers can take to protect their interests.

The impact of GPT-3 on academic research

The model has been around since 2020 and has already been used to develop a range of new applications, such as chatbots, translation tools, and search engines, among others.

Perhaps the most talked about feature has been its ability to draft human-like essays. You could generate an original piece with a basic prompt like “draft an academic essay in about 800 words on how AI is impacting academia.”

GPT-3’s deep learning algorithm allows it to write from scratch, auto-complete your sentences, and build upon what you have already written. Microsoft plans to integrate the model into its productivity suite, which includes MS Word, PowerPoint, and Outlook. So far, they have added a version of the model to their Edge browser, Bing search, and Teams, their collaboration tool. Other tech giants, such as Google and Amazon, are also pushing forward in the AI space. Google is set to launch Bard, a conversational AI service, while Amazon Web Services is partnering with AI startup Hugging Face to create cost-effective AI applications for customers.

Now, this is only a tiny part of the GPT-3 story. Since the release of its open API, research labs and companies worldwide have been building new applications powered by the model. ChatGPT, a sibling model of GPT-3, is completely changing how we interact with machines. Its dialogue-based approach has got the public hooked, as it enables them to get things done with prompts and questions in plain language — a change from menus, buttons, and predefined commands.

These developments are sure to change the writing workflow. Identifying who wrote what — human or AI — will become challenging. Turning ideas and thoughts into fully fleshed-out points on a document won’t be the same as before with AI working alongside us.

Bringing this technology to academic circles raises complex questions. Can GPT-3 be listed as an author? How does copyright play a role in this? What about the ethics of such usage?

On the plus side, non-native English speakers will have an easier time overcoming the language barrier. They will be able to produce high-quality research papers without worrying about grammar or syntax issues. Moreover, AI-assisted writing can help researchers save time, allowing them to focus on refining their ideas, framing their arguments better, and conducting more in-depth analyses.

You can also instruct the model to format the output in a certain way, which matters because manuscript formatting can take up to 14 hours per paper.

In short, these capabilities allow researchers to complete their manuscripts much faster and share breakthroughs with the world more quickly.

The model’s capabilities have also led to a whole new range of applications: from developing spreadsheet formulas and creating Python code to writing SQL, all from simple text prompts. That’s not all; you also have tools to help you with your literature search and reading process.

At this point, I’d like to disclose that I run SciSpace. We recently added an AI assistant to our research paper repository. It helps break down, summarize, and translate research papers, as well as explain math, tables, and text. And we’re not alone. There are also other tools out there that can help extract more information from research papers.

Beyond that, there are also models like DeepMind’s AlphaFold that can make protein structure predictions and OpenAI’s Codex that can solve complex university-level math problems or provide coding assistance.

The reasons to be wary of GPT-3

The critics of GPT-3 have raised numerous questions about the output generated by the model, from plagiarism and bias to a lack of reliability. And rightfully so.

A 2021 investigation into articles published in Microprocessors and Microsystems revealed that the journal published nearly 500 questionable articles. The study showed that they contained broken citations, scientifically inaccurate statements, and nonsensical content, rendering the papers non-reproducible. The investigators believe that authors may have used GPT and reverse-translation software to hide plagiarism and to enlarge their manuscripts.

Another concern is the potential for bias in GPT-3 generated output. The model is trained on unstructured web data. So it can easily borrow from existing stereotypes and beliefs about various subgroups, like races, political ideologies, religions, or genders. Past investigations have revealed instances of severe bias, leading to offensive output. So using these models for research purposes has the potential to pollute science with discriminatory language and unwarranted homogenization.

The model was trained on data up to 2021, so unless you give all the right pieces of information in prompts, it might provide you with outdated output. Also, GPT-3 tends to hallucinate, meaning it produces output that doesn’t make sense or isn’t true. For instance, when you ask questions about a particular theory and why it was derived, the model may respond with something completely unrelated or nonsensical.

Why does this happen? It comes down to the fact that the internet contains our thoughts, data, and facts but not the reasoning, logic, or context to truly make sense of them. So, GPT-3 has no way of knowing what is true or correct or why something is the way it is. And the model ends up producing probabilistic output without understanding the context around the question.

One way to avoid this issue is to use the chain of thought prompting technique, which involves providing the model with examples and instructions that help decompose the problem into smaller steps, eventually leading to the correct answer.
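
As a rough illustration of chain of thought prompting, the sketch below assembles a prompt in which each worked example spells out its intermediate steps before the answer. The helper name `build_cot_prompt`, the dictionary layout, and the sample question are assumptions made for this example; the resulting string would still need to be sent to whichever model you use.

```python
def build_cot_prompt(examples, question):
    """Assemble a few-shot prompt whose examples show their reasoning steps."""
    parts = []
    for ex in examples:
        steps = "\n".join(
            f"Step {i}: {step}" for i, step in enumerate(ex["steps"], start=1)
        )
        parts.append(f"Q: {ex['question']}\n{steps}\nA: {ex['answer']}")
    # End with the new question and an explicit cue to reason step by step.
    parts.append(f"Q: {question}\nLet's think step by step.")
    return "\n\n".join(parts)

# Hypothetical worked example, for illustration only.
examples = [
    {
        "question": "A journal receives 120 submissions and rejects 90. How many remain?",
        "steps": [
            "Start with 120 submissions.",
            "Subtract the 90 rejections: 120 - 90 = 30.",
        ],
        "answer": "30",
    }
]

prompt = build_cot_prompt(
    examples,
    "A journal receives 200 submissions and rejects 150. How many remain?",
)
print(prompt)
```

The worked example shows the model the shape of a step-by-step answer, which is the core of the technique; it reduces but does not eliminate the risk of a wrong answer.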

There are also other ethical and moral concerns. Is it right to use AI to write papers when published papers are used as a barometer of researcher competency and as a basis for tenure and promotion? Also, if an author uses an AI tool to write papers, does it mean the tool should be credited for the work and not the writer?

What can scholarly publishers do?

First, it is crucial to recognize that:

Large parts of academia run on publish-or-perish mode
Paper mills and predatory journals are not going away
English language dominates the academic and scientific discourse

GPT-3 and other AI models are evolving and hold tremendous potential for academia. However, writing-related AI technologies aren’t new: Google Docs, MS Word, and mobile keyboards have provided word and phrase suggestions, spell checking, and grammar corrections for a while now. GPT-3-powered writing tools are now taking it further: rather than offering a list of words to choose from, they anticipate and finish entire sentences and paragraphs probabilistically.

But at the same time, scholarly publishers need to protect the integrity of their journals from manipulation, disinformation, plagiarism, and bias.

Here are some steps that publishers can take to ensure their continued success in the face of the changes brought about by GPT-3:

Use AI tools for quality control: Integrate AI tools into your internal screening workflow as the first line of quality control. Use them to determine if the paper meets the journal’s scope, detect text overlap and plagiarism, detect formatting and grammar errors, and assess the appropriateness of experimental design. This should help editors and peer reviewers deal with the deluge of submissions, trim their workload, and focus on the most relevant papers.
Establish a clear framework: Formulate policies around the usage of AI tools. They should outline the acceptable research methods, the ethical standards that authors must adhere to, and the consequences of non-compliance. Moreover, if a publisher plans to use AI tools in its workflow, say to locate relevant peer reviewers, then it must clearly outline how to reduce the risk of bias or prejudice in the process.
Monitor existing papers: Enlist research integrity experts, AI sleuths, and AI image detection tools to ensure that published articles are free of image duplication fraud, nonsensical content, or tortured phrases. Retract the papers that fail to meet the standards of your journal.
Educate authors: Research paper writing and submission are tedious activities, and researchers often need guidance on what to do. Create a blog or a YouTube channel and use that to address these knowledge gaps and ambiguities. Also, use that to build awareness of paper mills, predatory journals, and the ethical and moral implications of using AI tools. Tap into existing resources created by organizations like COPE and CSE, which share practical advice and assistance around publication ethics, to ensure submissions align with accepted standards.
Offer additional services: Since most papers are published in English, non-English speakers are forced to write in English for academic success. Many see this as a burden that makes communicating new ideas and insights difficult. Publishers can turn to AI-enabled translation tools, like DeepL, to capture the subtlest nuances of language and retain those nuances in the translation. This will enable them to receive more submissions, get publications ready faster, and ensure that non-English papers remain true to the original intent.
Encourage Open Access: Urging authors to archive their preprints in a repository like arXiv or share their datasets in Zenodo will help promote transparency and openness. The greater visibility will lead to more call-outs and expose any suspicious behavior. For paywalled papers, publishers should have a dedicated internal team verify the raw data, seek reader feedback, and monitor the web for commentary to ascertain accuracy and credibility.
Check the integrity of submissions: Make sure all the papers in the backlog are run through a GPT detector. This should help identify authors who use AI to shape the core theories of their manuscripts. Also, use databases like Dimensions, Scopus, and Web of Science to detect fake or made-up citations, which are common occurrences in GPT-3 generated papers. AI often cites papers that do not exist or are unrelated to the topic.
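
The citation check in the last step could be sketched roughly as follows. This is a minimal illustration, not a production screening tool: `known_dois` is a stand-in for a lookup against a real bibliographic database, the function names are hypothetical, and the second and third references are invented purely to show what gets flagged.

```python
import re

# Loose shape check for a DOI: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def normalize_doi(raw):
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    doi = raw.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi.strip()

def flag_suspect_references(references, known_dois):
    """Return (title, reason) pairs for references that need a human look."""
    flagged = []
    for ref in references:
        doi = normalize_doi(ref.get("doi", ""))
        if not doi:
            flagged.append((ref["title"], "no DOI given"))
        elif not DOI_PATTERN.match(doi):
            flagged.append((ref["title"], "malformed DOI"))
        elif doi not in known_dois:
            flagged.append((ref["title"], "DOI not found in index"))
    return flagged

# Assumption: in practice this set would come from a database query.
known_dois = {"10.1007/s00259-023-06172-w"}

references = [
    {"title": "Real article", "doi": "https://doi.org/10.1007/s00259-023-06172-w"},
    {"title": "Invented study", "doi": "10.9999/made-up.2023.001"},  # fabricated on purpose
    {"title": "Untitled draft", "doi": ""},
]

flagged = flag_suspect_references(references, known_dois)
for title, reason in flagged:
    print(f"{title}: {reason}")
```

A flagged reference is only a prompt for an editor to check, since legitimate sources such as older books or reports may have no DOI at all.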

By following these steps, publishers will be better equipped to identify potential issues and establish policies that ensure the integrity of their publications.

Final thoughts

Given the pace of development, the role of AI tools in scientific research and communication will only increase in scope. The jury is still out on how good or bad the impact will be.

On the one hand, it could democratize research and knowledge. And on the other, it could worsen information overload and enable more people to take advantage of shortcomings in our educational systems, which often reward quantitative achievements.

Scholarly publishers and other stakeholders will need to carefully evaluate the impact of AI tools and take the necessary steps to ensure that their usage does not lead to fraudulent activities or unethical research practices.

Saikiran Chandha

Discussion

5 Thoughts on "Guest Post – GPT-3 Wrote an Entire Paper on Itself. Should Publishers be Concerned?"

Another tip: decide in advance whether you aim to copyright your work or use the materials for a patent application. The use of AI may complicate or even prohibit this. Especially when the use is undocumented.

By Emanuel Raymond
Apr 12, 2023, 7:47 AM

Dear all. Of course, science publishers in particular should be extremely concerned, because of two basic principles in publishing: first, responsibility has to rest on a human being because of copyright and other legal issues; second, and related to this, is plagiarism (which has been mentioned in the above article). The action taken by WAME is an appropriate one as it made it illegal to author/publish a paper using GPT. The papers are perfect in terms of format and grammar but rubbish in terms of content; GPT has a very, very long way to go.

By Najeeb Al-Shorbaji
Apr 12, 2023, 8:50 AM

I appreciate the timely article and the tips and resources provided as we enter this new age. I fed GPT-4 a short abstract and a few references on a physics topic and asked GPT to compose a full article based on that input. Then I had GPT-2 Output Detector and GPTZero evaluate the fully generated portions, and both tools reported that the content was fully human-generated. In fact, GPT-2 Output Detector scored the fully generated text to be 99.97% real. Then I asked ChatGPT directly whether the text it generated was human- or AI-written. The response was “The text is likely human written,” with a paragraph of rationale as to why.

By Scott Dineen
Apr 12, 2023, 9:36 AM

The recent comments from Alberts et al. [1] summarize my thoughts quite well. A couple of quotes:

“We wonder whether conferences might soon be flooded with AI-generated abstracts or whether predatory publishers [9] might be catalysed by the ability of ChatGPT to churn out convincing but ultimately unreliable content.”

“One can imagine a not-too distant future where AI might generate and review research [10], which could then be cited by other AI-generated research or commented upon via an AI-generated letter to the editor. Until recently, such a future might have sounded far-fetched. In light of the astonishing pace with which LLM have been implemented, we feel that the academic nuclear medicine community urgently needs to confront this issue.”

At the moment, I am perhaps more worried about the first issue than about the issues facing quality publishers. These models will presumably push the predatory journal business and the associated fake science to a whole new level. As we saw during the COVID-19 pandemic, these fake science outlets will likely be used to deceive the public with false machine-generated content about issues that are perceived to be controversial, such as, say, climate change. With things like ChatGPT, you can easily automate an entire empire of fake science and this way “flood the zone” with false information or outright garbage. What is more, such fake science will subsequently be incorporated into the training data of future language models, which will perhaps then cause a self-fulfilling prophecy of garbage and falsehoods.

All knowledge creation faces similar issues; Wikipedia already has huge problems dealing with them.

There are many further issues involved: these models presumably violate authors’ and publishers’ copyrights, they lack attributions to sources they are using, they lack references and even invent fake references to works that do not even exist, they can generate fake data, and so forth and so on.

Personally, I am in favor of the position Science took: a complete ban, and if use is detected, a retraction will follow.

As for SciSpace: I did not quite understand from the company’s online documentation the question about copyrights. It seems you have curated a lot of papers from several commercial publishers who forbid commercial use, redistribution, adaptations, and many other things (see e.g. [2]). So it seems you are involved in the same game as others; training your own language model based on other people’s work.

[1] https://link.springer.com/article/10.1007/s00259-023-06172-w
[2] https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license

By Jukka Ruohonen
Apr 22, 2023, 5:08 AM

I’ll add to my previous comment that these AI tools (a.k.a. paper mills) for academic “content creation” are popping up everywhere [1]. The marketing promises include things like creating “well-researched, original, and professional-grade papers in just a few clicks”. These tools also promise to create “plagiarism-free content”. I did not sign up for any of these, so I cannot evaluate how solid the claims are (except the claim about plagiarism, which is likely true). But I doubt whether their output is any better than the existing ones; in fact, they are probably integrated with the existing large-scale LLMs.

So at least they’re openly trying to monetize academia’s publish-or-perish obsession. Whether they succeed is another thing; I myself find it hard to believe that genuine academics would generally have a justified need for these tools.

But I’d expect many will try to submit AI generated manuscripts even to good journals. The same goes for students submitting their theses. I’d also expect further tools to emerge that allow submitting reviewers’ comments, “writing” a response letter, and generating the “corrections”, which will probably be as much nonsense as the original manuscript in question.

But interesting times ahead. If academia allows the scholarly record to be compromised with nonsense or inaccurate machine-generated content, the whole science endeavor will be in big trouble in the long run even though self-correction will eventually occur. Though, in a sense, I am not that worried because these tools cannot generate anything new.

But as said, the fake science business is another thing. As they are already publishing garbage in large batches of 50-100 papers [2], I’d expect that with LLMs they will escalate this to publishing batches of thousands or even tens of thousands of nonsense papers scattered across different brands and domains. Such volumes will make any “AI sleuthing” impossible in practice.

Then: if the claim by Alberts et al. that “99% of the internet could be produced by generative AI by 2025” holds, it likely follows that nothing science-related can be trusted on the open Internet. Likewise, all empirical studies based on Internet content will become pointless (unless you are studying disinformation or something similar). Such a scenario opens interesting questions about the business prospects of science publishers. It may be that the large publishing houses will benefit as they are likely to be able to maintain necessary quality controls. The same goes for media and newspapers.

Another thing Scholarly Kitchen might also ponder is the effects upon teaching. It seems that the age-old lobbying from ed-tech about “personalized learning” and “personal tutoring” is again part of the marketing claims [3]. As they openly admit the garbage their tools output, these kinds of claims are alarming to say the least. I admit that some of the use cases seem sensible, but suggestions like the generation of teaching material seem highly questionable [4]. I also doubt how a suggestion about finding links to online resources will work in practice in the future if the claim about the pollution of the Internet with inaccurate AI content holds. Most likely a teacher would get a link to content sponsored by some large corporation or, in the worst-case scenario, a link to some fake science or disinformation outlet pushed high in the rankings by some bad actor. Some universities even openly admit that the tools can “generate research proposals, summaries, and literature reviews” [5]. In contrast, I personally align with the viewpoints emphasizing the need to teach AI literacy and criticism of the systems, as has been emphasized by some teachers [6].

[1] https://buyaidomains.com/mythesis-ai/
[2] https://jamesheathers.medium.com/publication-laundering-95c4888afd21
[3] https://www.businessinsider.com/openai-chatgpt-ceo-sam-altman-responds-school-plagiarism-concerns-bans-2023-1?utm_source=reddit.com&r=US&IR=T
[4] https://www.latinhire.com/8-ways-to-use-chatgpt-to-enhance-your-teaching-experience/
[5] https://www.ohio.edu/center-teaching-learning/resources/chatgpt
[6] https://www.insidehighered.com/news/2023/01/12/academic-experts-offer-advice-chatgpt