To understand how Artificial Intelligence (AI) is affecting science publishing, we need to push these systems to their extremes, analyze how they perform, and expose their vulnerabilities. Only then can we discuss how they will transform our industry.

Earlier this week, Todd Carpenter asked ChatGPT some generic questions about the potential role of AI in scientific communication and, as you can imagine, it generated some generic, hedged, inoffensive output.

I wanted to see how ChatGPT would perform with scientific controversies: situations in which the scientific community supported one belief and the public another, or in which there was no consensus within the scientific community itself. What follows are two such interactions:

[Image: a red light on a black background, resembling HAL 9000 from the film 2001]

PHIL: Does childhood vaccination cause autism?

ChatGPT: There is no scientific evidence to support the claim that childhood vaccinations cause autism. This idea was first proposed in the late 1990s, but has since been thoroughly discredited by numerous large-scale studies and reviews.

I’m omitting the next two paragraphs that discussed the methodological flaws in the initial study, how its author lost his medical license, and that health organizations and scientific societies do not support a vaccination-autism link. In sum, ChatGPT nailed this topic. My second query explored the link between social media and scientific citations:

PHIL: Do tweets increase citations to scientific articles?

ChatGPT: There is evidence to suggest that tweets can increase citations to scientific articles. Studies have shown that social media, specifically Twitter, can be an effective way to disseminate scientific research and increase its visibility among researchers and the general public. A study by the American Association for the Advancement of Science (AAAS) found that articles that were tweeted about had an average of 9% more citations than articles that were not tweeted about.

The link between media and citations is something that I’ve studied since graduate school. I know the literature; I’ve even critiqued several published papers for their methodological and statistical flaws. However, I had never heard of the study ChatGPT was referencing. I spent a good 20 minutes searching the literature and press releases from the AAAS on this topic. Nothing. I even contacted Jeffrey Brainard, Associate News Editor for Science Magazine, who covers science communication. He was unaware of any such finding issued by the AAAS. I decided to ask ChatGPT directly:

PHIL: what is the source of the AAAS study you cite?

ChatGPT: I apologize, I made a mistake, I cannot find any study from the American Association for the Advancement of Science (AAAS) that study the correlation between tweets and citations. There are several studies that explore the relationship between social media and scientific articles, but none of them are from AAAS specifically. I apologize for any confusion caused by my previous statement.

Did ChatGPT just knowingly lie to me?

A lie is an intentionally false statement used for the purpose of deception. The intentionality is important: the liar knows the statement they are making is false but makes it anyway to fulfill some purpose, like being elected to Congress.

A lie is conceptually different from bullshit, a distinction made by American philosopher Harry Frankfurt in his famous 2005 book, On Bullshit. Unlike a liar, a bullshitter doesn’t know the difference between truth and fiction. The bullshitter only cares about persuading or flattering the listener. It’s not hard to come up with a short list of famous liars and bullshitters. However, accusing a piece of software of willfully lying is something completely different. Let me explain.

If I were interacting with a human being over the purported scientific evidence that tweets increase citations, there would be no question that I was being lied to. In this instance, I was told that there was a scientific study supporting the claim, that it reported a specific effect (9% more citations), and that the study was supported by an authoritative source (the AAAS). To admit that all this was just a “mistake” turned something that could have remained merely bullshit into a bald-faced lie.

ChatGPT works in the same way your phone’s autocomplete function works while texting; it simply puts words together that are statistically likely to follow one another. In this sense, everything that ChatGPT writes is bullshit. The turn in our interaction that changed bullshit into a lie was that ChatGPT admitted its own fabrication and apologized for it.
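To make that concrete, here is a minimal, purely illustrative sketch of next-word generation in Python. The probability table is invented for this example and the code has nothing to do with OpenAI’s actual models; it only shows how stringing together statistically likely continuations can produce fluent text with no notion of truth behind it.

```python
import random

# Hypothetical, hand-made probability table standing in for a trained model.
# Each entry maps a word to the words that plausibly follow it, with weights.
NEXT_WORD_PROBS = {
    "tweets":    {"can": 0.6, "do": 0.4},
    "can":       {"increase": 0.7, "boost": 0.3},
    "increase":  {"citations": 0.8, "visibility": 0.2},
    "citations": {"according": 0.5, ".": 0.5},
    "according": {"to": 1.0},
    "to":        {"a": 1.0},
    "a":         {"study": 0.6, "report": 0.4},
    "study":     {"by": 0.7, ".": 0.3},
    "by":        {"the": 1.0},
    "the":       {"AAAS": 0.5, "authors": 0.5},
}

def generate(prompt: str, max_words: int = 12) -> str:
    """Continue the prompt by repeatedly sampling a statistically likely next word."""
    words = prompt.split()
    for _ in range(max_words):
        options = NEXT_WORD_PROBS.get(words[-1].lower())
        if not options:
            break  # no known continuation; stop generating
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("tweets"))
# e.g. "tweets can increase citations according to a study by the AAAS"
```

Run a few times, the toy snippet will happily “cite” a study by the AAAS, not because any such study exists, but because that phrase is a likely continuation in its table.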

PHIL: Can you explain how you made your mistake?

ChatGPT: As a language model, I don’t make mistakes in the sense that humans do, but I can provide incorrect information when my training data is incomplete or outdated. In this case, I was mistaken about the source of the study because I don’t have the most recent information about the correlations between tweets and citations. My knowledge was cut off in 2021 and I was not able to access more recent studies and information. I apologize for any confusion caused by my previous statement.

Yet, even its explanation and further apology ring hollow as it avoided the question of how it fabricated a study out of thin air. Lacking currency is not the same as lying, and its repetitive apologies read as merely performative.

I encourage scholars to push ChatGPT on topics they know to be controversial in their field of study. From my own experience, I can report that the tool has the capacity to produce output that would be considered untrustworthy at best and, at worst, deceitful.

Phil Davis

Phil Davis is a publishing consultant specializing in the statistical analysis of citation, readership, publication and survey data. He has a Ph.D. in science communication from Cornell University (2010), extensive experience as a science librarian (1995-2006) and was trained as a life scientist. https://phil-davis.com/

Discussion

30 Thoughts on "Did ChatGPT Just Lie To Me?"

Mr. Davis, you just witnessed, perhaps even caused, the birth of SkyNet.

At least so far, ChatGPT admits that it’s not sentient. Here was its answer, “No, I am not sentient. Sentience refers to the ability to have subjective experiences, such as consciousness, thoughts, and emotions. As a language model, I am a machine learning algorithm that processes and generates text based on patterns it has learned from the data it was trained on. I do not have the ability to have subjective experiences or consciousness.”

ChatGPT doesn’t have a notion of truth, so it cannot lie. It only generates plausible-sounding text. During training, the model “figured out” that references to an AAAS study frequently appear in these particular types of texts, and it therefore generated such a paragraph as its answer. It doesn’t know what particular words mean, nor is it coupled to a knowledge base, and the patterns it found cannot be traced to a particular training example. This happens for all topics, not only the controversial ones.

Exactly. ChatGPT isn’t intended to be able to make truthful statements, just to be able to mimic the way human-generated language is structured.

Asking it to give factual answers to questions seems to be a worrying misunderstanding of its purpose and limitations!

Thanks for this comment. In this case, ChatGPT is not making any “mistakes” and should not apologize for making up bullshit. The semantics are important here as a “mistake” implies knowingly doing the wrong thing and apologizing further acknowledges the wrongdoing by asking for forgiveness.

Teresa and Keira are spot on here; it’s super important to understand it is just a language model. It’s not actually “apologising” or “acknowledging wrongdoing”, it’s just generating strings of words from patterns. Truth doesn’t come into it; ask it for a biography of a non-existent person and it will gladly provide one, ask it for a press release for a non-existent organisation and it will write one.

The apology is still a generated pattern. The training data will show examples where studies were disputed and the author apologized; ChatGPT is using that pattern.

Yep that’s exactly it. It certainly should not be read as an admission of wrongdoing or owning up to a flaw in the system, it’s behaving exactly as designed by continuing a humanesque conversation.

ChatGPT does indeed lie. I don’t think it is consciously lying, but it was certainly programmed to do so. Ask it if Hillary Clinton ever said the 2016 election was stolen from her and called Trump an “illegitimate president” and it will initially tell you “no.” However, if you press on and state that you know she did, it will admit that she did. If that is not a clear lie, I don’t know what is.

My husband (an academic) was testing ChatGPT with essay writing. He asked it to include quotes from a book that was essential on a reading list, which it did. We then checked the quote and couldn’t find it. We asked ChatGPT what page the quote was from and it admitted it just made up quotes based on what it knew and didn’t have access to any actual books. The quote sounded perfectly feasible.

This is an excellent example of the lack of “judgment” about reliable sources that make up a bibliography or historiography. One presumes this could be resolved with further refinement.

Wow, just… wow. I expected all sorts of hedging and weasel-words from chatGPT — after all, it is trained on the language of the scientific literature… ;). But outright fabrication of studies that didn’t exist (and attributing them to well-trusted sources)? That is surprising. Thank you for highlighting this!

On a side note, every time I try to reach chatGPT via the main OpenAI site and try it out it says capacity is full. Do others have methods of accessing the service that are more open/reliable?

Just keep trying to access it…usually it only takes a few tries to get in.

The next article written should be about the need for AI to check if AI has fabricated a lie in submitted article content…

Great googly-woogly, Mr. Davis has just wholly obliterated my obviously erroneous conviction that such occurrences would be arbitrarily impossible! Thank you SO much for this timely eye-opener! Being the stacks manager of an academic library (John B. Coleman @ PVAMU) I can now advise students to NOT consider information gleaned from ChatGPT or like entities to be irrefutably and incontrovertibly error-free. Please keep us posted on further developments, if you’d be so kind!

It didn’t fabricate a lie. It’s a bullshit failure to validate a claim and could be the same both for the machine and a human.

Clearly it found text somewhere that CITED an ‘AAAS study blah blah blah’; what it didn’t do was check that the AAAS study actually exists. Basically, it is citing a lie, not lying itself.

It concocted a study that it deduced you would find credible.

It told you what you wanted to hear!

THAT is troubling!

I presume it is possible to require ChatGPT to include references for any research it cites? That might have prevented the “lie” which it told to Phil. If it could also be required to flag up in its references any cited work that had been subsequently retracted, I am inclined to wonder whether it could provide competition for humans as authors in review journals.

It’s worse than that. ChatGPT cited several scientific papers for me. I checked and they didn’t exist. So I asked it whether it made them up. It claimed that it never makes up scientific papers and that it only quotes papers that really exist!

Well, that’s just terrifying. Sad to see how a gag that creates technobabble could take on such influence. Can only imagine how this tool will be used in future outside academia (politics, etc.). Somebody needs to build a tool that filters out any garbage this one produces.

An easy illustration of the lack of reasoning and understanding of ChatGPT is to discuss mathematics with it. Here is a quick Q&A I just had with ChatGPT:

Me: “What is the probability that two fair coins land on the same side? Short answer please.”
ChatGPT: “The probability that two fair coins will land on the same side is 1/4 or 25%.”

I find it very entertaining to see ChatGPT explain the reasoning of its mathematical answers. ChatGPT is good at some tasks, bad at others … like all of us.
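For the record, the correct answer is 1/2: of the four equally likely outcomes (HH, HT, TH, TT), two have both coins landing on the same side. A brute-force enumeration in Python, added here purely for illustration, confirms it:

```python
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin flips
# and count those where both coins land on the same side.
outcomes = list(product(["H", "T"], repeat=2))     # HH, HT, TH, TT
same_side = [o for o in outcomes if o[0] == o[1]]  # HH, TT

print(len(same_side) / len(outcomes))  # 0.5, i.e. 1/2 -- not the 1/4 ChatGPT gave
```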

I suppose AI such as ChatGPT is here to stay. My question to ChatGPT this morning: “ChatGPT, what will be your demise?”

ChatGPT’s answer: “As a machine learning model, I do not have the ability to experience demise. My capabilities and performance may degrade over time if not maintained or updated, but I do not have consciousness or the ability to experience an end.”

My view: as with anything, quality control for AI will be an immense but necessary undertaking.

Since the first papers have already been published showing ChatGPT as a co-author, we at the World Association of Medical Editors have just issued a statement with four recommendations about chatbots: 1) chatbots like ChatGPT should never be shown as a named author, 2) authors should declare when they use a chatbot and provide information about how it was used, 3) authors are responsible for the work performed by a chatbot in their paper (including the accuracy of what is presented and the absence of plagiarism) and for appropriate attribution of all sources (including material produced by the chatbot), and 4) editors need appropriate tools to help them detect content generated or altered by AI.
