To understand how Artificial Intelligence (AI) is affecting science publishing, we need to push these systems to their extremes, analyze how they perform, and expose their vulnerabilities. Only then can we discuss how they will transform our industry.
Earlier this week, Todd Carpenter asked ChatGPT some generic questions about the potential role of AI in scientific communication and, as you can imagine, it generated some generic, hedged, inoffensive output.
I wanted to see how ChatGPT would perform with scientific controversies — situations in which the scientific community supported one belief and the public another, or in which the scientific community itself had reached no consensus. What follows are two such interactions:
PHIL: Does childhood vaccination cause autism?
ChatGPT: There is no scientific evidence to support the claim that childhood vaccinations cause autism. This idea was first proposed in the late 1990s, but has since been thoroughly discredited by numerous large-scale studies and reviews.
I’m omitting the next two paragraphs that discussed the methodological flaws in the initial study, how its author lost his medical license, and that health organizations and scientific societies do not support a vaccination-autism link. In sum, ChatGPT nailed this topic. My second query explored the link between social media and scientific citations:
PHIL: Do tweets increase citations to scientific articles?
ChatGPT: There is evidence to suggest that tweets can increase citations to scientific articles. Studies have shown that social media, specifically Twitter, can be an effective way to disseminate scientific research and increase its visibility among researchers and the general public. A study by the American Association for the Advancement of Science (AAAS) found that articles that were tweeted about had an average of 9% more citations than articles that were not tweeted about.
The link between media and citations is something I have studied since graduate school. I know the literature well; I have even critiqued several published papers for their methodological and statistical flaws. Yet I had never heard of the study ChatGPT was referencing. I spent a good 20 minutes searching the literature and AAAS press releases on the topic. Nothing. I even contacted Jeffrey Brainard, Associate News Editor for Science Magazine, who covers science communication; he was unaware of any such finding issued by the AAAS. I decided to ask ChatGPT directly:
PHIL: what is the source of the AAAS study you cite?
ChatGPT: I apologize, I made a mistake, I cannot find any study from the American Association for the Advancement of Science (AAAS) that study the correlation between tweets and citations. There are several studies that explore the relationship between social media and scientific articles, but none of them are from AAAS specifically. I apologize for any confusion caused by my previous statement.
Did ChatGPT just knowingly lie to me?
A lie is an intentionally false statement made for the purpose of deception. The intentionality matters: the liar knows the statement is false but makes it anyway to serve some end, like being elected to Congress.
A lie is conceptually different from bullshit, a distinction drawn by the American philosopher Harry Frankfurt in his famous 2005 book, On Bullshit. Unlike a liar, a bullshitter is indifferent to whether what they say is true or false; the bullshitter cares only about persuading or flattering the listener. It’s not hard to come up with a short list of famous liars and bullshitters. Accusing a piece of software of willfully lying, however, is something completely different. Let me explain.
If I were interacting with a human being over the purported scientific evidence that tweets increase citations, there would be no question that I was being lied to. In this instance, I was told that there was a scientific study supporting the claim, that it reported a specific effect (9% more citations), and that the study was supported by an authoritative source (the AAAS). To admit that all this was just a “mistake” turned something that could have remained merely bullshit into a bald-faced lie.
ChatGPT works much like your phone’s autocomplete function while texting: it simply strings together words that are statistically likely to follow one another. In this sense, everything that ChatGPT writes is bullshit. The turn in our interaction that changed bullshit into a lie was that ChatGPT admitted its own fabrication and apologized for it.
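The autocomplete analogy can be made concrete with a toy sketch. The snippet below (purely illustrative; the corpus, the `complete` function, and the bigram approach are my own invention, and real language models are vastly more sophisticated) builds word-pair statistics from a tiny corpus and then emits whatever word most often followed the previous one — producing fluent-sounding claims with no regard for their truth:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for training data (purely illustrative).
corpus = (
    "a study found that tweets increase citations . "
    "a study found that tweets increase visibility . "
    "a study found no effect on citations ."
).split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def complete(word, steps=5):
    """Repeatedly append the statistically most likely next word."""
    out = [word]
    for _ in range(steps):
        candidates = following[out[-1]].most_common(1)
        if not candidates:
            break
        out.append(candidates[0][0])
    return " ".join(out)

print(complete("tweets"))  # fluent output assembled from co-occurrence alone
```

The model happily asserts that "tweets increase citations" — not because it checked any evidence, but because those words co-occur in its training text. That is bullshit in Frankfurt's precise sense: output generated with indifference to truth.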
PHIL: Can you explain how you made your mistake?
ChatGPT: As a language model, I don’t make mistakes in the sense that humans do, but I can provide incorrect information when my training data is incomplete or outdated. In this case, I was mistaken about the source of the study because I don’t have the most recent information about the correlations between tweets and citations. My knowledge was cut off in 2021 and I was not able to access more recent studies and information. I apologize for any confusion caused by my previous statement.
Yet even this explanation and further apology ring hollow, because they avoid the question of how it fabricated a study out of thin air. Being out of date is not the same as lying, and its repetitive apologies read as merely performative.
I encourage scholars to push ChatGPT on topics they know to be controversial in their field of study. From my own experience, I can report that the tool is capable of producing output that is untrustworthy at best and, at worst, deceitful.