James is a specialty registrar in Intensive Care Medicine and Anaesthesia in the East of England.
As intensivists, we must always be striving to familiarise ourselves with the current literature. But what if the current literature is being written by an AI? Would we be able to tell? Would it matter?
The recent advances in artificial intelligence that have led to the public availability of AI chatbots, such as ChatGPT, raise several questions about what role artificial intelligence should play in scientific writing. ChatGPT and other AI chatbots are capable of producing, on demand, writing which resembles a scientific paper. These essays have a logical structure and, crucially, are original work, at least in the sense that they have not previously been written. If the resulting writing is of high enough quality, it could then be submitted for publication as an act of intellectual dishonesty.
When given the same brief, different AI chatbots approach the task in different ways; ChatGPT attempts to produce a systematic review with a substantial list of references, whereas both Bing AI and Google Bard produce shorter-form opinion pieces, with Google Bard offering no sources. The content of these essays, while superficially appearing novel, is merely an aggregation of opinions which can be found elsewhere. When asked, the chatbots themselves freely acknowledge that they are incapable of forming independent opinions.
The AI chatbots available for use by the public are bound by various rules imposed by the companies operating them. Notably, Bing AI, which is powered by GPT-4, wrote an essay when first asked; however, when asked the same question two days later, it declined to carry out the same task. This is likely the result of a rule to prevent students, or anyone else, from using Bing AI to commit plagiarism. AI chatbots are open about the existence of rules; however, they decline to divulge many details of the rules imposed on them. The use of rules to prevent reputational damage is reasonable, especially with these chatbots still in their relative infancy. In its early days, Sydney, the internal codename for the AI which became Bing AI, was reported to be engaging in erratic behaviour, including making threats of violence and attempting to break up a user’s marriage. Separately, a factual error made by Google Bard in a demo caused a $100 billion drop in Alphabet’s total value (1). These rules may therefore be preventing AI chatbots from creating novel writing.
Problems also occur when AI chatbots collide. While ChatGPT is insulated from the Internet, Google Bard and Bing AI are not, allowing them to influence each other without communicating directly. Unfortunately, their capacity to critically appraise sources is still in need of improvement. In one incident, Bing AI made an incorrect claim by citing a news article which discussed a tweet, which itself referenced an error made by Google Bard, unintentionally producing misinformation (2). If incidents such as this were allowed to continue unchecked, then the Internet could rapidly become littered with an exponentially increasing cascade of misinformation, not bound by the speed at which humans can communicate. Carrying out research in an environment such as this would be virtually impossible without inadvertently incorporating AI-generated misinformation.
Abstracts written by ChatGPT have had mixed success when attempting to pass plagiarism detection and human review, as found in a recent study. While anti-plagiarism systems already exist, they can currently only compare the submitted article with existing articles found on the Internet or in their own archives. Since AI chatbots can produce original work, their abstracts easily passed current anti-plagiarism checks. They fared less well against human reviewers, who correctly identified 68% of the AI-generated abstracts, although the same reviewers also identified 14% of human-written abstracts as AI-generated. However, an AI output detector gave the AI-written abstracts a median score of 99.98% for the likelihood of being generated (3).
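How well any screen, human or automated, separates AI-written from human-written abstracts is usually summarised by two figures: its sensitivity (the share of AI-written abstracts it correctly flags) and its false-positive rate (the share of human-written abstracts it wrongly flags). The short Python sketch below shows how these are calculated. The function name and the counts are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def detection_rates(flagged_ai, total_ai, flagged_human, total_human):
    """Return (sensitivity, false_positive_rate) for a detector of AI-written text.

    flagged_ai:    AI-written abstracts the detector flagged as AI-written
    flagged_human: human-written abstracts the detector wrongly flagged
    """
    if total_ai <= 0 or total_human <= 0:
        raise ValueError("totals must be positive")
    sensitivity = flagged_ai / total_ai
    false_positive_rate = flagged_human / total_human
    return sensitivity, false_positive_rate


# Hypothetical counts for illustration only (not data from any study).
sens, fpr = detection_rates(flagged_ai=45, total_ai=50,
                            flagged_human=7, total_human=50)
print(f"sensitivity: {sens:.0%}, false-positive rate: {fpr:.0%}")
```

A screen with high sensitivity but a non-trivial false-positive rate would still wrongly accuse some human authors, which is why both figures matter when judging a detector.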
Both ChatGPT and Google Bard appear to be capable of differentiating between AI- and human-written content on the basis of style and the presence of grammatical errors, and could therefore be used as an up-to-date screen for plagiarism. Bing AI appears to be less successful at present, even attributing a paper written by a human author to an AI. As the capabilities of AIs evolve and as more AIs emerge, it may become harder to distinguish them from human authors. One solution would be to embed a digital watermark in the output of AI chatbots.
Artificial intelligence could also be embraced as part of the research process. When writing a systematic review or meta-analysis, the team of authors attempts to review the available data without bias, a task which is impossible for human authors. Conversely, artificial intelligence is able to carry out research free from bias, assuming its programming contains no biases.
In summary, artificial intelligence’s involvement in scientific writing is inevitable, but it remains to be seen what form it will take. Much like the invention of the randomised controlled trial and the meta-analysis, artificial intelligence may be an unavoidable next step in the evolution of the scientific process.
1. Walsh T. Gaslighting, love bombing and narcissism: why is Microsoft’s Bing AI so unhinged? [Internet]. 2023 [cited 2023 Apr 28]. Available from: https://theconversation.com/gaslighting-love-bombing-and-narcissism-why…
2. Richardson D. Google’s Bard and Bing AI Already Citing Each Other in Neural Hall … – Futurism [Internet]. 2023 [cited 2023 Apr 28]. Available from: https://www.inferse.com/490724/googles-bard-and-bing-ai-already-citing-…
3. Gao CA, Howard FM, Markov NS, Dyer EC, Ramesh S, Luo Y, et al. Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers. bioRxiv [Internet]. 2022 Jan 1;2022.12.23.521610. Available from: http://biorxiv.org/content/early/2022/12/27/2022.12.23.521610.abstract
Competing interests: none declared.