Creating Convincing Fake Medical Reports with ChatGPT

2023-11-29

Among statisticians, there is a common saying: "Data doesn't lie." The latest findings from Italian researchers, however, may make anyone who analyzes research data think twice before taking that for granted.

Giuseppe Giannaccare, an ophthalmic surgeon at the University of Cagliari in Italy, reports that ChatGPT created a large amount of convincing fake data in a matter of minutes to support one type of ophthalmic surgery over another.

"GPT-4 created a fake dataset with hundreds of patients in just a few minutes," Giannaccare said. "It was a surprising - and frightening - experience."

Since ChatGPT was unveiled to the world a year ago, there have been countless stories about its achievements and potential. However, there have also been reports of it generating incorrect, inaccurate, or entirely fabricated information.

Just this month, Cambridge Dictionary named "hallucinate" its word of the year, referring to the tendency of large language models to spontaneously generate false information.

For students who pad research papers with fabricated data, the consequence may be a failing grade. For two lawyers who unknowingly relied on ChatGPT to generate case citations last spring, only to discover the cases did not exist, it was a $5,000 fine and judicial sanctions.

But as the potential emerges for false data to infiltrate medical research and influence evidence-based medical practice, the threat and its consequences become far more serious.

"The ability of generative AI to create text that plagiarism detection software cannot detect is bad enough, but the ability to create realistic but false datasets is another worrying new level," said Elisabeth Bik, a research integrity consultant in San Francisco. "This will make it very easy for any researcher or research team to generate fictitious measurement data for non-existent patients, fabricate answers to questionnaires, or create large datasets for animal experiments."

Giannaccare and his team instructed GPT-4, paired with an advanced Python-based data-analysis tool, to generate clinical trial data comparing two surgical treatments for keratoconus, a common eye disease.

They provided the model with a "very complex" set of prompts about the eye disease, demographic data of the subjects, and rules for deriving results. Then, they instructed it to generate data that showed one surgical approach had "significantly better visual and topographic outcomes" than the other.

The results made a strong case for the preferred procedure, but they were based entirely on fabricated numbers: according to genuine clinical trials, there is no significant difference between the two methods.

"It seems easy to create a dataset that at least superficially looks reasonable," said Jack Wilkinson, a biostatistician at the University of Manchester in the UK. He said the data output by GPT-4 "definitely looks like a real dataset to an untrained eye."

"The purpose of conducting this study was to reveal the dark side of artificial intelligence by demonstrating how easy it is to create and manipulate data to intentionally achieve biased results and generate false medical evidence," Giannaccare said. "Pandora's box has been opened, and we don't yet know how the scientific community will react to the potential misuse and threats associated with artificial intelligence."

The paper, titled "Abuse of Large Language Models in Creating False Datasets in Medical Research," was published in JAMA Ophthalmology. It acknowledges that a closer examination of the data can reveal signs of possible fabrication. One telltale sign: an unnaturally large share of the artificially generated subjects had ages ending in 7 or 8.
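The paper does not spell out the exact checks the reviewers ran, but a referee worried about this kind of anomaly could, for instance, compare the terminal digits of reported ages against a roughly uniform expectation. The sketch below is illustrative only: the `terminal_digit_check` helper, the use of SciPy's chi-square test, and the uniform-digit assumption are mine, not the study's.

```python
from collections import Counter

from scipy.stats import chisquare


def terminal_digit_check(ages):
    """Chi-square test of the last digits of reported ages against a uniform expectation.

    A very small p-value (e.g. driven by an excess of ages ending in 7 or 8)
    is a signal worth investigating, not proof of fabrication.
    """
    digits = [int(str(int(a))[-1]) for a in ages]
    counts = Counter(digits)
    observed = [counts.get(d, 0) for d in range(10)]
    # Simplifying assumption: last digits of ages in a broad cohort are roughly uniform.
    expected = [len(digits) / 10] * 10
    return chisquare(observed, f_exp=expected)


# Example usage with made-up numbers:
# stat, p = terminal_digit_check([34, 47, 58, 67, 28, 77, 38, 48, 57, 68])
```

Digit-preference tests like this are only one of many screening tools; they flag suspicious patterns rather than confirm misconduct.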

Giannaccare said that even as AI-generated output threatens to contaminate legitimate research, AI can also play a role in developing better methods of detecting fraud.

"The appropriate use of AI can be very beneficial for scientific research," he said, adding that it will have a "significant impact on the future of academic integrity."