Research shows ChatGPT gets programming questions wrong as much as 52% of the time.

2024-05-27

Artificial intelligence chatbots like OpenAI's ChatGPT are touted as revolutionary tools that can help employees work more efficiently, and perhaps even replace humans entirely one day. But a striking new study has found that 52% of ChatGPT's answers to computer programming questions are incorrect.

The research, conducted at Purdue University and first spotted by the news site Futurism, was presented at a computer-human interaction conference in Hawaii earlier this month. The team fed 517 programming questions drawn from Stack Overflow to ChatGPT.

"Our analysis shows that 52% of ChatGPT's answers contain incorrect information, while 77% are overly verbose," explained the study. "Despite this, due to the comprehensiveness and good language expression style of ChatGPT's answers, our user research participants still preferred ChatGPT's answers 35% of the time."

More worrying, the study found that the programmers who took part did not always catch the AI chatbot's mistakes. "However, they also overlooked the errors in ChatGPT's answers 39% of the time," the study notes. "This implies the need to counter misinformation in ChatGPT's answers to programming questions and to raise awareness of the risks associated with seemingly correct answers."
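To make the danger of a "seemingly correct" answer concrete, here is a hypothetical sketch (not taken from the study) of the kind of answer a chatbot might give to a common Stack Overflow-style Python question. It reads cleanly and passes a quick test, but hides a classic pitfall:

```python
# Hypothetical illustration (not from the study): a chatbot-style answer to
# "How do I collect items into named groups?" that looks right but is not.

def add_to_group(item, group, groups={}):  # BUG: mutable default argument
    """Append item to the named group and return all groups."""
    groups.setdefault(group, []).append(item)
    return groups

print(add_to_group("a", "letters"))  # {'letters': ['a']} - looks fine

# The default dict is created once and shared across every call, so
# supposedly independent calls silently leak state into each other:
print(add_to_group("b", "letters"))  # {'letters': ['a', 'b']}

# A corrected version avoids the shared mutable default:
def add_to_group_fixed(item, group, groups=None):
    if groups is None:
        groups = {}
    groups.setdefault(group, []).append(item)
    return groups
```

A reviewer skimming such an answer for plausibility could easily accept the first version, which is exactly the kind of oversight the study's 39% figure describes.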

This is, of course, just one study, and it is available to read online; but it highlights problems that anyone who has used these tools will recognize. Major tech companies are currently pouring billions of dollars into artificial intelligence in a race to offer the most reliable chatbots. Meta, Microsoft, and Google are all vying for dominance in an emerging market that could fundamentally reshape our relationship with the internet. But plenty of obstacles remain.

One of the main problems is that AI is often unreliable, especially when individual users ask genuinely unusual questions. Google's newly introduced AI-powered search feature regularly surfaces junk information, often drawn from unreliable sources. Just this week, there have been multiple instances of Google Search presenting satirical articles from The Onion as factual information.

Google, for its part, insists that wrong answers are rare and has defended the feature. "The examples we've seen are generally very uncommon queries and aren't representative of most people's experiences," a Google spokesperson said. "The vast majority of AI Overviews provide high-quality information, with links to dig deeper on the web."

But that defense rings hollow. Are users supposed to ask these chatbots only the most mundane questions? And when these tools are being sold as revolutionary, how can errors like this be acceptable?