GPT-4 Turbo Regains Title of "Best AI Model" from Claude 3
OpenAI has been making frequent moves recently. Last week, it released the latest GPT-4 Turbo model to developers and paid ChatGPT subscribers. Upon release, the model drew widespread praise from users for its numerous improvements over its predecessor.
Since Thursday, the updated version of GPT-4 Turbo, known as gpt-4-turbo-2024-04-09, has reclaimed the top position in the Large Model Systems Organization (LMSYS) Chatbot Arena, a crowdsourced platform where users evaluate large language models (LLMs).
In the Chatbot Arena, users chat with two LLMs at once and compare the quality of their responses without knowing the models' identities. They can keep the conversation going as long as they like before casting a verdict: one model is better, the two are evenly matched, or neither meets their expectations.
These votes are ultimately used to rank the 82 LLMs on the Chatbot Arena leaderboard, which includes popular models such as Gemini Pro, the Claude 3 series, and Mistral-Large-2402.
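To give a feel for how pairwise blind votes can be turned into a leaderboard, here is a minimal Elo-style sketch. This is illustrative only: the model names, starting ratings, and K-factor are assumptions, and LMSYS's actual rating computation differs in its details.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Elo-predicted probability that the player rated r_a beats r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def record_battle(ratings: dict, winner: str, loser: str,
                  k: float = 32.0, draw: bool = False) -> None:
    """Update two models' ratings in place after one Arena vote."""
    ra, rb = ratings[winner], ratings[loser]
    ea = expected_score(ra, rb)          # expected score for 'winner'
    sa = 0.5 if draw else 1.0            # actual score: win = 1, tie = 0.5
    ratings[winner] = ra + k * (sa - ea)
    ratings[loser] = rb + k * ((1.0 - sa) - (1.0 - ea))

# Hypothetical models starting from equal ratings.
ratings = {"model_a": 1000.0, "model_b": 1000.0}
record_battle(ratings, "model_a", "model_b")   # model_a wins one battle
print(ratings)  # model_a now rated above model_b
```

Because each update is zero-sum, the total rating mass stays constant; a model climbs the leaderboard only by winning battles against others, weighted by how unexpected each win was.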
As of the latest update on April 13th, the new GPT-4 Turbo leads the overall, coding, and English categories of the Chatbot Arena. Although Anthropic's Claude 3 Opus briefly overtook GPT-4 Turbo a month ago, it now ranks second overall, while the older GPT-4 Turbo version, gpt-4-1106-preview, ranks third.
These results may be attributed to gpt-4-turbo-2024-04-09's significant improvements in coding, mathematics, logical reasoning, and writing. Across a series of benchmarks used to evaluate AI models, it has demonstrated exceptional proficiency.
If you want to compare gpt-4-turbo-2024-04-09 against other LLMs yourself, visit the Chatbot Arena website, click the "Arena (Side by Side)" option, and select the models you wish to compare. Note, however, that because the models' identities are visible in side-by-side mode, you cannot vote there. To vote and have your opinion counted toward the leaderboard, use the "Arena (Battle)" option, which pits random anonymous models against each other.
Of course, if you would rather skip the testing and use gpt-4-turbo-2024-04-09 directly in ChatGPT, you can subscribe to ChatGPT Plus for $20 per month.