OpenAI launches CriticGPT, a model for identifying code errors in GPT outputs

2024-06-28

OpenAI recently announced the launch of CriticGPT, a tool aimed at significantly improving the efficiency and accuracy of code review with the help of artificial intelligence. The tool identifies errors in the output of AI models and helps correct them, so that AI systems follow developer intentions more accurately.


In traditional code review processes, developers typically rely on manual evaluation and correction of the output of large language models. However, OpenAI researchers believe that leveraging the capabilities of AI itself to assist in this process would be more efficient. To achieve this, they developed CriticGPT, an AI review tool built on the GPT-4 large language model.


CriticGPT possesses strong code analysis and error identification capabilities, helping human reviewers evaluate code generated by ChatGPT. During testing, CriticGPT's performance was remarkable, with its error identification ability even surpassing that of ordinary human code reviewers: in 63% of cases, human trainers preferred CriticGPT's critiques over human-written ones.


To achieve more efficient code review, OpenAI also developed the "Force Sampling Beam Search" technique. This technique enables CriticGPT to provide more detailed comments on AI-generated code and gives human trainers greater flexibility to adjust how thoroughly CriticGPT searches for errors. It also helps control the occasional hallucinations and false positives that CriticGPT produces, preserving the accuracy of the review.
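To make the idea concrete, here is a minimal, purely illustrative sketch of the trade-off Force Sampling Beam Search is described as controlling: the critic is forced to comment on particular code spans, several candidate critiques are sampled, and a tunable penalty balances thoroughness against the risk of hallucinated bugs. The actual technique is internal to OpenAI; every function, parameter, and score below is a hypothetical stand-in, not OpenAI's implementation.

```python
# Illustrative sketch only: all names here are hypothetical placeholders.
import random
from dataclasses import dataclass

@dataclass
class Critique:
    text: str
    num_claims: int        # how many distinct problems the critique raises
    reward_score: float    # score from a (hypothetical) critique reward model

def sample_forced_critique(code: str, forced_span: str) -> Critique:
    """Pretend to sample a critique that is forced to comment on `forced_span`."""
    num_claims = random.randint(1, 5)
    return Critique(
        text=f"Possible issue near `{forced_span}` ({num_claims} points raised)",
        num_claims=num_claims,
        reward_score=random.random(),
    )

def force_sampling_beam_search(code: str, spans: list[str],
                               samples_per_span: int = 4,
                               length_penalty: float = 0.2) -> Critique:
    """Pick the critique maximizing reward minus a penalty on claim count.

    A larger `length_penalty` makes the critic more conservative (fewer claims,
    fewer false positives); a smaller one makes it more thorough.
    """
    candidates = [
        sample_forced_critique(code, span)
        for span in spans
        for _ in range(samples_per_span)
    ]
    return max(candidates,
               key=lambda c: c.reward_score - length_penalty * c.num_claims)

if __name__ == "__main__":
    best = force_sampling_beam_search("def add(a, b): return a - b",
                                      spans=["return a - b"])
    print(best.text)
```

The single penalty parameter is the point of the sketch: turning one knob shifts the critic between "flag everything plausible" and "only flag what is almost certainly wrong", which is the flexibility the technique is said to give human trainers.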

To validate the practicality of CriticGPT, researchers applied it to ChatGPT training data that human reviewers had annotated as "flawless." CriticGPT nevertheless found errors in 24% of those samples, and the errors were subsequently confirmed by human reviewers, demonstrating CriticGPT's ability to catch subtle mistakes that humans may overlook.

Although CriticGPT has made significant progress in code review, OpenAI acknowledges that it still faces some challenges. First, since CriticGPT was trained on shorter responses from ChatGPT, its performance may be limited when handling longer and more complex tasks. In addition, CriticGPT is not infallible: it may still miss errors and occasionally produces hallucinations or false positives of its own.

However, OpenAI is confident in the future development of CriticGPT. The company plans to integrate CriticGPT into its reinforcement learning from human feedback (RLHF) process to further improve the efficiency and accuracy of code review. This means OpenAI's human trainers will be able to use CriticGPT as a generative AI assistant when reviewing AI outputs, driving continued advances in artificial intelligence technology.
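As a rough illustration of what such an integration could look like, the sketch below shows a critic model attaching a critique to each candidate response before a human trainer ranks them for RLHF. This is a conceptual outline only, under the assumption that critiques are used as annotation aids; the function names (get_model_responses, critic_review, human_rank) are hypothetical placeholders, not OpenAI APIs.

```python
# Conceptual sketch: a critic model assisting RLHF comparison annotation.
# All callables are hypothetical placeholders supplied by the caller.
from typing import Callable

def collect_comparisons(prompt: str,
                        get_model_responses: Callable[[str], list[str]],
                        critic_review: Callable[[str, str], str],
                        human_rank: Callable[[list[tuple[str, str]]], list[int]]):
    """Gather ranked comparisons for one prompt, with critiques as aids."""
    responses = get_model_responses(prompt)
    # Pair each candidate response with an AI-written critique so the human
    # trainer can spot subtle bugs more easily before ranking.
    annotated = [(resp, critic_review(prompt, resp)) for resp in responses]
    ranking = human_rank(annotated)          # indices ordered best to worst
    return [(prompt, responses[i]) for i in ranking]

if __name__ == "__main__":
    demo = collect_comparisons(
        "Write a function that reverses a string",
        get_model_responses=lambda p: ["def rev(s): return s[::-1]",
                                       "def rev(s): return s[::1]"],
        critic_review=lambda p, r: ("Slice step should be -1"
                                    if "[::1]" in r else "Looks correct"),
        human_rank=lambda pairs: [0, 1],   # trainer ranks the first response best
    )
    print(demo)
```

The ranked pairs produced this way would then feed reward-model training as in a standard RLHF pipeline, with the critic's role limited to helping the human spot problems rather than replacing the human judgment.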