Anthropic Launches Vulnerability Rewards Program to Enhance AI Security Measures

2024-08-12

Anthropic, the artificial intelligence company behind the Claude chatbot, is launching a new initiative to strengthen the safety of its AI models. The AI research lab is expanding its bug bounty program, offering rewards of up to $15,000 for identifying universal jailbreak vulnerabilities in its upcoming safety mitigation system.

The program aims to uncover vulnerabilities that could bypass AI safety guardrails in a range of high-risk areas, including chemical, biological, radiological, and nuclear (CBRN) threats as well as cybersecurity. It is one of several measures Anthropic is taking to harden its AI models against potential misuse.

Michael Sellitto, head of global affairs at Anthropic, emphasizes the complexity of securing AI systems: "The attack surface is, to some extent, infinite. Without safeguards, you can put anything as input into the model, and the model can generate virtually any output." This underscores the focus of the new initiative, which targets repeatable, broadly applicable vulnerabilities rather than isolated flaws. Universal jailbreaks are particularly concerning because they can defeat safety measures across many scenarios at once, opening the door to serious misuse of AI technology.

The expanded bug bounty program will initially run in partnership with HackerOne on an invitation-only basis, though the company plans to involve more participants over time. Participants will have early access to test Anthropic's latest safety mitigation system before it is publicly released.

This initiative aligns with commitments Anthropic and other AI companies have made to develop AI responsibly, including the voluntary AI commitments announced by the White House and the G7's code of conduct for organizations developing advanced AI systems.

Experienced AI security researchers and those skilled at identifying jailbreak vulnerabilities in language models can apply for an invitation through Anthropic's application form by August 16. The company plans to notify selected applicants in the fall and intends to open the program more widely in the future.

As AI capabilities continue to advance rapidly, Anthropic's expanded bug bounty program represents an important effort to ensure that safety measures keep pace with technological progress.