OpenAI launches GPT-4o and ChatGPT desktop version, accelerating language processing and introducing audio functionality

2024-05-14


OpenAI announced last night the launch of its new AI model GPT-4o and the desktop version of ChatGPT, along with an update to the user interface, in a further move to expand the usage of its popular chatbot.

During a live event, Mira Murati, the Technical Director, stated that the new model GPT-4o brings the advanced technology of GPT-4 to all users, including OpenAI's free users. She further pointed out that GPT-4o is "faster" and has improved text, video, and audio processing capabilities. OpenAI revealed that its long-term plan will include allowing users to have video chats with ChatGPT.

Murati emphasized, "This is a significant breakthrough in terms of usability for us."

OpenAI, backed by Microsoft, has been valued at over $80 billion by investors. The company, founded in 2015, is facing pressure to maintain its leading position in the generative AI market while exploring revenue streams to offset its significant investments in processors and infrastructure.

The "o" in GPT-4o stands for "omnipotent." Murati stated that the new model enables ChatGPT to handle 50 different languages with higher speed and quality. It will be available through OpenAI's API, allowing developers to immediately start utilizing the new model to build applications.

She added that GPT-4o is twice as fast as GPT-4 Turbo, while costing only half as much.

The OpenAI team demonstrated the audio capabilities of the new model, such as helping users relax before public speeches. Researcher Mark Chen stated that the model can "perceive the user's emotions" and has the ability to handle user interruptions. The team also asked the model to analyze the user's facial expressions and assess their possible emotional states.

When interacting with ChatGPT in audio mode, it will say, "Hey, what's up? How can I make your day better today?"

The company plans to test the audio mode in the coming weeks, and according to a blog post, ChatGPT Plus subscribers will have early access. OpenAI claims that the new model can respond to user audio prompts in as little as 232 milliseconds, with an average of 320 milliseconds, which is comparable to response times in human conversations.

Chen also demonstrated the model's ability to tell bedtime stories and showcased the effects of changing the tone of voice to make it more dramatic or robotic. He even asked the model to sing the story.

In addition, OpenAI stated that the new model can be used as a translator in audio mode. Chen demonstrated the tool's ability to provide real-time translation between Murati's Italian and English conversations.

The team members also showcased the model's ability to solve math equations and write code, positioning it as a powerful competitor to Microsoft's GitHub Copilot.

This release is one of OpenAI's biggest announcements since the launch of ChatGPT Enterprise in August. Brad Lightcap, the Chief Operating Officer of OpenAI, told CNBC at the time that the development of the tool took "less than a year" and received support from over 20 companies of various sizes and industries.

OpenAI, Microsoft, and Google are currently in the midst of a gold rush in generative AI, as companies across industries are racing to integrate AI-driven chatbots and agents into critical services to avoid being left behind by competitors. Earlier this month, OpenAI's competitor, Anthropic, announced its first enterprise product and a free iPhone app.

According to data from PitchBook, nearly 700 generative AI deals were invested in a record-breaking $29.1 billion in 2023, an increase of over 260% from the previous year. The market is expected to generate revenue of over $1 trillion within the next decade.

However, some in the industry have expressed concerns about new services entering the market without sufficient testing, while scholars and ethicists are concerned about the potential exacerbation of biases by the technology.

Since its launch in November 2022, ChatGPT has broken records and become one of the fastest-growing consumer applications, with approximately 100 million active users per week. OpenAI states that over 92% of Fortune 500 companies are using its platform.

Murati stated during Monday's event that OpenAI aims to "demystify some of the technology." She further revealed, "In the next few weeks, we'll be rolling these features out to everyone."

According to Monday's blog post, the new model will first be rolled out to ChatGPT Plus and Team customers on Tuesday, followed by enterprise users. It will also be made available to ChatGPT's free users starting from Monday, but with usage restrictions. ChatGPT Plus users will have five times the message capacity compared to free users, while ChatGPT Team and Enterprise customers will have even larger usage limits.

At the end of the live event, Murati expressed gratitude to Jensen Huang, the CEO of Nvidia, and his team for providing the necessary graphics processing units (GPUs) to support OpenAI's technology. She stated, "I want to thank the hard work of the OpenAI team and Jensen and the Nvidia team for bringing us the state-of-the-art GPUs that made today's demo possible."