OpenAI to Launch a Multimodal AI Digital Assistant

2024-05-13

OpenAI is reportedly showing some of its customers a new multimodal AI model that can both hold a conversation and recognize objects, according to a recent report by The Information. Citing anonymous sources, the outlet said this could be part of what the company plans to unveil on Monday.

The new model is said to interpret images and audio faster and more accurately than OpenAI's existing separate transcription and text-to-speech models. According to The Information, it could help customer service representatives in "better understanding the tone of callers or whether they are being sarcastic," and, in theory, it could also help students solve math problems or translate real-world signs. Sources familiar with the matter said the model outperforms GPT-4 Turbo in "answering certain types of questions," but it can still confidently give incorrect answers.

OpenAI may also be preparing a new ChatGPT feature for making phone calls, according to developer Ananay Arora. Arora posted screenshots of call-related code and found evidence that OpenAI has provisioned servers for real-time audio and video communication.

Even if these features are unveiled on Monday, none of them would be part of GPT-5. CEO Sam Altman has explicitly denied any connection between the forthcoming announcement and the model that is supposed to be "obviously superior" to GPT-4. The Information reported that GPT-5 may be publicly released before the end of the year. Altman also stated that the company will not announce a new AI search engine.

Still, if the reported features are indeed what OpenAI unveils, the announcement could overshadow Google's I/O developer conference. Google has been testing AI-powered phone calls, and one of its rumored projects is a multimodal Google Assistant alternative called "Pixie," which can use a device's camera to identify objects and then perform tasks such as giving directions to places where they can be bought or offering instructions for using them.

Whatever OpenAI plans to announce, it is scheduled to be revealed via live stream on the company's website on Monday at 10:00 AM Pacific Time / 1:00 PM Eastern Time.
