This week, OpenAI launched its eagerly awaited advanced voice features for ChatGPT Plus and Team users, with Enterprise and Edu editions set to gain access next week. The update marks a major step forward in AI-powered voice interaction, offering ChatGPT users a more natural and responsive conversational experience.
The advanced voice features are built on GPT-4o, a natively multimodal model trained to process speech directly. This distinguishes them from standard voice mode, which relies on separate speech-to-text and text-to-speech models. Because the model hears the audio itself rather than a transcript, advanced voice enables smoother, more context-aware exchanges: it can pick up non-verbal cues such as speech rate and respond with an appropriate emotional tone.
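The difference between the two designs can be illustrated with a toy sketch. Everything here is hypothetical: `Utterance`, `speech_to_text`, `cascaded_view`, and `native_view` are names invented for illustration and say nothing about how OpenAI's systems are implemented. The only point is that a separate transcription step passes along the words and drops cues such as speech rate.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Toy stand-in for a spoken turn: the words plus one non-verbal cue."""
    words: str
    words_per_minute: float  # speech rate, used here as the example cue


def speech_to_text(utterance: Utterance) -> str:
    """Hypothetical transcription step: only the words survive."""
    return utterance.words


def cascaded_view(utterance: Utterance) -> dict:
    """What a text-only model receives after a separate speech-to-text step."""
    return {"words": speech_to_text(utterance)}


def native_view(utterance: Utterance) -> dict:
    """What a model consuming audio directly could, in principle, receive."""
    return {
        "words": utterance.words,
        "words_per_minute": utterance.words_per_minute,
    }


hurried = Utterance(words="I need help now", words_per_minute=210.0)
print(cascaded_view(hurried))  # speech rate is absent
print(native_view(hurried))    # speech rate is still available
```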
To start an advanced voice conversation, click the voice icon in the bottom-right corner of the screen. Here are the key features introduced by advanced voice:
- Five new voices join the existing four, giving users nine distinct personality options: Vale, Spruce, Arbor, Maple, Sol, Breeze, Cove, Ember, and Juniper.
- Accent recognition has been improved, enabling more accurate communication across various English dialects.
- The system now supports more than 50 languages.
- Custom instructions and memory functions can also be utilized in voice conversations, facilitating more personalized interactions.
While the new voice mode is highly engaging, its usage limits may fluctuate based on demand. OpenAI has not yet specified daily usage limits for Plus and Team users, but users will receive a notification when 15 minutes remain. Once the limit is reached, they can continue in standard voice mode.
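The limit behavior described above can be sketched as a small quota tracker. This is an illustration only, not OpenAI's implementation: the `VoiceQuota` class and the 60-minute figure in the usage lines are assumptions, since the actual daily limit has not been published. Only the 15-minute warning and the fallback to standard voice come from the announcement.

```python
from dataclasses import dataclass

WARNING_THRESHOLD_MINUTES = 15  # notification point stated in the announcement


@dataclass
class VoiceQuota:
    """Hypothetical tracker for advanced voice usage."""
    limit_minutes: float       # assumed value; no official figure is published
    used_minutes: float = 0.0

    @property
    def remaining_minutes(self) -> float:
        return max(self.limit_minutes - self.used_minutes, 0.0)

    def record(self, minutes: float) -> None:
        """Add minutes of advanced voice conversation to the running total."""
        self.used_minutes += minutes

    def should_warn(self) -> bool:
        """True once 15 minutes or fewer remain but the limit is not yet hit."""
        return 0 < self.remaining_minutes <= WARNING_THRESHOLD_MINUTES

    def mode(self) -> str:
        """Fall back to standard voice once the limit is reached."""
        return "advanced" if self.remaining_minutes > 0 else "standard"


quota = VoiceQuota(limit_minutes=60)  # 60 is a placeholder, not a real limit
quota.record(50)
print(quota.remaining_minutes, quota.should_warn(), quota.mode())
quota.record(10)
print(quota.remaining_minutes, quota.should_warn(), quota.mode())
```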
Advanced voice features are not yet available in several European countries, including EU member states, the UK, Switzerland, Iceland, Norway, and Liechtenstein.
OpenAI has set out specific privacy rules for voice interactions. Audio clips from conversations are stored alongside chat histories and retained for as long as the chat history exists. If you delete a chat, the audio will be removed within 30 days unless it must be kept for legal or security reasons. If you archive a chat, the audio will be preserved.
The company will not use audio snippets from voice chats to train its models unless you explicitly consent by selecting "Improve Voice for Everyone" in the data controls settings.
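The retention and training rules above can be summarized in a short sketch. The `Chat` class, its `legal_hold` flag, and both function names are hypothetical and exist only to restate the policy as described in this article. They do not correspond to any OpenAI API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

DELETION_WINDOW = timedelta(days=30)  # "removed within 30 days" of deletion


@dataclass
class Chat:
    """Hypothetical record of a voice chat's state."""
    deleted_on: Optional[date] = None
    archived: bool = False
    legal_hold: bool = False  # stands in for legal or security obligations


def audio_removal_deadline(chat: Chat) -> Optional[date]:
    """Latest date by which audio should be removed, or None if it is kept.

    Audio is kept while the chat exists (including when archived) and
    when a legal or security obligation applies.
    """
    if chat.deleted_on is None or chat.legal_hold:
        return None
    return chat.deleted_on + DELETION_WINDOW


def may_train_on_audio(improve_voice_for_everyone: bool) -> bool:
    """Audio is used for training only with the user's explicit opt-in."""
    return improve_voice_for_everyone


print(audio_removal_deadline(Chat(deleted_on=date(2024, 9, 1))))
print(audio_removal_deadline(Chat(archived=True)))
print(may_train_on_audio(False))
```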
Interacting with ChatGPT through advanced voice mode feels highly natural. By integrating multimodal understanding, the system can provide more contextually appropriate and emotionally rich responses. This development may pave the way for more sophisticated AI assistant applications across various fields, from customer service to education.
As AI technology continues to advance, voice interfaces like advanced voice mode are likely to become increasingly prevalent in our daily digital interactions, offering a more intuitive and convenient way to engage with artificial intelligence.