As AI grows increasingly lifelike, people have begun to question whether it has emotions, and even whether it can manipulate theirs.
In a recent interview, Hume AI's founder and CEO, Alan Cowen, said, "It's an AI with no emotions of its own, but it may make you feel as if it has them."
Entering the era of empathetic AI
"I think understanding people's emotional reactions is the key to truly learning how to satisfy people's preferences," Cowen said, introducing the world's first empathetic AI, EVI.
Cowen said, "If you're confused, it can clarify for you; if you're excited, it can enhance that excitement; if you're feeling down, it can comfort you."
Much of this comes down to user experience and how AI systems interact with their users. "In customer service calls, we can predict whether someone's call is pleasant... Sometimes, based on context, the prediction's accuracy can be as high as 99%, while based solely on language it is about 80%," Cowen added.
Hume AI, a research lab and technology company, was founded in 2021 by Cowen, a former researcher at the University of California, Berkeley, and at Google. Its mission is to ensure that AI serves human goals and emotional well-being.
Cowen believes voice interfaces will soon become the default way we interact with AI: voice is four times faster than typing, it frees the eyes and hands, and it carries additional information in tone, rhythm, and timbre.
"That's why we developed the first AI with emotional intelligence to understand the meaning behind speech. Based on your voice, it can better predict when to speak, what to say, and how to say it," he added.
Recently, the company raised $50 million in Series B funding from EQT Group, Union Square Ventures, Nat Friedman, Daniel Gross, Northwell Holdings, Comcast Ventures, LG Technology Ventures, and Metaplanet.
EVI API is finally here!
The company recently released the Empathic Voice Interface (EVI) API, the first API for emotionally intelligent voice AI. EVI is now live: it accepts real-time audio input and returns generated audio along with transcribed text annotated with vocal expression measures.
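As a rough sketch of how a client might surface those expression measures, the following parses a transcription message and ranks its scores. The message shape used here (a `models` → `prosody` → `scores` mapping of expression name to score) and the field names are illustrative assumptions, not Hume's documented schema:

```python
import json

def top_expressions(message_json: str, k: int = 3) -> list[tuple[str, float]]:
    """Return the k highest-scoring vocal expression measures from a
    transcription message. The nested 'models' -> 'prosody' -> 'scores'
    layout is an assumption for illustration only."""
    message = json.loads(message_json)
    scores = message.get("models", {}).get("prosody", {}).get("scores", {})
    # Sort expression names by score, highest first, and keep the top k.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

# A hypothetical transcription payload with expression scores attached.
payload = json.dumps({
    "type": "user_message",
    "text": "I finally got the job!",
    "models": {"prosody": {"scores": {
        "Excitement": 0.91, "Joy": 0.84, "Calmness": 0.12,
    }}},
})
print(top_expressions(payload, k=2))
# → [('Excitement', 0.91), ('Joy', 0.84)]
```

A client could use a ranking like this to decide, for example, whether to respond in a soothing or an enthusiastic register.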
EVI launches with features built on 100,000 conversations (averaging 10 minutes each) and 3 million user messages, including judging the right moment to speak and generating empathetic language with an appropriate tone.
The team stated that EVI can be configured to customer needs, with adjustable personality, response style, and voice content. The platform also supports Fireworks' Mixtral 8x7B, as well as OpenAI and Anthropic models.
In addition, developers can connect their own text generation server over a WebSocket to control every EVI message in a conversation, or use EVI's voice by sending text to the API to be read aloud.
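A minimal sketch of that text-to-speech path might look like the following. The endpoint URL, the `"assistant_input"` message type, and the auth header name are assumptions for illustration, not Hume's documented API; consult the official reference before building on this:

```python
import json

# Hypothetical endpoint; check Hume's API reference for the real URL.
EVI_URL = "wss://api.hume.ai/v0/evi/chat"

def make_tts_message(text: str) -> str:
    """Build a message asking EVI to read `text` aloud in its own voice.
    The "assistant_input" message type is an illustrative assumption."""
    return json.dumps({"type": "assistant_input", "text": text})

async def speak(text: str, api_key: str) -> None:
    """Send text over the socket and await one reply (audio or status)."""
    import websockets  # third-party: pip install websockets

    # The auth header name below is also an assumption.
    async with websockets.connect(
        EVI_URL, extra_headers={"X-Hume-Api-Key": api_key}
    ) as ws:
        await ws.send(make_tts_message(text))
        reply = json.loads(await ws.recv())
        print(reply.get("type"))

print(make_tts_message("Hello from EVI!"))
# To run for real: asyncio.run(speak("Hello from EVI!", api_key="..."))
```

The same socket would carry the developer's own generated text when a custom text generation server is plugged in, with EVI handling only the voicing.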
Cowen said, "The strength of our AI lies in empowering others through its toolkit. Our API is the key; it allows users to customize their experience and integrate basic tools like web search. This is about achieving customization and promoting collaboration, allowing developers to build on our interface and incorporate user personalization settings."
What will happen next?
Many experts believe that emotionally intelligent AI systems are the way forward, and Hume AI is well positioned to reshape how users interact with AI systems.
When it comes to achieving seamless multimodal interaction, Cowen said, "In the future, you will want to have conversations with AI in crowded places, while also wanting it to understand not only your facial expressions but also your tone of voice, so it knows when you're finished speaking and how you feel."
Furthermore, he emphasized the importance of personalization in AI communication tools to make them more adaptive and human-like. This is crucial for applications where AI interacts directly with users, such as customer service, therapy, or educational tools.
"I think personalized voice is very important, as well as personality, and many of these can certainly be achieved through prompts; however, you can't change the underlying accent and sound quality of the voice, so we are also adding more voices," Cowen concluded.