The rapid development of natural language processing (NLP) and large language models (LLMs) has led to the emergence of many new dialogue agents, each tailored to specific needs and able to answer a wide range of queries. These agents range from AI-powered academic support to platforms providing financial, legal, or medical advice.
A research team from Hefei University of Technology and Hefei Comprehensive National Science Center has recently developed an AI-based platform that offers users useful, though non-professional, psychological support. They presented the dialogue system, named EmoAda, at the International Conference on Multimedia Modeling held in Amsterdam. Trained to engage in emotional conversations, EmoAda offers low-cost basic psychological support.
"Our paper originated from growing concerns about the increasing prevalence of mental illnesses such as depression and anxiety, especially after the COVID-19 pandemic, and the huge gap between the supply and demand of professional psychological services," said Xiaosun, one of the co-authors of the paper.
"This work builds upon various research efforts, such as Fei-Fei Li's study on measuring the severity of depression through verbal language and facial expressions, Xiaosun's research on multimodal attention networks for personality assessment, and the development of AI-based emotion support systems like Google's LaMDA and OpenAI's ChatGPT."
The primary goal of this research is to create a cost-effective psychological support system that can perceive users' emotions based on different inputs and generate personalized and insightful responses. This system is not intended to replace professional help but rather to alleviate stress and help users enhance their mental flexibility, a characteristic associated with good mental health.
"EmoAda is a multimodal emotional interaction and psychological adaptation system designed to provide psychological support to people lacking mental health services," explained Xiaosun. "It collects real-time multimodal data (audio, video, and text) from users, extracts emotional features, and uses multimodal large language models to analyze these features, enabling real-time emotion recognition, psychological feature analysis, and guidance strategy planning."
The EmoAda platform created by Xiaosun and his colleagues detects users' emotions by analyzing various sensory data, including their voice, facial video, and text. Based on this analysis, the system generates personalized emotional support dialogues and delivers them through text or digital avatars.
Depending on users' needs and the difficulties they mention, the platform may suggest various potentially beneficial activities. Some of these activities are available directly on the EmoAda platform, such as guided meditation exercises and music for relaxation or stress relief.
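The flow described above, in which per-modality emotion signals are combined into one estimate that then drives a suggested activity, can be illustrated with a minimal Python sketch. It is not EmoAda's implementation: the researchers describe using multimodal large language models, whereas this sketch swaps in a simple weighted average (late fusion) purely for illustration. The emotion labels, the `ModalityScores` type, the weights, and the `SUGGESTIONS` mapping are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical label set; the article does not list EmoAda's emotion categories.
EMOTIONS = ("neutral", "sad", "anxious", "happy")

# Hypothetical mapping from a detected emotion to a suggested activity.
SUGGESTIONS = {
    "anxious": "guided meditation",
    "sad": "relaxing music",
}


@dataclass
class ModalityScores:
    """Per-emotion scores from one modality (audio, video, or text).

    The scores are assumed to come from some upstream model that is
    not shown here.
    """
    scores: dict[str, float]
    weight: float = 1.0


def fuse(modalities: list[ModalityScores]) -> dict[str, float]:
    """Combine modalities with a weighted average (simple late fusion)."""
    total = sum(m.weight for m in modalities)
    if total <= 0:
        raise ValueError("need at least one modality with positive weight")
    return {
        emotion: sum(m.weight * m.scores.get(emotion, 0.0) for m in modalities) / total
        for emotion in EMOTIONS
    }


def plan_support(modalities: list[ModalityScores]) -> tuple[str, str]:
    """Return the dominant fused emotion and a suggested activity."""
    fused = fuse(modalities)
    emotion = max(fused, key=fused.get)
    return emotion, SUGGESTIONS.get(emotion, "open-ended conversation")


if __name__ == "__main__":
    audio = ModalityScores({"anxious": 0.6, "neutral": 0.4})
    video = ModalityScores({"anxious": 0.5, "sad": 0.5})
    text = ModalityScores({"anxious": 0.8, "neutral": 0.2})
    print(plan_support([audio, video, text]))
```

Averaging keeps the example short and lets a missing or unreliable modality be down-weighted; a real system would replace `fuse` with a learned multimodal model.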
"In real-user testing, EmoAda has proven to provide natural and humane psychological support," said Xiaosun. "In these experiments, we found that some users prefer conversing with AI because it significantly reduces their concerns about privacy breaches and social pressure. Conversing with AI creates a safe, non-judgmental environment where users can express their feelings and concerns without fear of judgment or misunderstanding. AI systems like EmoAda also provide round-the-clock support, without time limitations, which is a huge advantage for users who need help at any time."
In preliminary tests, the researchers found that one of the most appreciated aspects of EmoAda was its anonymity. Users often said they felt comfortable sharing personal information that they would find difficult to discuss face-to-face with others.
In the future, this new AI-based system could serve as a basic support service for those who cannot afford professional psychological care or are waiting to access existing mental health services. EmoAda could also inspire other research teams and pave the way for further AI-based mental health platforms.
"Our future research will focus on addressing the limitations of the current system, including optimizing multimodal emotional interaction with large language models to reduce the generation of misleading information, improving model robustness, reducing costs, and integrating a knowledge base of psychology experts to enhance the system's reliability and professionalism," added Xiaosun.