Recently, the much-anticipated EMO model was launched on the Tongyi APP, where it is now free for all users. The AI model's signature capability of making photos sing lets users easily turn the person in a still picture into one who sings or speaks, and it has attracted widespread attention.
Reportedly, users need only supply an audio clip and a photo for the EMO model to make Audrey Hepburn sing "Shang Chunshan," a terracotta warrior woman perform an English rap, or even Einstein tell jokes in Chinese. This new AI video-generation feature gives users a whole new creative experience.
In the Tongyi APP, the EMO product page "Quanmin Changyan" (roughly, "Everyone Sings and Acts") can be found in the "Quanmin Stage" channel. There, users choose from a variety of templates, including songs, popular catchphrases, and emojis, and upload a portrait photo of their own; the EMO model then quickly synthesizes a lively, entertaining video from the selected template and photo.
The Tongyi APP currently offers more than 80 EMO templates, spanning popular songs such as "Shang Chunshan" and "Wild Wolf Disco" as well as internet catchphrases like "Bowl Chicken" and "Hand Digging," giving users a wide range of choices. However, the app has not yet opened up custom audio uploads, so videos can only be generated from the preset audio clips.
The EMO model was developed by the Tongyi Laboratory, and its underlying Talking Head technology is a hot topic in today's AIGC field. Unlike earlier Talking Head approaches, EMO adopts a weak-control design: it drives the portrait's mouth movements directly from audio, with no 3D modeling of the face, head, or body in between. This innovation both lowers the cost of video generation and substantially improves the quality of the output.
The EMO model can also learn and encode human emotion. It accurately matches the audio content to the character's expressions and mouth movements, and carries the tone and emotional color of the audio through to the character's micro-expressions, making the generated videos more vivid and lifelike.
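To make the weak-control idea concrete, here is a minimal, purely illustrative Python sketch of an audio-driven portrait pipeline. Everything in it is an assumption for illustration: the function names, the toy energy-based audio features, and the simple frame modulation standing in for the model's actual diffusion step are not EMO's real implementation, which is described in the Tongyi Laboratory's paper.

```python
import numpy as np

def extract_audio_features(audio: np.ndarray, n_frames: int) -> np.ndarray:
    """Hypothetical stand-in for an audio encoder: frame-aligned features.
    Per-frame log energy serves as a toy proxy for prosody/emotion cues."""
    hop = len(audio) // n_frames
    frames = audio[: hop * n_frames].reshape(n_frames, hop)
    return np.log1p(np.abs(frames).mean(axis=1, keepdims=True))

def animate_portrait(photo: np.ndarray, audio_feats: np.ndarray) -> np.ndarray:
    """Weak-control sketch: each output frame is conditioned directly on the
    reference photo plus that time step's audio features. No 3D face, head,
    or body model sits in between, which is the point of weak control."""
    frames = []
    for f in audio_feats:
        # Placeholder for the generative (diffusion) step a real model runs;
        # here the photo is simply modulated so the sketch stays runnable.
        frames.append(photo * (1.0 + 0.1 * f))
    return np.stack(frames)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    photo = rng.random((64, 64, 3))        # reference portrait
    audio = rng.standard_normal(16000)     # 1 s of fake audio at 16 kHz
    feats = extract_audio_features(audio, n_frames=25)
    video = animate_portrait(photo, feats) # (25, 64, 64, 3) frame stack
    print(video.shape)
```

The structural point the sketch tries to capture is that the audio features condition each output frame directly, rather than first being translated into an explicit 3D intermediate representation.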
In late February of this year, the Tongyi Laboratory published a research paper on the EMO model, making it one of the most closely watched AI models since Sora. Now, with its launch on the Tongyi APP, everyone can try this cutting-edge model for free.
Looking ahead, as EMO technology continues to develop and mature, it is expected to see wide application in fields such as digital humans, digital education, film and television production, virtual companionship, and livestream e-commerce, bringing users more innovation and convenience.