Tavus Launches AI Model Family for Real-time Facial Interaction

2025-03-07

Tavus is an artificial intelligence research startup that develops real-time AI models designed to simulate the experience of conversing with another person. Today, the company announced the launch of a family of new AI models.

The company says it is building a human-computer interaction operating system, called the "Conversational Video Interface," which will let AI perceive, interpret, and respond as naturally as a person on a Zoom or FaceTime call. Tavus's mission is to make AI understand facial expressions, tone of voice, and body language and interpret their meaning, while responding with expressions and tone of its own.

"Humans are evolutionarily designed for face-to-face communication. So, we want to teach machines how to achieve this," CEO Hasan Raza told SiliconANGLE in an interview. "If we believe in a sci-fi future with AI colleagues, friends, and assistants, we need to build the interfaces to make that happen."

The products released today include three models: Phoenix-3, the first full-face AI rendering model capable of conveying subtle expressions; Raven-0, a breakthrough AI perception model that observes and reasons like humans; and Sparrow-0, a state-of-the-art turn-taking dialogue model that adds "a spark of life" to conversations.

Phoenix-3 is the company's flagship foundation model, aimed at creating "digital twins": highly realistic, AI-driven representations of individuals, as Raza explained. Now in its third iteration, it provides full-face animation, cloning an individual and accurately rendering every facial muscle, which is crucial for mimicking subtle expressions. He noted that most commercial facial animation models cannot animate the full face, producing mismatches between the upper and lower halves that break immersion.

"Phoenix-3 is a full-face expression model with emotion control functionality, the first to achieve this without requiring extensive data," said Raza.

Most importantly, Phoenix-3’s high fidelity and facial muscle control allow it to accurately simulate "micro-expressions." These are fleeting, involuntary facial expressions that result from emotional reactions. By incorporating this feature, the model creates a vivid video experience that is more emotionally expressive and lifelike than simple animated faces.

To enable Phoenix-3 to respond like humans, Raven-0 grants AI the ability to observe and interpret what’s happening within a scene. Instead of capturing single snapshots, it continuously observes and understands the context of video events. This includes recognizing users' emotions and detecting changes in their environment.

For instance, an AI tutor can identify when students appear confused or frustrated by monitoring their expressions and adjust explanations accordingly. Similarly, a support assistant can observe how customers interact with a product and provide guidance on resolving any issues.

Sparrow-0 addresses a class of mistakes AI commonly makes in conversation, Raza said. Natural conversation is fluid: participants take turns, each waiting for the other to finish speaking before responding.

AI, however, sometimes interjects too quickly, at times while the other person is still speaking. This happens because AI models respond faster than humans, and developers work hard to reduce latency, the time an AI model takes to respond. But when the AI replies too quickly, the exchange feels unnatural.

The Sparrow model aims to make conversations feel natural by understanding the rhythm of speech so it knows when to pause, speak, or listen. Rather than reacting to filler words like "uh" or waiting for long silences, it adjusts based on tone, rhythm, and context.

"If it’s very certain you're having a fast, friendly conversation, it’ll respond quickly," Raza explained. "But if you say, ‘Hey, let me think,’ the AI gives you space. This makes the conversation feel much more natural."
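The turn-taking behavior described here can be sketched as a simple decision rule combining pause length, filler-word cues, and speaking pace. The function, thresholds, and phrase lists below are illustrative assumptions, not Tavus's actual model:

```python
# Hypothetical sketch of turn-taking heuristics like those described for
# Sparrow-0. Thresholds and word lists are illustrative assumptions.
from dataclasses import dataclass

FILLERS = {"uh", "um", "hmm", "er"}
HOLD_PHRASES = {"let me think", "give me a second", "hold on"}


@dataclass
class SpeechState:
    last_words: str         # most recent transcribed words from the user
    silence_ms: int         # how long the user has been silent
    speech_rate_wpm: float  # recent speaking pace, in words per minute


def should_respond(state: SpeechState) -> bool:
    """Decide whether the agent should take its turn now."""
    text = state.last_words.lower().strip()
    # An explicit "thinking" cue yields the floor regardless of silence.
    if any(phrase in text for phrase in HOLD_PHRASES):
        return False
    # A trailing filler word signals the speaker isn't finished.
    words = text.split()
    if words and words[-1] in FILLERS:
        return False
    # Fast, fluent speech -> respond after a short pause;
    # slower speech -> wait longer before jumping in.
    threshold_ms = 300 if state.speech_rate_wpm > 160 else 900
    return state.silence_ms >= threshold_ms
```

For example, under these assumed thresholds, a brisk speaker who pauses for 400 ms gets a quick reply, while "hey, let me think" suppresses a response even after a long silence, mirroring the behavior Raza describes.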

Unlike other companies that piece together separate technologies, Raza said, Tavus has built an integrated system combining these models. The result is an immersive experience that feels more like talking to another person than the stilted interactions typical of other human-avatar AI systems.

Raza said there is still a long way to go on model capability, with continuous improvement needed in how AI perceives and understands humans.

"It’s not perfect today, but it’s the best of its kind," Raza added. "In the future, though, our goal is a model that understands humans so deeply that unless you asked, you wouldn’t know it’s a model."