StepFun and Geely Open-Source the Step Series of Multimodal Large Models for Video and Audio

2025-02-18

StepFun and Geely Automotive Group have jointly announced that the Step series of multimodal large models, which the two companies co-developed, will be open-sourced to developers worldwide. Two models are being released: the video generation model Step-Video-T2V and the voice interaction model Step-Audio.

Step-Video-T2V is a video generation model with 30 billion parameters, capable of generating high-quality videos of up to 204 frames at 540P resolution. According to StepFun, it has the largest parameter count and the best performance among the open-source video generation models currently available.

The other model, Step-Audio, is described as the industry's first product-grade open-source voice interaction model. Depending on the scenario, it can generate speech with different emotions, dialects, languages, singing, and personalized styles, and hold high-quality conversations with users. The speech it generates is natural and emotionally intelligent, and the model also supports high-quality voice cloning.

Users can now try Step-Audio's new features in StepFun's Yuewen app. The open-source release deepens the cooperation between StepFun and Geely Automotive Group in artificial intelligence and gives developers worldwide more room to innovate.