ZhiXiang Future Releases New Versions of Its Multimodal Generation and Understanding Models

2025-01-03

Recently, ZhiXiang Future announced the launch of ZhiXiang Multimodal Generation Model 3.0 and ZhiXiang Multimodal Understanding Model 1.0.

ZhiXiang's multimodal large models are reportedly backed by the largest multimodal copyrighted corpus in China, comprising tens of thousands of hours of copyrighted video and thousands of authorized IPs and covering over 70% of Chinese-language film and television data. These resources have been used to generate hundreds of millions of AIGC derivative works, which have been applied in fields such as film and television, tourism, communications, marketing, and education.

On the technical side, ZhiXiang Multimodal Generation Model 3.0 significantly strengthens image and video generation. The new version improves image quality and relevance, enhances the controllability of camera and scene movements, and optimizes multi-scenario driving. It also combines autoregressive and diffusion models into a diffusion-autoregressive architecture, described as globally pioneering, that reduces model size and computational cost while improving both performance and efficiency. In addition, the new version introduces a mixture-of-experts (MoE) architecture that maintains high generation quality while significantly accelerating inference, providing technical support for real-time or near-real-time applications.
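For readers unfamiliar with the idea, the inference speedup usually attributed to an MOE design comes from sparse routing: a small gating function picks a few "expert" sub-networks for each input, and the rest are never evaluated. The Python sketch below illustrates that general technique only. It assumes "MOE" here means mixture-of-experts, and it is not ZhiXiang's implementation; the function names, the toy scaling experts, and the sizes are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only.

    Returns the gate-weighted output and the indices of the experts that ran.
    """
    # Gate score per expert: dot product of x with that expert's gate vector.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(x)
    for i in chosen:
        expert_out = experts[i](x)
        weight = probs[i] / norm  # renormalize over the selected experts
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, chosen


def make_expert(scale):
    """Hypothetical stand-in expert: scales its input elementwise."""
    return lambda x: [scale * xi for xi in x]


random.seed(0)
dim, num_experts = 4, 8
experts = [make_expert(s) for s in range(1, num_experts + 1)]
gate_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
x = [0.5, -1.0, 2.0, 0.1]
y, used = moe_forward(x, gate_weights, experts, top_k=2)
print(f"experts run: {len(used)} of {num_experts}")
```

In this toy setup only two of the eight experts do any work per input, so compute per input scales with `top_k` rather than with the total number of experts. That is the general reason sparse expert routing can keep a large model's capacity while cutting inference cost.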

ZhiXiang Multimodal Understanding Model 1.0 also made its official debut. This version achieves fine-grained, accurate understanding of image and video content through object-level and event-level spatiotemporal modeling. During the on-site demonstration at the pilot zone launch ceremony, the model provided detailed descriptions of video scenes, capturing the complex relationships, logical sequences, and spatial arrangements among objects in the frames, as well as camera movements.

ZhiXiang Future also showcased an innovative "one-stop video platform." The platform lets users upload personal photos to create new interactive experiences, and the demonstration featured personalized interactive presentations of Anhui cultural relics. This approach not only makes the content more appealing but also offers a distinctive way to promote Anhui's cultural tourism.

The release of the new versions of ZhiXiang's multimodal large models marks a significant step forward in the company's technological innovation and application expansion in the field of artificial intelligence. It also injects new vitality into the creative industry and visual arts.
