Zhipu AI officially open-sources CogVideoX, the AI video generation model behind "Qingying"

2024-08-06

Recently, Chinese artificial intelligence company Zhipu AI made a major move, officially open-sourcing its self-developed video generation model CogVideoX to developers worldwide. The move aims to accelerate the development of video generation technology and expand its applications in commercial and creative fields. Built on a cutting-edge large-model architecture, CogVideoX not only meets the needs of high-end commercial applications but also achieves significant breakthroughs in performance optimization.


Outstanding performance of the open-source version, unlimited creativity with a single card

It is worth noting that the open-sourced CogVideoX-2B version demonstrates strong performance optimization. At FP16 precision, the model requires only 18GB of VRAM for inference and 40GB for fine-tuning. This means a single NVIDIA RTX 4090 can handle inference, while fine-tuning can be done efficiently on a single NVIDIA A6000. This greatly lowers the technical threshold, enabling more developers and small businesses to get started and participate in the innovation and application of video generation technology.
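As a back-of-the-envelope illustration of those figures, the snippet below checks whether a given GPU's memory covers the quoted FP16 footprints. The function name and thresholds are this sketch's own, derived from the numbers above rather than from any CogVideoX tooling:

```python
# Quoted FP16 footprints for CogVideoX-2B (from the figures above).
INFERENCE_GB = 18   # VRAM needed for inference
FINETUNE_GB = 40    # VRAM needed for fine-tuning

def fits(vram_gb, task="inference"):
    """Return True if a GPU with `vram_gb` of memory covers `task`.

    Illustrative helper only; real memory use also depends on batch
    size, resolution, frame count, and optimizer state.
    """
    need = INFERENCE_GB if task == "inference" else FINETUNE_GB
    return vram_gb >= need

print(fits(24))                    # RTX 4090 (24 GB): inference fits
print(fits(24, task="finetune"))   # but fine-tuning does not
print(fits(48, task="finetune"))   # A6000 (48 GB): fine-tuning fits
```

Running it confirms the article's pairing: a 24GB card clears the inference bar but not the fine-tuning one, while a 48GB card clears both.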


Empowered by 3D VAE, reshaping the benchmark of video generation quality

The core competitiveness of the CogVideoX model lies in its 3D Variational Autoencoder (3D VAE) technology, which uses 3D convolution to compress videos along both the spatial and temporal dimensions, achieving a high compression ratio with excellent reconstruction quality. The architecture comprises an encoder, a decoder, and a latent-space regularizer; temporal causal convolutions ensure that information flows only from past frames to later ones, which keeps the generated video content coherent and consistent. In addition, the model integrates an expert Transformer that deeply analyzes the encoded video data and combines it with textual input to create high-quality, narratively rich video content.
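To make the "temporal causal convolution" idea concrete, here is a minimal NumPy sketch, not CogVideoX's actual code, of a convolution over a video's time axis whose output at frame t depends only on frames up to t:

```python
import numpy as np

def causal_temporal_conv(x, kernel):
    """Causal convolution along the time axis of a video tensor.

    x: array of shape (T, H, W) -- frames of a single-channel video.
    kernel: array of shape (k,) -- temporal filter weights.
    Output frame t depends only on input frames t-k+1 .. t, so no
    information leaks backward from future frames (the causality
    property described above; illustrative sketch only).
    """
    k = len(kernel)
    # Pad only at the START of the time axis ("causal" padding),
    # by repeating the first frame k-1 times.
    xp = np.concatenate([np.repeat(x[:1], k - 1, axis=0), x], axis=0)
    out = np.zeros(x.shape, dtype=float)
    for t in range(x.shape[0]):
        # Weighted sum over the k most recent (past) frames.
        out[t] = np.tensordot(kernel, xp[t:t + k], axes=(0, 0))
    return out

video = np.random.rand(8, 4, 4)  # 8 frames of 4x4 pixels
smoothed = causal_temporal_conv(video, np.array([0.25, 0.25, 0.5]))
```

Because only past frames are padded, editing a frame in the input can change the output from that frame onward but never earlier, which is exactly the causal structure the 3D VAE relies on.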


High-quality data-driven approach to solve video generation pain points

To train a high-performance CogVideoX model, Zhipu AI invested substantial resources in developing an efficient method for selecting high-quality video data. The method filters out low-quality videos with excessive editing or incoherent motion, ensuring the quality and purity of the training data. The team also built a pipeline that generates video captions from image captions, solving the problem that video data often lacks detailed textual descriptions and providing a richer, multidimensional source of information for model learning.
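The filtering step can be sketched as a simple predicate over per-clip statistics. The field names (`cuts_per_minute`, `motion_score`) and thresholds below are illustrative inventions for this sketch, not Zhipu AI's actual pipeline:

```python
def keep_clip(clip, max_cuts_per_minute=10, min_motion_score=0.2):
    """Keep a clip unless it is over-edited or its motion is incoherent.

    Hypothetical filter in the spirit of the data selection described
    above; real pipelines would compute these statistics from pixels.
    """
    if clip["cuts_per_minute"] > max_cuts_per_minute:
        return False  # excessive editing
    if clip["motion_score"] < min_motion_score:
        return False  # near-static or incoherent motion
    return True

candidates = [
    {"id": "a", "cuts_per_minute": 3,  "motion_score": 0.7},   # usable
    {"id": "b", "cuts_per_minute": 25, "motion_score": 0.8},   # over-edited
    {"id": "c", "cuts_per_minute": 2,  "motion_score": 0.05},  # near-static
]
kept = [c["id"] for c in candidates if keep_clip(c)]
print(kept)  # → ['a']
```

Only the first clip survives; the other two are rejected for exactly the failure modes the article names: excessive editing and incoherent (here, near-absent) motion.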

Leading evaluation results, continued exploration ahead

CogVideoX has demonstrated outstanding results across multiple key evaluation metrics, particularly in human motion, scene reconstruction, and dynamic quality, winning wide recognition in the industry. Meanwhile, Zhipu AI has also introduced evaluation tools focused on the dynamic characteristics of video, further refining the dimensions along which such models are assessed.