Video diffusion models have emerged as powerful tools for video generation and physical simulation, demonstrating significant potential for building game engines. These generative game engines can produce videos whose actions respond to user inputs such as keyboard and mouse interactions, offering users immersive gaming experiences. However, scene generalization, the ability to create new game environments beyond existing ones, remains a critical challenge in this domain.
Although collecting large-scale action-annotated video datasets is the most straightforward approach to achieving scene generalization, the associated annotation costs are prohibitively high, especially for open-domain scenarios. This limitation hinders the development of versatile game engines capable of generating diverse and novel game environments.
To address this challenge, recent research in video generation and game physics has explored various methods, with video diffusion models emerging as a significant advancement. Evolving from U-Net to Transformer-based architectures, these models can produce more realistic and longer videos. Techniques like Direct-a-Video provide basic camera controls, while MotionCtrl and CameraCtrl offer more sophisticated camera pose manipulation. Nevertheless, these approaches are often constrained to specific games and datasets, limiting their scene generalization capabilities.
Recently, researchers from the University of Hong Kong and Kuaishou Technology introduced GameFactory, an innovative framework designed to tackle the issue of scene generalization in game video generation. By leveraging pre-trained video diffusion models trained on open-domain video data and employing a multi-stage training strategy, GameFactory successfully generates diverse new game environments.
GameFactory's multi-stage training strategy enables effective scene generalization and action control. The process starts from a pre-trained video diffusion model and proceeds through three stages. In Stage One, the model is adapted to the target game domain with LoRA fine-tuning while the original pre-trained parameters stay frozen. In Stage Two, both the pre-trained parameters and the LoRA weights are frozen, and only the action control module is trained, preventing entanglement between game style and action control. In Stage Three, the LoRA weights are removed, allowing the system to generate action-controlled game videos across diverse open-domain scenes.
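To make the parameter schedule concrete, below is a minimal PyTorch sketch of how such a three-stage setup could be wired, toggling which parameters are trainable at each stage. The class and module names (VideoBackbone, ActionControlModule, LoRALinear) are illustrative assumptions for this sketch, not GameFactory's released code.

```python
# Minimal sketch of the three-stage parameter schedule described above.
# All names here are illustrative stand-ins, not the paper's implementation.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen base linear layer plus a small trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # LoRA starts as a no-op on the base layer
        self.scale = alpha / rank
        self.enabled = True  # Stage Three: switch off to drop the game-style LoRA

    def forward(self, x):
        out = self.base(x)
        if self.enabled:
            out = out + self.scale * self.lora_b(self.lora_a(x))
        return out


class VideoBackbone(nn.Module):
    """Toy stand-in for a pre-trained video diffusion backbone."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj = LoRALinear(nn.Linear(dim, dim))

    def forward(self, x):
        return self.proj(x)


class ActionControlModule(nn.Module):
    """Toy stand-in for the action-conditioning module trained in Stage Two."""

    def __init__(self, dim: int = 64, action_dim: int = 8):
        super().__init__()
        self.action_proj = nn.Linear(action_dim, dim)

    def forward(self, h, action):
        return h + self.action_proj(action)


def configure_stage(backbone, action_module, stage: int):
    """Freeze everything, then unfreeze only what the given stage trains."""
    for p in backbone.parameters():
        p.requires_grad_(False)
    for p in action_module.parameters():
        p.requires_grad_(False)

    if stage == 1:
        # Stage One: adapt to the game domain by training only the LoRA factors.
        for m in backbone.modules():
            if isinstance(m, LoRALinear):
                m.lora_a.weight.requires_grad_(True)
                m.lora_b.weight.requires_grad_(True)
    elif stage == 2:
        # Stage Two: backbone and LoRA stay frozen; train the action module only.
        for p in action_module.parameters():
            p.requires_grad_(True)
    elif stage == 3:
        # Stage Three (inference): drop the game-style LoRA, keep action control.
        for m in backbone.modules():
            if isinstance(m, LoRALinear):
                m.enabled = False


backbone, action_module = VideoBackbone(), ActionControlModule()
for stage in (1, 2, 3):
    configure_stage(backbone, action_module, stage)
    trainable = sum(p.numel() for p in list(backbone.parameters()) +
                    list(action_module.parameters()) if p.requires_grad)
    print(f"stage {stage}: {trainable} trainable parameters")
```

The key design point this sketch captures is the separation of concerns: game-specific style lives entirely in the LoRA factors, while action control lives in its own module, so removing the LoRA at inference time keeps the learned control but restores the open-domain appearance of the base model.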
Evaluations of GameFactory compare different action-control mechanisms. For discrete control signals such as keyboard inputs, cross-attention outperforms concatenation on the Flow-MSE metric, whereas for continuous mouse-movement signals, concatenation proves more effective. In terms of style consistency, the different methods perform comparably. The system handles both fundamental atomic actions and complex composite actions across varied game scenes, demonstrating strong scene generalization and action control.
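As a rough illustration of the two conditioning schemes being compared, the following PyTorch sketch pairs discrete keyboard actions with cross-attention and continuous mouse deltas with channel-wise concatenation. The layer names, dimensions, and key vocabulary are placeholder assumptions for this sketch, not the authors' implementation.

```python
# Sketch of the two conditioning schemes: cross-attention for discrete keyboard
# actions, concatenation for continuous mouse movement. Names are illustrative.
import torch
import torch.nn as nn


class KeyboardCrossAttention(nn.Module):
    """Discrete key presses become embeddings that video tokens attend to."""

    def __init__(self, dim: int = 64, num_keys: int = 6, heads: int = 4):
        super().__init__()
        self.key_embed = nn.Embedding(num_keys, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, video_tokens, key_ids):
        # video_tokens: (B, T, dim); key_ids: (B, T) integer key index per frame
        actions = self.key_embed(key_ids)
        attended, _ = self.attn(video_tokens, actions, actions)
        return video_tokens + attended


class MouseConcatConditioning(nn.Module):
    """Continuous mouse deltas are concatenated onto the token features."""

    def __init__(self, dim: int = 64, mouse_dim: int = 2):
        super().__init__()
        self.fuse = nn.Linear(dim + mouse_dim, dim)

    def forward(self, video_tokens, mouse_deltas):
        # video_tokens: (B, T, dim); mouse_deltas: (B, T, 2) continuous (dx, dy)
        fused = torch.cat([video_tokens, mouse_deltas], dim=-1)
        return self.fuse(fused)


B, T, dim = 2, 16, 64
tokens = torch.randn(B, T, dim)
keys = torch.randint(0, 6, (B, T))
mouse = torch.randn(B, T, 2)

tokens = KeyboardCrossAttention(dim)(tokens, keys)
tokens = MouseConcatConditioning(dim)(tokens, mouse)
print(tokens.shape)  # torch.Size([2, 16, 64])
```

Intuitively, discrete key presses map naturally to a small embedding table that tokens can attend to, while continuous mouse deltas carry fine-grained magnitude information that is easier to preserve by concatenating it directly onto the features, which is consistent with the reported results.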
GameFactory represents a major step forward for generative game engines, addressing the crucial challenge of scene generalization in game video generation. By leveraging open-domain video data and a multi-stage training strategy, it demonstrates the feasibility of creating new games through generative interactive videos. While this marks a significant milestone, many challenges remain before fully fledged generative game engines become practical. GameFactory lays a solid foundation for this evolving field and points to promising directions for future research.