Tencent Launches Hunyuan-Large Model, Sets New Benchmark for MoE Models in the Industry

2024-11-05

Tencent today announced its new Hunyuan-Large model. According to the company, it is currently the largest open-source Transformer-based Mixture of Experts (MoE) model in the industry, with 389 billion total parameters, of which 52 billion are active for each token.
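As a rough illustration of what those two figures imply, the sketch below computes the share of parameters that is active at a time. Only the 389 billion and 52 billion figures come from the announcement; the function name is made up for this example.

```python
# Figures quoted in the announcement.
TOTAL_PARAMS = 389_000_000_000
ACTIVE_PARAMS = 52_000_000_000


def active_fraction(total: int, active: int) -> float:
    """Return the fraction of parameters that are active (hypothetical helper)."""
    return active / total


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share of parameters: {fraction:.1%}")
```

In other words, roughly one parameter in seven or eight takes part in a given forward step, which is the main reason an MoE model can be far cheaper to run than a dense model of the same total size.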

To support further progress in artificial intelligence, Tencent has open-sourced three versions of Hunyuan-Large on the Hugging Face platform: Hunyuan-A52B-Pretrain, Hunyuan-A52B-Instruct, and Hunyuan-A52B-Instruct-FP8. It has also released comprehensive technical reports and training and inference manuals to help developers understand the model's technical features and how to operate it.
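For a sense of why an FP8 variant matters, the following back-of-the-envelope sketch estimates the memory needed for the weights alone. It assumes 2 bytes per parameter for a 16-bit checkpoint and 1 byte per parameter for FP8, and it ignores activations, the KV cache, and any tensors kept at higher precision. The announcement does not state the storage format of the non-FP8 versions, so the 16-bit figure is an assumption.

```python
# Total parameter count quoted in the announcement.
TOTAL_PARAMS = 389_000_000_000


def weight_gigabytes(num_params: int, bytes_per_param: int) -> float:
    """Approximate weight storage in GB (10**9 bytes); hypothetical helper."""
    return num_params * bytes_per_param / 1e9


# Assumption: 2 bytes per parameter for a 16-bit checkpoint.
sixteen_bit_gb = weight_gigabytes(TOTAL_PARAMS, 2)
# FP8 stores one byte per parameter.
fp8_gb = weight_gigabytes(TOTAL_PARAMS, 1)

print(f"16-bit weights: ~{sixteen_bit_gb:.0f} GB")
print(f"FP8 weights:    ~{fp8_gb:.0f} GB")
```

Under these assumptions, FP8 halves the weight footprint, which can reduce the number of accelerators needed to serve the model.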

On the technical side, Tencent highlights several advantages. First, training is augmented with high-quality synthetic data, which helps the model learn more diverse representations, handle long-context inputs effectively, and generalize better to unseen data, improving its robustness.

Second, to cut memory usage and computational overhead, Hunyuan-Large compresses the key-value (KV) cache. It combines Grouped Query Attention (GQA), in which groups of query heads share key and value heads, with Cross-Layer Attention (CLA), in which adjacent layers share a KV cache. Together, these strategies significantly reduce the memory footprint and computational cost of KV caching, which improves inference throughput and efficiency.
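The effect of GQA and CLA on cache size can be sketched with simple arithmetic. GQA lets several query heads share one key/value head, and CLA lets neighbouring layers share one cache. The layer count, head counts, head dimension, and sharing factor below are illustrative assumptions, not the published Hunyuan-Large configuration. "128K" is taken to mean 128 × 1024 tokens, and cached values are assumed to take 2 bytes each.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, cla_share_factor=1):
    """Bytes needed to cache keys and values for one sequence."""
    # With CLA, only one layer in each group of `cla_share_factor`
    # layers stores its own KV cache (ceiling division).
    kv_layers = -(-num_layers // cla_share_factor)
    # The factor 2 covers the separate key and value tensors.
    return 2 * kv_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only to make the arithmetic easy to follow.
LAYERS, QUERY_HEADS, KV_HEADS, HEAD_DIM = 64, 64, 8, 128
SEQ_LEN = 128 * 1024

# Baseline: every query head has its own key/value head.
mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# GQA: 64 query heads share 8 key/value heads.
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)
# GQA + CLA: additionally, every 2 adjacent layers share one cache.
gqa_cla = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN,
                         cla_share_factor=2)

for name, size in [("MHA", mha), ("GQA", gqa), ("GQA + CLA", gqa_cla)]:
    print(f"{name}: {size / 2**30:.0f} GiB")
```

With these made-up numbers the cache shrinks from 256 GiB to 32 GiB with GQA and to 16 GiB with GQA plus CLA. The two savings multiply because one reduces the number of cached heads per layer and the other reduces the number of layers that store a cache.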

Third, Hunyuan-Large uses expert-specific learning rate scaling. Because different expert sub-models have different learning requirements, this technique assigns each expert its own learning rate, so that every sub-model learns effectively from the data and contributes to overall performance.
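The announcement does not give the scaling formula, so the following is only a minimal sketch of the general idea of per-expert learning rates. It assumes one plausible rule: scale a base learning rate by the square root of the share of tokens an expert sees relative to a reference expert. The function name, expert names, token counts, and base rate are all hypothetical.

```python
import math


def expert_learning_rates(base_lr, tokens_per_expert, reference_tokens):
    """Map each expert to a learning rate scaled by its token share.

    Assumed rule (not taken from the announcement): an expert that sees
    fewer tokens per step has a smaller effective batch size, so its rate
    is scaled by sqrt(tokens / reference_tokens).
    """
    return {
        name: base_lr * math.sqrt(tokens / reference_tokens)
        for name, tokens in tokens_per_expert.items()
    }


# Hypothetical setup: a shared expert sees every token, while each routed
# expert sees about one sixteenth of them.
tokens_seen = {"shared": 16_000, "routed_0": 1_000, "routed_1": 1_000}
rates = expert_learning_rates(base_lr=3e-4,
                              tokens_per_expert=tokens_seen,
                              reference_tokens=16_000)

for name, lr in rates.items():
    print(f"{name}: {lr:.2e}")
```

In a training framework, a mapping like this would typically be applied by placing each expert's parameters in its own optimizer parameter group with the corresponding rate.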

Hunyuan-Large also handles long contexts well. The pre-trained model supports text sequences of up to 256K tokens, and the Instruct model supports up to 128K tokens, an advantage on tasks that involve lengthy contextual inputs.

To validate the model's practical effectiveness and safety, Tencent ran extensive benchmark tests across multiple languages and tasks. According to the company, the model performed strongly across these domains and tasks.

With this release, Tencent gives developers a more powerful open-source model and set of tools to build on. As the model continues to be optimized and refined, it is expected to play a significant role in a growing number of fields and scenarios.
