Alibaba Cloud Open-Sources Qwen2 Large Language Models with Context Windows up to 128K

2024-06-07

Alibaba Cloud has announced a major upgrade to its Qwen model family: the new Qwen2 series. The release offers more model sizes, broader language support, better benchmark performance, and significantly longer supported context lengths.

The series covers pre-trained and fine-tuned models in five sizes, ranging from Qwen2-0.5B to Qwen2-72B, to meet the needs of developers in different scenarios. On top of the existing Chinese and English support, Qwen2 adds high-quality data in 27 more languages, which significantly improves the models' multilingual capabilities.

On performance, the Qwen2 models score well on multiple benchmarks, with the largest gains in coding, mathematics, and multilingual understanding. Alibaba Cloud attributes the coding and mathematics results to experience carried over from CodeQwen1.5 and to training on large-scale, high-quality data. For long inputs, techniques such as YaRN or Dual Chunk Attention extend the context the models can process, and the models perform strongly on long-text tasks.

On safety, the Qwen2-72B-Instruct model performed comparably to GPT-4 in tests using unsafe queries in multiple languages, which reflects Alibaba Cloud's work on model safety.

Beyond the technical gains, the Qwen2 series also emphasizes openness and collaboration.
The Qwen2 models are now open-sourced on platforms such as Hugging Face and ModelScope. Alibaba Cloud also provides community support, including tools and frameworks for fine-tuning, quantization, deployment, local execution, and evaluation, to help developers apply and optimize the models. The release showcases Alibaba Cloud's technical strength in large-scale AI models and gives developers worldwide more powerful and flexible options.
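
For readers who want a concrete picture of the YaRN-based context extension mentioned above, the following is a minimal sketch of how a scaling factor for a longer context window might be derived. The function name `yarn_rope_scaling` is hypothetical, and the 32,768-token native window and the dictionary key names are assumptions that follow the convention commonly shown in Qwen2 model cards; none of them are stated in this article, so check the model's own documentation before relying on them.

```python
import json
from typing import Optional


def yarn_rope_scaling(target_context: int,
                      native_context: int = 32768) -> Optional[dict]:
    """Build a YaRN-style RoPE scaling entry for a longer context window.

    Assumption: native_context is the window the model handles without
    scaling (32,768 tokens is assumed here, not taken from the article).
    Returns None when the target already fits in the native window.
    """
    if target_context <= 0 or native_context <= 0:
        raise ValueError("context lengths must be positive")
    if target_context <= native_context:
        return None  # no scaling needed
    return {
        "type": "yarn",
        # Ratio between the desired window and the native window.
        "factor": target_context / native_context,
        "original_max_position_embeddings": native_context,
    }


if __name__ == "__main__":
    # 128K tokens = 131,072; with the assumed native window the factor is 4.0.
    print(json.dumps(yarn_rope_scaling(131072), indent=2))
```

Under these assumptions, a 128K (131,072-token) target yields a factor of 4.0. Whether and where such an entry is read depends on the serving framework, so treat the output as illustrative rather than as a drop-in configuration.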