"Qwen1.5-110B: An Open-Source Parameter Model for Answering Thousands of Questions, Significantly Improving Basic Abilities and Chat Evaluation"

2024-04-28

Recently, the Qwen team announced the open-source release of Qwen1.5-110B, its first model at the 100-billion-parameter scale. The model performs strongly in both base-capability and Chat evaluation, showing significant improvements over comparable models. The release marks another major milestone for the Qwen team in the field of artificial intelligence.

It is understood that Qwen1.5-110B adopts a Transformer decoder architecture, supports multiple languages, and introduces an efficient grouped query attention (GQA) mechanism. In Chat evaluation, the model outperforms previous versions, showcasing the potential of larger-scale models in dialogue generation.

Notably, the performance improvement of Qwen1.5-110B is attributed mainly to the larger model size; the training method has not changed significantly. This underlines the importance of scale: even with the training method unchanged, increasing the number of parameters can yield significant gains in model quality.

As the largest model in the Qwen series, Qwen1.5-110B has over 100 billion parameters, making it the team's first model to pass the 100-billion-parameter mark. It also performs exceptionally well against recently released SOTA models, which suggests that there is still considerable room for improvement from scaling model size.

Looking ahead, the Qwen team will continue to explore the gains from scaling both model size and pre-training data, in order to drive further advances in artificial intelligence technology. As the technology develops, there is reason to believe that Qwen1.5-110B will become an important cornerstone in the field, bringing more convenience and innovation to people's lives.
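To make the efficiency claim about grouped query attention more concrete, the following is a minimal sketch of the arithmetic behind it. In GQA, several query heads share one key/value head, so the key/value cache kept during generation shrinks in proportion. The layer count, head counts, head dimension, and sequence length below are illustrative assumptions, not the published Qwen1.5-110B configuration, and the helper names `kv_cache_bytes` and `query_to_kv_head` are hypothetical.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    bytes_per_value=2 assumes 16-bit storage (an assumption for illustration).
    """
    # The leading factor 2 covers the separate key and value tensors per layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


def query_to_kv_head(query_head, num_query_heads, num_kv_heads):
    """Index of the shared key/value head that a given query head reads from."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Illustrative values only (assumed, not taken from the model's config).
NUM_LAYERS = 80
NUM_QUERY_HEADS = 64
NUM_KV_HEADS = 8
HEAD_DIM = 128
SEQ_LEN = 4096

# Standard multi-head attention keeps one key/value head per query head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention keeps only NUM_KV_HEADS key/value heads.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha_bytes / 2**30:.2f} GiB")
print(f"GQA cache: {gqa_bytes / 2**30:.2f} GiB")
print(f"Reduction: {mha_bytes // gqa_bytes}x")
```

Under these assumed values, eight query heads share each key/value head, so the cache is eight times smaller than with one key/value head per query head. The reduction factor is always the ratio of query heads to key/value heads, whatever the other dimensions are.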