"Alibaba Releases Open-Source Large-Scale Coding Model CodeQwen 1.5, Supporting 92 Programming Languages"

2024-04-17

Tongyi Qianwen announced last night that it has open-sourced CodeQwen 1.5, a code model built on Qwen 1.5. The release marks significant progress in code intelligence and gives developers a more capable tool for code generation and code modification.

CodeQwen 1.5 is a code-specialized model based on the Qwen language model, with 7 billion parameters. It uses the Grouped Query Attention (GQA) architecture and was pre-trained on approximately 3 trillion tokens of code data. The model supports 92 programming languages and accepts context inputs of up to 64K tokens.

In terms of performance, CodeQwen 1.5 demonstrates strong capabilities in code generation, long-sequence modeling, code modification, and SQL. According to the reported evaluation data, its code generation performance surpasses that of many large-scale models and narrows the gap in coding ability with top models such as GPT-4.

CodeQwen 1.5 has also shown strong generalization on competitive programming platforms. In the LiveCodeBench evaluation, it achieved strong results on problems from LeetCode, AtCoder, and CodeForces. Its pre-training corpus does include LeetCode data, which may favor the LeetCode portion of those results.

The model is not limited to Python. In the MultiPL-E evaluation, CodeQwen 1.5 performs well across eight mainstream languages, demonstrating its multi-language programming capabilities.

CodeQwen 1.5 has also performed well in practical applications. In testing on SWE Bench, it can understand code repositories and generate testable code, which supports its use on real-world software development problems.
CodeQwen 1.5 has also posted its strongest results in code modification, particularly in the Debug, Translate, Switch, and Polish tasks.

As a SQL assistant, CodeQwen 1.5 can query databases from natural-language questions, giving people without a programming background a convenient way to work with data. On benchmarks such as Spider and BIRD, its performance is close to that of GPT-4.

The open-sourced CodeQwen 1.5 model is supported by platforms and tools including Transformers, vLLM, llama.cpp, and Ollama, giving the open-source community a wide range of deployment options.

The open-source community has high expectations for the release, hoping that CodeQwen 1.5 will contribute to code assistants and Code Agents and play a larger role in the future development of code intelligence, bringing the vision of a true AI programmer closer. By open-sourcing the model, Tongyi Qianwen aims to improve developer productivity, simplify the software development process, and inject new vitality into the open-source community. As the technology continues to develop and improve, the field of code intelligence is expected to see further innovation and breakthroughs.
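
To make the natural-language database querying described above more concrete, the following is a minimal sketch, not CodeQwen 1.5's actual interface. It shows how a table schema and a question might be combined into a prompt, and how the SQL a model returns could be run against a database. The helper names `build_sql_prompt` and `run_query`, the prompt wording, the `orders` table, and the hard-coded `generated_sql` string are all assumptions made for illustration; no model is called here.

```python
import sqlite3


def build_sql_prompt(schema: str, question: str) -> str:
    """Combine a table schema and a natural-language question into one prompt.

    The prompt wording is an illustrative assumption, not an official template.
    """
    return (
        "Given the following SQLite schema:\n"
        f"{schema}\n"
        f"Write one SQL query that answers: {question}\n"
        "Return only the SQL."
    )


def run_query(conn: sqlite3.Connection, sql: str) -> list:
    """Run model-generated SQL, accepting only statements that start with SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# Hypothetical example table, created in an in-memory database.
SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);"

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 7.5)],
)

prompt = build_sql_prompt(SCHEMA, "What is the total order amount per customer?")

# Placeholder standing in for a model reply; in practice this string would be
# the text generated by the model in response to `prompt`.
generated_sql = (
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer;"
)

rows = run_query(conn, generated_sql)
print(rows)
```

Restricting execution to `SELECT` statements is a simple precaution in this sketch, since generated SQL is untrusted input; a real deployment would likely also use a read-only database connection.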