APUS Partners with Xindan Intelligence to Open-Source China's First Hundred-Billion-Parameter MoE Architecture Large Model
On April 2nd, APUS and its strategic partner Xindan Intelligence jointly open-sourced APUS-xDAN Model 4.0 (MoE), a hundred-billion-parameter large model built on a Mixture of Experts (MoE) architecture, marking a new breakthrough for artificial intelligence in China. The model was officially released on GitHub, making high-quality large-model technology broadly accessible to the industry.
Xindan Intelligence, although a young company, has an impressive founding team drawn from top academic and engineering institutions such as Tsinghua University, UC Berkeley, Tencent, and Meta. The team includes globally known developers from the open-source AI community as well as heavyweight members such as senior Tencent Cloud architects. Notably, Xindan Intelligence completed an angel financing round worth tens of millions in early March this year, jointly invested by APUS and veteran AI industry investor Zhou Hongyang, injecting strong momentum into the company's future development.
APUS-xDAN Model 4.0 (MoE) also delivers remarkable performance. The model reaches roughly 90% of GPT-4's overall performance while running on consumer-grade GPUs such as the NVIDIA RTX 4090, bringing large-model technology within practical reach of Chinese enterprises and easing the "computational bottleneck". Against the backdrop of continually tightening US export controls on Chinese semiconductors, this achievement opens a new path for the widespread deployment of large models across China's AI industry.
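To make the "runs on an RTX 4090" claim concrete, the sketch below shows one common way a heavily quantized large model can be loaded on a single 24 GB consumer GPU, using Hugging Face transformers with 4-bit quantization. This is a minimal sketch under assumptions: the repository id is a placeholder, and the article does not specify the actual quantization scheme.

```python
# Minimal sketch: loading a large quantized model on one consumer GPU.
# The repository id below is a placeholder, not a confirmed model path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "APUS-xDAN/APUS-xDAN-4.0-MOE"  # hypothetical repository id

# 4-bit NF4 quantization is one common way to fit a large model into the
# RTX 4090's 24 GB of VRAM; the article does not name the exact scheme used.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU automatically
)

prompt = "What is a Mixture-of-Experts model?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```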
At the algorithm level, the breakthrough of APUS-xDAN Model 4.0 (MoE) is equally significant. It adopts a Mixture of Experts architecture similar to GPT-4's, combining multiple expert models: a learned router activates only a few experts for each token, so the model's total parameter count far exceeds the compute spent on any single token. As a result, it improves running efficiency by 200% compared to traditional dense models of the same size and cuts inference cost to roughly a quarter. Through further high-precision fine-tuning and quantization, the model is compressed to roughly one fifth of its original size, making it the first hundred-billion-parameter Chinese-English MoE model that can run on consumer-grade graphics cards. A sketch of the routing idea follows.
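The sketch below illustrates top-k expert routing, the mechanism behind an MoE layer's efficiency: a gate selects a few expert feed-forward networks per token, so compute grows far more slowly than total parameter count. This is a generic PyTorch illustration; the layer sizes, expert count, and top-k value are illustrative assumptions, not details of APUS-xDAN Model 4.0 itself.

```python
# Generic sketch of a Mixture-of-Experts layer with top-k routing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # router: scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.gate(x)                          # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)           # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512])
```

Because only top_k of the n_experts run per token, an MoE model can hold several times more parameters than a dense model with comparable per-token compute, which is the efficiency trade-off the article describes.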
In actual testing, APUS-xDAN Model 4.0 (MoE) scored 79 on GSM8K (mathematics), 73 on MMLU (language understanding), and 66 on BBH (reasoning). Its overall performance surpasses GPT-3.5 and approaches GPT-4; on mathematics it even surpasses Grok, the model open-sourced by Elon Musk's xAI. This achievement not only demonstrates China's strength in developing ultra-large-scale pre-trained models but also injects new momentum into pushing the boundaries of artificial intelligence.
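Scores like these are typically produced with a standard evaluation harness. The sketch below shows how one might run the same three benchmarks with EleutherAI's open-source lm-evaluation-harness; the article does not say which tooling or settings were actually used, so this is an assumption, and the model path is again a placeholder.

```python
# Hypothetical reproduction of the reported benchmark scores using
# lm-evaluation-harness (https://github.com/EleutherAI/lm-evaluation-harness).
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=APUS-xDAN/APUS-xDAN-4.0-MOE",  # placeholder model path
    tasks=["gsm8k", "mmlu", "bbh"],
    batch_size=8,
)

# Print each task's metrics (accuracy-style scores comparable to those above).
for task, metrics in results["results"].items():
    print(task, metrics)
```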
Overall, the joint open-sourcing of APUS-xDAN Model 4.0 (MoE) by APUS and Xindan Intelligence marks significant breakthroughs in algorithm optimization and accessible computing, injecting new vitality into the development of China's AI industry. The achievement showcases China's research strength and technological innovation in AI on the international stage and contributes Chinese wisdom and solutions to the development of the global AI field.