Baichuan Intelligence Collaborates with Pengcheng Laboratory to Promote the Innovation and Application of Large Models Built on Domestic Computing Power

2023-11-17

Baichuan Intelligence and Pengcheng Laboratory have announced a collaboration to explore the training and application of large models, jointly building "Pengcheng-Baichuan·Brain 33B," a large model trained on domestic computing power. Large models built on domestic computing power are an important direction for China's artificial intelligence industry, but they face many challenges, including an imbalance between computing supply and demand, an immature ecosystem, and difficult cost control. To address these problems, Baichuan Intelligence and Pengcheng Laboratory announced their joint development of "Pengcheng-Baichuan·Brain 33B," a large model with a 128K context window trained on domestic computing power, and demonstrated its performance and applications at the forum.

Pengcheng Laboratory is an important part of China's strategic science and technology capability. Guided by the innovation principle of "domestic computing power + independent large models," it collaborates widely with enterprises, universities, and research institutes, shares resources through an open-source, collective-intelligence model of cooperation, and provides artificial intelligence "wings" for industries of all kinds. Baichuan Intelligence is a leading Chinese large-model company; since its founding it has promoted large-model development and open-source ecosystem building, and its open-source and closed-source models have achieved excellent results in authoritative evaluations against models of the same scale. The collaboration allows the two parties to combine their respective strengths, better meet China's growing demand for intelligent transformation, and help accelerate the rise of China's artificial intelligence industry.

"Pengcheng-Baichuan·Brain 33B" has the longest context window of any large model trained on the "Pengcheng Cloud Brain" domestic computing platform, and the window can be extended to 192K in the future. Context window length is one of the core technologies of large models and is crucial for understanding and generating text tied to a specific context. Generally speaking, a longer context window supplies richer semantic information and resolves more ambiguity, making the model's output more accurate and fluent.

To extend the context window and improve the overall performance of "Pengcheng-Baichuan·Brain 33B," Baichuan Intelligence and Pengcheng Laboratory optimized the model end to end. In dataset construction, they adopted fine-grained data pipelines that automatically filter, select, and mix data at the paragraph and sentence levels, greatly improving data quality; a sketch of this kind of filtering appears below. In the training architecture, they applied self-developed and industry-leading training optimizations such as NormHead, max-z loss, and dynamic learning-rate scheduling (dynamic-LR) to deeply optimize the Transformer stack, ensuring stable convergence while improving training efficiency and final results; the loss-side techniques are likewise sketched below. In addition, across the model's full-lifecycle toolchain, a collaboration with the teams of Professor Yizhou Wang and Professor Yaodong Yang at Peking University pioneered RLHF alignment with safety constraints, effectively improving both the quality and the safety of the model's generated content; the constrained objective behind this idea is sketched below as well.
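The announcement does not describe the filtering pipeline itself. As a minimal, hypothetical sketch of what paragraph- and sentence-level automated filtering can look like, the Python below scores sentences with simple heuristics (minimum length, ratio of non-word characters) and keeps a paragraph only if enough of its sentences pass. Every function name and threshold here is an illustrative assumption, not Baichuan's actual pipeline.

```python
import re

def sentence_ok(sent: str, min_chars: int = 8, max_symbol_ratio: float = 0.3) -> bool:
    # Hypothetical sentence-level heuristics: drop fragments that are too
    # short or dominated by non-word characters (tables, boilerplate, markup).
    if len(sent) < min_chars:
        return False
    symbols = sum(1 for c in sent if not (c.isalnum() or c.isspace()))
    return symbols / len(sent) <= max_symbol_ratio

def filter_paragraph(paragraph: str, min_keep_ratio: float = 0.7) -> str | None:
    # Split into rough sentences (Chinese and Latin terminators), then keep
    # the paragraph only if enough of its sentences survive filtering.
    sentences = [s for s in re.split(r"(?<=[。！？.!?])\s*", paragraph) if s]
    kept = [s for s in sentences if sentence_ok(s)]
    if not sentences or len(kept) / len(sentences) < min_keep_ratio:
        return None  # paragraph-level rejection
    return " ".join(kept)

corpus = [
    "今天的发布会介绍了新模型。||| @@@ ###",          # noisy: rejected
    "A clean, well-formed paragraph. It survives filtering.",
]
cleaned = [p for p in (filter_paragraph(p) for p in corpus) if p]
```

Production pipelines typically layer deduplication, quality classifiers, and language identification on top of heuristics like these.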
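Of the training optimizations named above, NormHead and max-z loss are described in Baichuan's published technical work: NormHead L2-normalizes the rows of the output embedding (the "head") before computing logits, so logit scale is not dominated by token-frequency-driven embedding norms, and max-z loss adds a small penalty on the largest logit at each position to keep training stable. The PyTorch sketch below illustrates both ideas under stated assumptions; the 2e-4 coefficient and the tensor shapes are illustrative defaults, not the production configuration.

```python
import torch
import torch.nn.functional as F

def norm_head_logits(hidden: torch.Tensor, head_weight: torch.Tensor) -> torch.Tensor:
    # NormHead: L2-normalize each output-embedding row so logits depend on
    # direction rather than raw embedding norm.
    # hidden: (batch, seq, d_model); head_weight: (vocab, d_model)
    return hidden @ F.normalize(head_weight, dim=-1).T

def lm_loss_with_max_z(logits: torch.Tensor, targets: torch.Tensor,
                       z_weight: float = 2e-4) -> torch.Tensor:
    # Standard next-token cross-entropy over the flattened sequence.
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    # max-z auxiliary term: penalize the squared maximum logit per position,
    # discouraging the logit drift that destabilizes mixed-precision training.
    z = logits.max(dim=-1).values
    return ce + z_weight * (z ** 2).mean()

# Toy usage with random tensors (vocab=1000, d_model=64).
hidden = torch.randn(2, 16, 64)
head = torch.randn(1000, 64)
logits = norm_head_logits(hidden, head)
loss = lm_loss_with_max_z(logits, torch.randint(0, 1000, (2, 16)))
```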
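The release does not give the mathematical form of the safety-constrained RLHF. In the standard safe-RL formulation that the Peking University team has published on, the policy maximizes a learned reward model subject to a budget on a learned cost model, typically relaxed with a Lagrange multiplier. A hedged sketch in LaTeX, with illustrative symbols (reward model R_phi, cost model C_psi, cost budget d):

```latex
% Constrained RLHF: maximize reward subject to a safety-cost budget d
\max_{\theta}\;
  \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
  \bigl[ R_\phi(x, y) \bigr]
\quad \text{s.t.} \quad
  \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
  \bigl[ C_\psi(x, y) \bigr] \le d

% Lagrangian relaxation, solved as a min-max game over policy and multiplier
\min_{\lambda \ge 0}\; \max_{\theta}\;
  \mathbb{E}\bigl[ R_\phi(x, y) \bigr]
  - \lambda \Bigl( \mathbb{E}\bigl[ C_\psi(x, y) \bigr] - d \Bigr)
```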
The collaboration between Baichuan Intelligence and Pengcheng Laboratory on the "Pengcheng-Baichuan·Brain 33B" long-context model is a breakthrough in the innovation and deployment of large-model technology on domestic computing power. Going forward, Baichuan Intelligence will continue to deepen its cooperation with Pengcheng Laboratory across technology, computing power, and other dimensions, providing sustained support for the innovation and development of homegrown large models.