360 Open-Sources 360Zhinao 7B: A 7-Billion-Parameter Large Model That Can Handle 500,000 Characters of Input Text

2024-04-16

Recently, technology giant 360 announced that it has open-sourced its newly developed large model, 360Zhinao 7B, on GitHub, marking another major step for the company in artificial intelligence. The model has 7 billion parameters and was trained on a corpus of 3.4 trillion tokens spanning Chinese, English, and code, so that it can serve a wide range of application scenarios.

360Zhinao 7B is released in three context-length versions: 4K, 32K, and 360K. The 360K version, which corresponds to roughly 500,000 Chinese characters of input text, offers one of the longest context lengths among domestically developed open-source Chinese models.

To verify the model's real-world performance, 360 evaluated it on a broad set of mainstream benchmarks, including C-Eval, AGIEval, MMLU, CMMLU, HellaSwag, MATH, GSM8K, HumanEval, MBPP, BBH, and LAMBADA, which together cover natural language understanding, knowledge and reasoning, mathematical calculation, and code generation. In these evaluations, 360Zhinao 7B took first place on four benchmarks and ranked third by overall average score, demonstrating strong all-around capability.

On LongBench, which focuses on the long-text understanding ability of large language models, the model also performed well. In the tasks most relevant to Chinese long-text applications, such as Chinese single-document question answering, multi-document question answering, summarization, and few-shot learning, the 360Zhinao-7B-Chat-32K model achieved the highest average score.

360 further verified the model's long-text processing ability with the NeedleInAHaystack test on English text, in which the 360Zhinao-7B-Chat-360K model reached an accuracy of over 98%. The company also built a Chinese NeedleInAHaystack test on top of the SuperCLUE-200K benchmark and again achieved over 98% accuracy, confirming the model's strength in handling long Chinese texts.

In addition to the model weights, 360 has open-sourced a complete toolchain, including fine-tuning training code and inference code. Developers working with large models can therefore use the release out of the box, which should further accelerate the development and application of large-model technology.
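For developers who want to try the released checkpoints, a typical workflow is to load the weights through Hugging Face transformers. The snippet below is only a minimal sketch: the repository ID, the use of a chat template, and the generation settings are assumptions made for illustration, and 360's own published inference code remains the authoritative reference.

```python
# Minimal sketch of querying a 360Zhinao chat checkpoint with Hugging Face transformers.
# The repo ID and generation settings are assumptions for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "qihoo360/360Zhinao-7B-Chat-4K"  # assumed Hugging Face repository ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",       # place weights on available GPU(s)
    trust_remote_code=True,  # the repo ships custom modeling code
)

# Build a chat prompt with the tokenizer's chat template (assuming one is provided).
messages = [{"role": "user", "content": "Introduce 360Zhinao in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The same pattern would apply to the 32K and 360K chat variants, with the caveat that very long inputs require correspondingly more GPU memory.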
Zhou Hongyi, the founder of 360, said that the development of the large-model industry requires continuous breakthroughs and innovation. He argued that the industry's competition over context length is only a surface-level phenomenon, and that what really matters is a model's performance and its effect in practical applications; he acknowledged that setting 360Zhinao 7B's context length to 360K was largely for show, and stressed that the power of open source is the real driver of technological progress. Calling himself an "advocate of open source," he firmly believes that open source can gather more wisdom and strength to jointly advance artificial intelligence technology.

The open-sourcing of 360Zhinao 7B injects new vitality into the field of Chinese large models. As more companies and developers join in, more innovative and groundbreaking results can be expected, promoting the wider application and development of artificial intelligence across many fields.