Baichuan Intelligence has officially launched two new AI models, Baichuan-M1-preview and Baichuan-M1-14B, both of which make significant advances in medical and deep-reasoning capabilities.
Baichuan-M1-preview is China's first full-scenario deep reasoning model, with reasoning capabilities spanning language, vision, and search. In authoritative mathematics and coding evaluations, it outperforms comparable models such as o1-preview. The model also unlocks a "medical evidence-based mode" that provides end-to-end service from evidence retrieval to deep reasoning, answering clinical and research questions in medicine quickly and accurately. Baichuan-M1-preview is now live on the Baixiao Ying platform, where users can try its deep reasoning on mathematics, coding, logical reasoning, and medical problems.
In language reasoning, Baichuan-M1-preview outperforms models such as o1-preview on mathematical benchmarks like AIME and MATH, as well as on coding tasks like LiveCodeBench. In visual reasoning, it surpasses GPT-4o, Claude 3.5 Sonnet, and QVQ-72B-Preview on authoritative evaluations such as MMMU-val and MathVista. Its medical evidence-based mode applies evidence-based-medicine principles on top of a self-built knowledge base covering hundreds of millions of entries, enabling deep reasoning and sound medical decision-making on complex medical questions. The knowledge base includes large volumes of Chinese and international medical papers, authoritative guidelines, and expert consensus, and is updated daily. The model also performs "evidence grading," using medical knowledge and evaluation standards to rank evidence across multiple levels and thereby improve the accuracy of its answers.
Baichuan-M1-14B, the smaller model in the Baichuan-M1 series, is the industry's first open-source medically enhanced large model. Its medical capabilities exceed those of larger-parameter models such as Qwen2.5-72B and are comparable to o1-mini's. The model has been open-sourced on GitHub and Hugging Face, with support for BF16 inference, including NPU builds.
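Since the weights are published on Hugging Face with BF16 inference support, loading the model presumably follows the standard transformers pattern. This is a minimal sketch only; the repository ID and chat-style usage below are assumptions based on the announcement, not verified against the official model card.

```python
# Hypothetical sketch: loading Baichuan-M1-14B in BF16 via transformers.
# The repo ID "baichuan-inc/Baichuan-M1-14B-Instruct" is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan-M1-14B-Instruct"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # BF16 inference, as noted in the release
    device_map="auto",            # place layers on available GPU/NPU/CPU
    trust_remote_code=True,       # Baichuan models ship custom model code
)

prompt = "What are the first-line treatments for hypertension?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

On NPU backends the same code typically applies after installing the vendor's PyTorch adapter, though device-specific setup is outside the scope of this sketch.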
To strengthen Baichuan-M1-14B's medical capabilities, the R&D team optimized and innovated extensively across data collection, synthetic data generation, and model training. For data collection, they gathered trillions of tokens of rigorous medical data targeted at specific medical scenarios, including professional medical papers in Chinese and English, real hospital case records, and medical Q&A content. For synthetic data, they generated over 100 billion tokens of diverse data to further reinforce the model's medical knowledge and reasoning. During training, they combined a multi-stage domain-improvement scheme with ELO reinforcement learning to optimize generation quality and logical reasoning.
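The announcement does not detail the ELO reinforcement learning method. As background only, the classic Elo update that such pairwise-preference ranking schemes build on can be sketched as follows; the K-factor, starting ratings, and application to model generations here are illustrative assumptions, not Baichuan's actual recipe.

```python
# Sketch of a standard Elo rating update, the pairwise-comparison idea
# underlying Elo-style preference ranking. Constants are illustrative.

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Update two Elo ratings after one pairwise comparison.

    score_a is 1.0 if item A is preferred, 0.0 if item B is, 0.5 for a tie.
    """
    # Expected win probability for A under the logistic Elo model.
    expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    r_a_new = r_a + k * (score_a - expected_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a_new, r_b_new

# Example: two candidate generations both start at 1200; A is preferred.
ra, rb = elo_update(1200.0, 1200.0, 1.0)
print(ra, rb)  # 1216.0 1184.0
```

Repeated over many preference judgments, such updates produce a ranking of candidate generations that a reinforcement learning objective can then optimize against.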
In authoritative evaluations of medical knowledge and clinical ability such as cmexam and clinicalbench_hos, Baichuan-M1-14B outperformed larger-parameter models like Qwen2.5-72B-Instruct, demonstrating strong medical capability. The open-source release aims to drive innovative development of AI in medicine, increase the transparency and credibility of AI medical technology, improve access to medical services, and foster a thriving AI medical ecosystem.
The release of these two Baichuan-M1 models marks another significant step for AI in medicine and in full-scenario deep reasoning. Baichuan-M1-preview's full-scenario deep reasoning and medical evidence-based mode, together with Baichuan-M1-14B's open-source medical enhancements, will provide strong support for the continued growth of the AI medical ecosystem and for high-quality medical services.