ByteDance's DouBao Large Model 1.5 Pro Released, Outperforming Top Models Like GPT-4o

2025-01-22

Today, ByteDance officially released version 1.5 Pro of its DouBao large model, a significant upgrade. The new Doubao-1.5-pro model performs strongly on benchmarks covering knowledge, coding, reasoning, and Chinese language processing, and ByteDance reports an overall score that surpasses well-known models such as GPT-4o and Claude 3.5 Sonnet, underscoring the company's strength in artificial intelligence.

Doubao-1.5-pro is currently in a gradual (grayscale) rollout within the DouBao App, giving selected users early access. For developers, the model's API is now available for direct use on the Volcano Engine platform.

According to ByteDance's official announcement, DouBao 1.5 Pro uses a large-scale sparse MoE (mixture-of-experts) architecture and a pre-training strategy built around a small number of activated parameters, delivering performance equivalent to a dense model with seven times as many activated parameters. ByteDance says this sevenfold performance leverage is well above the roughly threefold leverage typical of MoE architectures in the industry.
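To make the "activated parameters" idea concrete, here is a minimal sketch of the arithmetic for a sparse MoE model, in which each token passes through only a few experts. The announcement does not disclose Doubao-1.5-pro's parameter counts, expert count, or routing scheme, so every number below and the `MoEConfig` and `dense_equivalent` names are hypothetical; only the sevenfold figure comes from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEConfig:
    """Hypothetical sparse MoE sizing; not Doubao-1.5-pro's real configuration."""
    shared_params: int      # parameters every token uses (attention, embeddings, ...)
    num_experts: int        # total number of expert networks
    params_per_expert: int  # parameters in each expert
    top_k: int              # experts each token is routed to

    def total_params(self) -> int:
        # All experts count toward the model's stored size.
        return self.shared_params + self.num_experts * self.params_per_expert

    def activated_params(self) -> int:
        # Only the top_k routed experts do work for a given token.
        return self.shared_params + self.top_k * self.params_per_expert


def dense_equivalent(activated: int, leverage: int = 7) -> int:
    """Dense model size matched in performance, per the claimed leverage."""
    return activated * leverage


# Illustrative numbers only (assumptions, not published figures).
cfg = MoEConfig(
    shared_params=2_000_000_000,
    num_experts=64,
    params_per_expert=1_000_000_000,
    top_k=2,
)
total = cfg.total_params()
activated = cfg.activated_params()
equivalent = dense_equivalent(activated)

print(f"total: {total / 1e9:.0f}B")
print(f"activated per token: {activated / 1e9:.0f}B")
print(f"claimed dense equivalent: {equivalent / 1e9:.0f}B")
```

With these made-up numbers, the model stores 66B parameters but uses only 4B per token, and a sevenfold leverage would put it on par with a 28B dense model.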

Beyond Doubao-1.5-pro, ByteDance has concurrently launched the updated visual understanding model Doubao-1.5-vision-pro and the real-time speech model Doubao-1.5-realtime-voice-pro.

Doubao-1.5-vision-pro has been upgraded in multimodal data synthesis, dynamic resolution, multimodal alignment, and hybrid training, among other areas, improving its visual reasoning, text and document recognition accuracy, fine-grained information comprehension, and instruction following. Its responses are also more concise and user-friendly, making interaction smoother and more natural.

The DouBao 1.5 release also introduces the family's first real-time speech model, Doubao-1.5-realtime-voice-pro, now fully available in the DouBao App (version 7.2.0 or later required). The model integrates speech understanding and generation, enabling end-to-end voice dialogue with low latency and support for interruption. Volcano Engine is reportedly set to launch the corresponding API service on its Ark platform in the first half of the year, giving more developers convenient access.

ByteDance also emphasizes that no data generated by other models was used in training DouBao 1.5 Pro, which it says ensures the model's independence and originality. All models in the DouBao 1.5 series, including Doubao-1.5-pro, Doubao-1.5-lite, and Doubao-1.5-vision-pro, will keep their original pricing while offering enhanced capabilities, delivering added value without a price increase.

The launch of DouBao large model 1.5 Pro marks a new step forward for ByteDance in artificial intelligence and brings smarter, more convenient experiences to a wide range of users. ByteDance says it will continue to invest in AI and to introduce new products and technologies that contribute to the industry's development.