DeepSeek Releases Janus-Pro Multimodal Model, Entering Text-to-Image Generation Field

2025-01-28

In the early hours of January 28, DeepSeek, a Chinese large model research and development company, officially released the Janus-Pro multimodal large model on GitHub, marking its further expansion into the text-to-image generation field.

According to DeepSeek, Janus-Pro is an advanced version of the earlier Janus model and follows JanusFlow, which was released on November 13, 2024. Compared with its predecessor, Janus-Pro has an improved training strategy, an expanded training dataset, and a larger model size. Together, these changes yielded significant gains in multimodal understanding and in instruction following for text-to-image generation, and they also made text-to-image generation more stable.

Benchmark results published by DeepSeek show that Janus-Pro outperformed Stable Diffusion and OpenAI's DALL-E 3 on both the GenEval and DPG-Bench benchmarks, underscoring its competitiveness in text-to-image generation.

Notably, all four models in the Janus series are now open source and available to developers and researchers. This move is expected to promote further advances in text-to-image generation technology and to foster innovation and applications in related fields.

DeepSeek's release of Janus-Pro gives developers and researchers a new option for text-to-image generation. The model is expected to find use across a range of application scenarios and to drive further progress in related technologies.