Doubao, ByteDance's AI assistant, has upgraded its text-to-image feature so that users can generate images containing specified text in a single step. Users state the text they want in the image generation prompt, for example by entering "An image containing the text 'cutting-edge technology'," and the system produces an image that matches. The feature is currently being tested in the Doubao app, and the iDream platform has also begun limited testing.
According to a report today by Huxiu, a member of Doubao's large model team said that the text-to-image model combines LLM (large language model) and DiT (Diffusion Transformer) architectures, which has significantly improved its ability to learn from native Chinese data. Building on this foundation, the team specifically optimized the model's generation of text characters, greatly improving the quality of the generated images. Doubao's web and desktop versions are reportedly set to add the feature in the near future.
Separately, at the beginning of this month Doubao introduced an image understanding feature. New photo and camera buttons in the Doubao app and the Doubao PC version let users upload images, which the system then identifies and analyzes for content and basic characteristics. The "Image Understanding" feature not only recognizes the elements in an image but also helps users answer specific questions about it, such as where a pictured landmark is located or which film or television work a depicted character comes from.