"Updated AI Model in Ideogram: New Features and Enhanced Performance"

2024-04-15

Just over a month after launching its most powerful text-to-image model, Ideogram has rolled out an update that adds several new features, including image-description prompts and negative prompts. The features, available on Ideogram's web platform, aim to give users more control over image generation and to improve the overall quality and coherence of the output. The update strengthens the service's position in image generation, where it competes with tools such as Midjourney and DALL-E. Users can try the new features now, although not all of them are available on the free tier.

What are the new changes in Ideogram?

When Ideogram released version 1.0 of its model in February, users already had access to Magic Prompt, a feature that expands and refines the details of user input. Now, building on this work, the company has introduced a new "Describe" feature that generates descriptions or captions based on reference images.

In short, users can now not only obtain publicly generated images from Ideogram but also upload their own images and generate text-based descriptions for them. These descriptions can then serve as prompts to generate highly similar images. If desired, users can also modify the generated descriptions to adjust the output according to their needs.

But the surprises don't end there.

In addition to descriptions of reference images, Ideogram has introduced negative prompts and an option on the platform to choose between "quick," "default," and "quality" rendering modes. Negative prompts let users tell the model what content they do not want to see in the output. This feature aims to help users remove certain objects or adjust the generated style.

The rendering modes, meanwhile, let users trade generation speed against quality. According to Ideogram, the quick mode can generate an image in about five seconds but with basic quality, while the quality mode focuses on photorealism and detail and takes approximately 20 seconds. The default mode strikes a balance between the two, taking around 12 seconds.

Although it is currently unclear how many users will actually utilize these modes, Ideogram states that users can use these options to quickly generate basic images and iterate on them to achieve higher-quality results.
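To make the trade-off concrete, the short sketch below adds up the waiting time for an iterate-then-finalize workflow using the approximate per-image times quoted above. The function name and the choice of four drafts are illustrative assumptions, not anything Ideogram publishes, and actual generation times will vary.

```python
# Approximate per-image generation times reported by Ideogram, in seconds.
MODE_SECONDS = {"quick": 5, "default": 12, "quality": 20}


def workflow_seconds(drafts: int, draft_mode: str = "quick",
                     final_mode: str = "quality") -> int:
    """Total wait for `drafts` draft images followed by one final render.

    Hypothetical helper for illustration only; it simply sums the
    quoted per-image times and ignores queueing or network delays.
    """
    if drafts < 0:
        raise ValueError("drafts must be non-negative")
    return drafts * MODE_SECONDS[draft_mode] + MODE_SECONDS[final_mode]


if __name__ == "__main__":
    # Four quick drafts, then one quality render.
    iterate_then_finalize = workflow_seconds(4)
    # The same five images, all rendered in quality mode.
    all_quality = workflow_seconds(4, draft_mode="quality")
    print(iterate_then_finalize, all_quality)  # prints: 40 100
```

Under these assumed numbers, drafting in quick mode and finishing with a single quality render takes well under half the time of rendering every attempt in quality mode.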

Improved photorealism and text rendering

Lastly, Ideogram claims that the latest update further improves its text rendering, reducing error rates by 15%. While this may seem like a small change, the company states that its performance in generating characters and words has surpassed that of DALL-E 3 Vivid.

Although Ideogram has not shared statistical data comparing its upgraded model with Midjourney, a leader in the AI image generation category, it does claim that the model delivers stronger image coherence and photorealism, and that more human evaluators prefer its output.

"In terms of prompt alignment, image coherence, and text rendering quality, human evaluators prefer the images generated by the upgraded model, with a 30% to 50% improvement compared to the previous version," wrote the company in a blog post.

Currently, negative prompts and the new speed modes are only available to Ideogram Basic and Plus plan subscribers. As for the availability of the captioning feature for reference images, there is currently no clear information. However, we speculate that it may be free, as it is similar to the Remix feature offered by the company, which allows users to generate images similar to existing reference images. The improvements in text and image coherence are available to all users.