Stability AI Launches Next-Generation Image Generation Model

2024-10-23

After navigating a series of controversies caused by technical glitches and licensing changes, AI startup Stability AI has unveiled its latest image generation model series, Stable Diffusion 3.5.

The Stable Diffusion 3.5 series offers enhancements in customization, versatility, and performance over Stability AI's previous technologies. This series comprises three models:

  • Stable Diffusion 3.5 Large: With 8 billion parameters, this is the most powerful model in the series and can generate images of up to 1 million pixels (1 megapixel). In general, a higher parameter count correlates with stronger model performance.
  • Stable Diffusion 3.5 Large Turbo: A streamlined version of Stable Diffusion 3.5 Large, it offers faster image generation speeds at the expense of some quality.
  • Stable Diffusion 3.5 Medium: Optimized for edge devices such as smartphones and laptops, this model can generate images with resolutions ranging from 250,000 to 2 million pixels. It is scheduled for release on October 29.
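
To put the pixel figures above in concrete terms, the following minimal sketch converts a total pixel budget into image dimensions for a given aspect ratio. It assumes the article's figures are exact pixel counts (1 million = 1,000,000); the function name and parameters are illustrative and not part of any Stability AI tooling.

```python
import math


def dimensions_for_budget(total_pixels: int, aspect_w: int = 1, aspect_h: int = 1) -> tuple[int, int]:
    """Return (width, height) at the given aspect ratio whose area does not exceed total_pixels.

    Hypothetical helper for illustration only; real models may also require
    dimensions to be multiples of some block size, which is not handled here.
    """
    # Solve width * height = total_pixels subject to width / height = aspect_w / aspect_h.
    scale = math.sqrt(total_pixels / (aspect_w * aspect_h))
    # Round down so the resulting area never exceeds the budget.
    width = math.floor(scale * aspect_w)
    height = math.floor(scale * aspect_h)
    return width, height


if __name__ == "__main__":
    print(dimensions_for_budget(1_000_000))         # square image at the Large model's stated ceiling
    print(dimensions_for_budget(250_000))           # square image at the Medium model's stated floor
    print(dimensions_for_budget(2_000_000, 16, 9))  # widescreen image at the Medium model's stated ceiling
```

Under these assumptions, a 1-million-pixel budget corresponds to a 1000 × 1000 square image, and 250,000 pixels to a 500 × 500 one.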

Stability AI claims that the Stable Diffusion 3.5 series is capable of producing more diverse outputs, generating images of individuals with varying skin tones and features without the need for extensive prompts.

In an interview, Stability AI's Chief Technology Officer, Hanno Basse, said that during training each image was paired with multiple versions of its prompt, with shorter prompts prioritized. According to Basse, this approach yields a broader and more diverse distribution of image concepts for any given text description. Like many generative AI companies, Stability AI trains its models on curated publicly available datasets and synthetic data.
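
Stability AI has not published how shorter prompts were prioritized. As one hypothetical illustration of the idea, the sketch below picks a caption for an image at random, weighting each candidate by the inverse of its word count so that shorter captions are drawn more often. The weighting scheme and the function name are assumptions made for this example, not the company's method.

```python
import random


def sample_caption(captions: list[str], rng: random.Random) -> str:
    """Pick one caption, favoring shorter ones.

    Hypothetical weighting: each caption's weight is 1 / (number of words).
    This is an assumed scheme for illustration, not Stability AI's pipeline.
    """
    # max(..., 1) guards against an empty caption causing division by zero.
    weights = [1.0 / max(len(caption.split()), 1) for caption in captions]
    return rng.choices(captions, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = [
        "a cat",
        "a fluffy orange cat sitting on a windowsill at sunset",
    ]
    draws = [sample_caption(candidates, rng) for _ in range(1000)]
    # The two-word caption has weight 1/2 and the ten-word caption 1/10,
    # so the short one should dominate the draws.
    print(draws.count("a cat"), "short vs", draws.count(candidates[1]), "long")
```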

Stability AI's previous flagship image generator, Stable Diffusion 3 Medium, was widely criticized for unusual artifacts and poor adherence to prompts. The company cautions that the Stable Diffusion 3.5 models may exhibit similar prompt-related errors because of engineering and architectural trade-offs. However, Stability AI also asserts that these models are more robust than their predecessors at generating images across various styles, including 3D art.

The Stable Diffusion 3.5 series models are available for free for non-commercial purposes, including research. Businesses with annual revenues below $1 million may also commercialize them at no cost. However, organizations with annual revenues exceeding $1 million are required to enter into an enterprise licensing agreement with Stability AI.
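
The licensing tiers described above can be summarized as a small decision function. This is a sketch of the rules as the article states them, not a reading of the license text itself; the function name and return strings are illustrative. Note that the article does not say how revenue of exactly $1 million is treated, so that case is reported as unspecified rather than guessed.

```python
REVENUE_THRESHOLD_USD = 1_000_000  # threshold quoted in the article


def license_tier(commercial_use: bool, annual_revenue_usd: float) -> str:
    """Map a use case to the licensing tier described in the article.

    Hypothetical helper; consult the actual license terms before relying on it.
    """
    if not commercial_use:
        # Non-commercial use, including research, is free.
        return "free (non-commercial)"
    if annual_revenue_usd < REVENUE_THRESHOLD_USD:
        # Businesses below the threshold may commercialize at no cost.
        return "free (commercial)"
    if annual_revenue_usd > REVENUE_THRESHOLD_USD:
        # Organizations above the threshold need an enterprise agreement.
        return "enterprise license required"
    # Exactly at the threshold: the article does not cover this case.
    return "unspecified in the article"


if __name__ == "__main__":
    print(license_tier(False, 50_000_000))
    print(license_tier(True, 250_000))
    print(license_tier(True, 5_000_000))
```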

This summer, Stability AI sparked controversy over stringent fine-tuning terms that appeared to grant the company the authority to charge for models trained using its image generators. After facing strong opposition, the company revised these terms to permit more flexible commercial usage. Stability AI reiterated that users retain ownership of the media generated using its models.

The Stable Diffusion 3.5 Large and 3.5 Large Turbo models can be self-hosted or accessed through Stability AI's API as well as third-party platforms such as Hugging Face, Fireworks, Replicate, and ComfyUI. Stability AI announced plans to release ControlNets for these models in the coming days to facilitate fine-tuning.

Like most AI models, Stability AI's models are trained on publicly available web data, some of which may be copyrighted or subject to restrictive licenses. Stability AI and many other AI providers contend that the principle of fair use shields them from copyright infringement claims. However, this has not prevented data owners from initiating an increasing number of class-action lawsuits.

Stability AI leaves customers to handle copyright claims on their own and, unlike some other providers, offers no indemnification in the event that it is found liable. However, Stability AI does allow data owners to request the removal of their data from its training datasets. As of March 2023, artists had removed 80 million images from Stable Diffusion's training data.

When asked about information security measures for the upcoming U.S. elections, Stability AI stated that it has taken, and will continue to take, reasonable measures to prevent the misuse of Stable Diffusion by malicious actors. However, the company declined to disclose technical details of these measures.