Stability AI Releases StableLM Zephyr 3B: A Small, High-Performance Chat Language Model

2023-12-08

Stability AI is perhaps best known for its Stable Diffusion family of text-to-image generation models, but that is no longer the company's only business.

Stability AI recently released its latest model, StableLM Zephyr 3B, a 3-billion-parameter large language model (LLM) suited to chat scenarios, including text generation, summarization, and content personalization. The new model is a smaller, optimized version of the StableLM text generation model that Stability AI first discussed in April.

At 3 billion parameters, StableLM Zephyr 3B is smaller than the 7-billion-parameter StableLM model, and the smaller size brings a number of benefits: the model can be deployed on a wider range of hardware, consumes fewer resources, and responds quickly. It has been optimized for question-answering and instruction-following tasks.
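As an illustration of the lightweight, instruction-following use the company describes, here is a minimal chat sketch. It assumes the model ships on the Hugging Face hub under the id stabilityai/stablelm-zephyr-3b with a standard chat template; neither detail comes from the article, so check the model card before relying on them.

```python
# Minimal sketch of running StableLM Zephyr 3B as a local chat assistant.
# The model id "stabilityai/stablelm-zephyr-3b" is an assumption; verify it
# against the Hugging Face hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-zephyr-3b"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 3B parameters fit comfortably on modest GPUs
    device_map="auto",
    trust_remote_code=True,       # needed on transformers versions without native StableLM support
)

messages = [{"role": "user", "content": "Summarize what DPO fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```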

"StableLM has been trained with longer duration and higher quality data, for example, the number of tokens used is twice that of the LLaMA v2 7b, despite being only 40% of its scale, it is on par with it in terms of foundational performance," said Emad Mostaque, CEO of Stability AI.

About StableLM Zephyr 3B

StableLM Zephyr 3B is not an entirely new model but an extension of Stability AI's existing StableLM 3B-4E1T model.

Zephyr's design approach is inspired by Hugging Face's Zephyr 7B model, which was developed under the open-source MIT license with the aim of serving as an assistant. Zephyr 7B is trained with a method called Direct Preference Optimization (DPO), from which StableLM Zephyr 3B now also benefits.

Mostaque explained that DPO is an alternative to the reinforcement learning approach used in earlier models to fine-tune them toward human preferences. DPO has typically been applied to larger 7-billion-parameter models, making StableLM Zephyr one of the first 3-billion-parameter models to use the technique.
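The article does not show the mathematics, but the core of DPO can be sketched in a few lines: rather than training a separate reward model and running reinforcement learning, DPO directly widens the gap between the policy's log-probability of a preferred response and that of a rejected one, each measured relative to a frozen reference model. The snippet below is a generic illustration of that loss, not Stability AI's training code; all tensor names are hypothetical.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (Rafailov et al., 2023).

    Each argument is a tensor of summed per-token log-probabilities for a
    batch of (prompt, response) pairs; "chosen" responses were preferred
    by annotators over "rejected" ones.
    """
    # Log-ratios of the trainable policy vs. the frozen reference model.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Push the policy to widen the margin between preferred and rejected
    # responses; beta controls how far it may drift from the reference.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with hypothetical log-probability values:
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
```

Because the reference model is fixed, the whole objective reduces to a single supervised-style loss over preference pairs, which is what makes DPO cheaper and simpler than a full reinforcement learning pipeline.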

Stability AI used the UltraFeedback dataset from the OpenBMB research group for DPO. UltraFeedback contains over 64,000 prompts and 256,000 responses. The combination of DPO, the smaller scale, and an optimized training dataset gives StableLM Zephyr 3B solid results on the metrics Stability AI reports: in the MT-Bench evaluation, for example, it outperforms larger models including Meta's Llama-2-70b-chat and Anthropic's Claude-V1.
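UltraFeedback is distributed through the Hugging Face hub, so the preference data is easy to inspect directly. The dataset id below is an assumption based on OpenBMB's public releases, and the schema is printed rather than guessed at.

```python
# Peek at the UltraFeedback preference data used for DPO fine-tuning.
# The dataset id "openbmb/UltraFeedback" is assumed; check the Hugging Face hub.
from datasets import load_dataset

dataset = load_dataset("openbmb/UltraFeedback", split="train")
print(len(dataset))       # on the order of 64k prompts
print(dataset[0].keys())  # inspect the record schema rather than assume it
```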

Stability AI's Growing Suite of Models

StableLM Zephyr 3B joins a series of new models Stability AI has released in recent months as the generative AI startup continues to advance its capabilities and tools.

In August, Stability AI released StableCode, a generative AI model for application code development. That was followed in September by Stable Audio, a new text-to-audio generation tool. Then in November, the company jumped into video generation with a preview of Stable Video Diffusion.

Even as it expands into new domains, Stability AI has not forgotten its foundation in text-to-image generation. Last week, the company released SDXL Turbo, a faster version of its flagship SDXL text-to-image diffusion model.

Mostaque also made it clear that there will be more innovation to come from Stability AI.

"We believe that small, open, and high-performing models adjusted to users' own data will surpass larger general models," Mostaque said. "With the future full release of our brand-new StableLM model, we look forward to further democratizing generative language models."