Stability AI Releases StableLM Zephyr 3B: A Small, High-Performance Chat Language Model

2023-12-08

Stability AI is perhaps best known for its Stable Diffusion family of text-to-image generation models, but that is no longer the company's only business.

Stability AI recently released its latest model, StableLM Zephyr 3B, a 3-billion-parameter large language model (LLM) suited to chat scenarios, including text generation, summarization, and content personalization. The new model is a smaller, optimized version of the StableLM text generation model that Stability AI first discussed in April.

At 3 billion parameters, StableLM Zephyr 3B is smaller than the 7-billion-parameter StableLM model, and the smaller size brings a number of benefits: the model can be deployed on a wider range of hardware, consumes fewer resources, and responds quickly. It has been optimized for question-answering and instruction-following tasks.
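As an illustration of the lightweight, instruction-following use the company describes, here is a minimal chat sketch. It assumes the model ships on the Hugging Face hub under the id stabilityai/stablelm-zephyr-3b with a standard chat template; neither detail comes from the article, so check the model card before relying on them.

```python
# Minimal sketch of running StableLM Zephyr 3B as a local chat assistant.
# The model id "stabilityai/stablelm-zephyr-3b" is an assumption; verify it
# against the Hugging Face hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-zephyr-3b"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 3B parameters fit comfortably on modest GPUs
    device_map="auto",
    trust_remote_code=True,       # needed on transformers versions without native StableLM support
)

messages = [{"role": "user", "content": "Summarize what DPO fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```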

"StableLM has been trained with longer duration and higher quality data, for example, the number of tokens used is twice that of the LLaMA v2 7b, despite being only 40% of its scale, it is on par with it in terms of foundational performance," said Emad Mostaque, CEO of Stability AI.

About StableLM Zephyr 3B

StableLM Zephyr 3B is not an entirely new model but an extension of Stability AI's existing StableLM 3B-4E1T model.

Zephyr's design approach is inspired by Hugging Face's Zephyr 7B model, which was developed under the open-source MIT license with the aim of serving as an assistant. Zephyr 7B is trained with a method called Direct Preference Optimization (DPO), from which StableLM Zephyr 3B now also benefits.

Mostaque explained that DPO is an alternative to the reinforcement learning approach used in earlier models to fine-tune them toward human preferences. DPO has typically been applied to larger 7-billion-parameter models, making StableLM Zephyr one of the first 3-billion-parameter models to use the technique.
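The article does not show the mathematics, but the core of DPO can be sketched in a few lines: rather than training a separate reward model and running reinforcement learning, DPO directly widens the gap between the policy's log-probability of a preferred response and that of a rejected one, each measured relative to a frozen reference model. The snippet below is a generic illustration of that loss, not Stability AI's training code; all tensor names are hypothetical.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (Rafailov et al., 2023).

    Each argument is a tensor of summed per-token log-probabilities for a
    batch of (prompt, response) pairs; "chosen" responses were preferred
    by annotators over "rejected" ones.
    """
    # Log-ratios of the trainable policy vs. the frozen reference model.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Push the policy to widen the margin between preferred and rejected
    # responses; beta controls how far it may drift from the reference.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with hypothetical log-probability values:
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
```

Because the reference model is fixed, the whole objective reduces to a single supervised-style loss over preference pairs, which is what makes DPO cheaper and simpler than a full reinforcement learning pipeline.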

Stability AI used the UltraFeedback dataset from the OpenBMB research group for DPO. UltraFeedback contains over 64,000 prompts and 256,000 responses. The combination of DPO, the smaller scale, and an optimized training dataset gives StableLM Zephyr 3B solid results on the metrics Stability AI reports: in the MT-Bench evaluation, for example, it outperforms larger models including Meta's Llama-2-70b-chat and Anthropic's Claude-V1.
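UltraFeedback is distributed through the Hugging Face hub, so the preference data is easy to inspect directly. The dataset id below is an assumption based on OpenBMB's public releases, and the schema is printed rather than guessed at.

```python
# Peek at the UltraFeedback preference data used for DPO fine-tuning.
# The dataset id "openbmb/UltraFeedback" is assumed; check the Hugging Face hub.
from datasets import load_dataset

dataset = load_dataset("openbmb/UltraFeedback", split="train")
print(len(dataset))       # on the order of 64k prompts
print(dataset[0].keys())  # inspect the record schema rather than assume it
```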

Stability AI's Growing Suite of Models

StableLM Zephyr 3B joins a series of new models Stability AI has released in recent months as the generative AI startup continues to advance its capabilities and tools.

In August, Stability AI released StableCode, a generative AI model for application code development. That was followed in September by Stable Audio, a new text-to-audio generation tool. Then in November, the company jumped into video generation with a preview of Stable Video Diffusion.

Even as it expands into new domains, Stability AI has not forgotten its foundation in text-to-image generation. Last week, the company released SDXL Turbo, a faster version of its flagship SDXL text-to-image diffusion model.

Mostaque also made it clear that there will be more innovation to come from Stability AI.

"We believe that small, open, and high-performing models adjusted to users' own data will surpass larger general models," Mostaque said. "With the future full release of our brand-new StableLM model, we look forward to further democratizing generative language models."