Snowflake releases open-source enterprise LLM with 480 billion parameters - Arctic

2024-04-25

Following its open-source release of the Arctic family of text embedding models, Snowflake has added a new LLM aimed at enterprise use cases. The company says Snowflake Arctic sets a new standard for openness and enterprise-grade performance.


Arctic is built on a Mixture-of-Experts (MoE) architecture optimized for complex enterprise workloads, and it leads several industry benchmarks in SQL code generation, instruction following, and more.


Arctic's MoE design, combined with a carefully designed data composition tailored to enterprise needs, improves both the training system and model performance. Arctic activates only 17 billion of its 480 billion parameters at a time, which Snowflake says delivers industry-leading quality with unprecedented token efficiency.
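To make the "17 billion of 480 billion" figure concrete, here is a minimal Python sketch of sparse expert routing. Only the two parameter counts come from the article. The router, the four toy experts, and the choice of k=2 are hypothetical illustrations of the general MoE idea, not Snowflake's implementation.

```python
import math

TOTAL_PARAMS_B = 480   # total parameters in billions (from the article)
ACTIVE_PARAMS_B = 17   # parameters active at a time in billions (from the article)


def active_fraction(active, total):
    """Share of the model's parameters that are used at a time."""
    return active / total


def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are the selected experts' softmax probabilities, renormalized
    to sum to 1. Hypothetical toy router, not Arctic's actual gating code.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k=2):
    """Combine the outputs of the selected experts only; the rest never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"active share: {share:.1%}")

    # Four toy "experts" that simply scale their input (hypothetical).
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    print(moe_forward(10.0, experts, router_scores=[0.1, 2.0, 0.3, 2.0]))
```

In this toy setup only two of the four experts do any work for a given input. The same principle lets a very large MoE model pay the compute cost of a much smaller one, which is why roughly 3.5% of Arctic's parameters are active at a time.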


"Despite the computing budget being reduced to one-sixteenth of the original, Arctic is on par with Llama3 70B in language understanding and reasoning, and surpasses it in enterprise metrics," said Baris Gultekin, Snowflake's AI lead.


Compared to other models, Arctic activates about 50% fewer parameters than DBRX and 80% fewer than Grok-1 during inference or training. It also outperforms leading open-source models such as DBRX, Llama 2 70B, and Mixtral-8x7B in coding (HumanEval+, MBPP+), SQL generation (Spider and Bird-SQL), and general language understanding (MMLU).


"For Snowflake, this is a milestone moment as our AI research team innovates at the forefront of AI," said Sridhar Ramaswamy, CEO of Snowflake. "We provide industry-leading intelligence and efficiency to the AI community in a truly open manner, further pushing the boundaries of what open-source AI can achieve. Our research on Arctic will significantly enhance our ability to deliver reliable and efficient AI to our customers."


Best Open-Source Model?


Snowflake has also released Arctic's weights, along with the research details behind its training, under the Apache 2.0 license, which the company says sets a new level of openness for enterprise AI technology. "With the Apache 2 licensed Snowflake Arctic embedding model series, organizations now have a more open alternative to black-box API providers like Cohere, OpenAI, or Google," Snowflake said.


"The continuous development and healthy competition of open-source AI models are crucial not only for the success of Perplexity but also for the future democratization of generative AI for everyone," said Aravind Srinivas, Co-founder and CEO of Perplexity. "We look forward to experimenting with Snowflake Arctic to customize it for our products and ultimately create greater value for our end users."


Snowflake bills Arctic, part of the Snowflake Arctic model family, as the most open LLM currently available: the Apache 2.0 license permits unrestricted personal, research, and commercial use. Snowflake also provides code templates along with flexible inference and training options, so users can quickly deploy and customize Arctic using their preferred frameworks, including NVIDIA NIM with NVIDIA TensorRT-LLM, vLLM, and Hugging Face.


Yoav Shoham, Co-founder and Co-CEO of AI21 Labs, said, "We are delighted to see Snowflake empowering enterprises to harness the power of open-source models, just like our recently released Jamba, the first production-grade Transformer-SSM model based on Mamba."


For immediate use, Arctic is available for serverless inference in Snowflake Cortex, Snowflake's fully managed service offering machine learning and AI solutions in the Data Cloud. It is also offered through other model gardens and catalogs such as Hugging Face, Lamini, Microsoft Azure, NVIDIA API Catalog, Perplexity, and Together.