Mistral launches new fine-tuning tool to streamline model customization process

2024-06-06

Fine-tuning is crucial for improving the output of large language models (LLMs) and adapting them to specific business needs. Done correctly, the process makes models more accurate and more useful, letting organizations derive greater value and precision from their generative AI applications.

However, fine-tuning is not easy and requires significant investment, which discourages some businesses.


Open-source AI model provider Mistral, reportedly valued at $6 billion just 14 months after its founding, is now entering the fine-tuning space with new customization features on its AI developer platform, La Plateforme.

The company claims that these new tools provide efficient fine-tuning services, reducing training costs and lowering the entry barrier.

This French company lives up to its name - "mistral" refers to a strong wind in the south of France - constantly introducing innovations and attracting millions of dollars in funding.

Mistral announced its new service in a blog post, stating, "When we customize smaller models to fit specific domains or use cases, it can offer performance on par with larger models, reducing deployment costs and improving application speed."

Customizing Mistral Models

Mistral gained recognition by releasing powerful LLMs under open-source licenses, meaning the models can be obtained, adapted, and used for free.

However, the company also offers paid tools, such as APIs and the developer platform La Plateforme, to streamline development for those who want to build on its models. Users can build applications on Mistral's models by making API calls, without deploying their own copy of a Mistral LLM on their servers. Pricing details are published on Mistral's website.


Now, in addition to building on existing models as-is, customers can also customize Mistral models on La Plateforme, on their own infrastructure using the open-source code Mistral has published on GitHub, or through custom training services.

For developers who prefer working on their own infrastructure, Mistral has released a lightweight codebase called mistral-finetune. It is based on the LoRA paradigm, which sharply reduces the number of trainable parameters.

Mistral stated in the blog post, "With mistral-finetune, you can fine-tune all our open-source models on your own infrastructure without sacrificing performance or memory efficiency."
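To see why the LoRA paradigm keeps fine-tuning cheap, here is a minimal NumPy sketch of the core idea (this is an illustration, not Mistral's implementation): the large pretrained weight matrix stays frozen, and only two small low-rank factors are trained. The dimensions and rank below are illustrative values.

```python
import numpy as np

# LoRA sketch: instead of updating a full d x d weight matrix W,
# train two small low-rank factors A (d x r) and B (r x d) and add
# their product to the frozen base weight.
d, r = 4096, 8                     # hidden size, LoRA rank (illustrative)
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))    # frozen pretrained weight
A = rng.standard_normal((d, r)) * 0.01  # trainable down-projection
B = np.zeros((r, d))               # trainable up-projection (init to zero)

def lora_forward(x):
    # y = x @ (W + A @ B): base output plus the low-rank update
    return x @ W + (x @ A) @ B

full_params = d * d                # parameters touched by full fine-tuning
lora_params = d * r + r * d        # parameters LoRA actually trains
print(f"trainable: {lora_params:,} vs {full_params:,} "
      f"({lora_params / full_params:.2%} of full fine-tuning)")
```

Because B starts at zero, the adapted model initially behaves exactly like the base model, and training only ever updates the tiny A and B matrices, which is what keeps memory and compute requirements low.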

Furthermore, for customers seeking serverless fine-tuning, Mistral now offers a new service built on technology developed in-house. Mistral says the underlying LoRA adapters keep serving efficient while preventing the fine-tuned model from forgetting its foundational knowledge.

"This is a new step in our mission to showcase advanced scientific methods to AI application developers," the company wrote in the blog post, highlighting that this service allows for rapid and cost-effective model adaptation.

The fine-tuning service currently supports Mistral's 7.3B-parameter model Mistral 7B as well as Mistral Small. Existing users can customize these models right away through Mistral's API, and the company plans to extend the fine-tuning service to new models in the coming weeks.
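Submitting a serverless fine-tuning job through the API might look roughly like the sketch below. The endpoint path, field names, and hyperparameter keys are assumptions based on the announcement, not verified documentation, so consult Mistral's API reference before relying on them.

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("MISTRAL_API_KEY", "")
BASE_URL = "https://api.mistral.ai/v1/fine_tuning/jobs"  # assumed endpoint

def build_payload(training_file_id: str, model: str = "open-mistral-7b") -> dict:
    # Assemble the job request body; field names here are assumptions.
    return {
        "model": model,                        # base model to customize
        "training_files": [training_file_id],  # ID of previously uploaded data
        "hyperparameters": {"training_steps": 100, "learning_rate": 1e-4},
    }

def create_finetune_job(training_file_id: str) -> dict:
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_payload(training_file_id)).encode(),
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    # Network call; requires a valid API key and an uploaded training file.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The training data would first be uploaded separately to obtain a file ID; the job then references that ID rather than inlining the data.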

Finally, the custom training service fine-tunes Mistral AI models on a customer's specific applications using their proprietary data, often employing advanced techniques such as continual pre-training to bake proprietary knowledge into the model weights.

"This approach enables them to create highly specialized and optimized models for their specific domains," according to Mistral's blog post.

To coincide with today's announcement, Mistral has also launched an AI fine-tuning hackathon. The competition will run until June 30th and allows developers to try out the new fine-tuning API provided by this startup.


Mistral Continues to Accelerate Innovation and Attract Funding

Founded in April 2023 by former Google DeepMind and Meta researchers Arthur Mensch, Guillaume Lample, and Timothée Lacroix, Mistral has grown at an extraordinary pace in its first 14 months.

The company is known for its record-breaking $118 million seed funding - reportedly the largest in European history - and its partnerships with companies like IBM shortly after its establishment. In February of this year, the company released Mistral Large through Azure Cloud in collaboration with Microsoft.

Just yesterday, SAP and Cisco announced their backing of Mistral, and at the end of last month the company launched Codestral, its first code-centric LLM, claiming it outperforms all other code models. The company is also reportedly close to closing a new $600 million funding round that would value it at $6 billion.

Mistral Large competes directly with OpenAI's GPT-4 and Meta's Llama 3; according to the company's benchmarks, it is the world's second-most-capable commercial language model, behind only GPT-4.

Mistral 7B was launched in September 2023, and the company claims it outperforms Llama 2 13B on many benchmarks and approaches the performance of CodeLlama 7B on code.

What will be Mistral's next move? We will soon find out.