"NVIDIA and Microsoft collaborate to launch AI Factory and new foundational models on Azure"

2023-11-16

Nvidia announced today that it will launch an AI factory on Microsoft's Azure cloud, allowing any customer to build its own generative AI models for applications such as chatbots, prediction, and image generation.

The company announced the service at today's Microsoft Ignite 2023 conference, calling it the Nvidia-hosted Generative AI Factory on Azure and saying it brings together the company's AI enterprise infrastructure and developer tools to improve developer productivity.

Manuvir Das, Vice President of Enterprise Computing at Nvidia, said, "Now, Azure provides the entire end-to-end workflow, including all infrastructure and software. This means that any customer can enter the Azure marketplace and obtain the components they need."

The offering includes the DGX Cloud platform, Nvidia's cloud hardware solution for provisioning and running AI workloads on demand, now available through the Azure Marketplace. Companies can launch DGX Cloud instances with eight A100 80 GB graphics processing units each, which can be scaled across multiple nodes to deliver the computing power needed for tasks such as training and fine-tuning AI models. DGX Cloud was previously available on Oracle Cloud.
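As a minimal sketch, assuming a PyTorch installation with CUDA support, a training job might confirm the eight-GPU topology of such an instance before launching distributed work. Nothing below is part of the DGX Cloud or Azure APIs; it only queries the local devices.

```python
# Sketch: verify the GPU layout of an 8x A100 80 GB node before starting work.
# Assumes PyTorch with CUDA is installed; not tied to any DGX Cloud/Azure API.
import torch

def describe_node() -> None:
    if not torch.cuda.is_available():
        raise RuntimeError("No CUDA devices visible on this instance")
    count = torch.cuda.device_count()          # expected: 8 on a DGX Cloud node
    for idx in range(count):
        props = torch.cuda.get_device_properties(idx)
        mem_gb = props.total_memory / 1024**3  # an A100 80 GB reports roughly 80 GB
        print(f"GPU {idx}: {props.name}, {mem_gb:.0f} GB")

if __name__ == "__main__":
    describe_node()
```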

Nvidia also announced plans to bring its newly launched H200 Tensor Core GPU to Azure next year to support larger workloads. The new GPU is designed for the most demanding AI workloads, including large language models and other generative AI models. It offers 141 GB of HBM3e memory, nearly 1.8 times the capacity of the previous generation, and reaches 4.8 TB/s of memory bandwidth, roughly 1.4 times that of its predecessor.
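A quick back-of-the-envelope check of those ratios against commonly cited H100 figures (80 GB of HBM3, about 3.35 TB/s of bandwidth); the H100 numbers are an assumption on our part, not part of the announcement.

```python
# Sanity-check the quoted H200 gains against assumed H100 figures.
h200_mem_gb, h100_mem_gb = 141, 80       # H100 capacity assumed, not from the announcement
h200_bw_tbs, h100_bw_tbs = 4.8, 3.35     # H100 bandwidth assumed, not from the announcement

print(f"Memory capacity ratio:  {h200_mem_gb / h100_mem_gb:.2f}x")   # ~1.76x, i.e. "nearly 1.8x"
print(f"Memory bandwidth ratio: {h200_bw_tbs / h100_bw_tbs:.2f}x")   # ~1.43x, i.e. "about 1.4x"
```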

Nvidia-optimized AI foundation models

To support industry-specific generative AI models, Nvidia announced its own family of generative AI foundation models, Nemotron-3 8B, along with endpoints for optimized open-source models.

The Nemotron-3 8B family is a set of 8-billion-parameter large language models optimized to run on Nvidia hardware, aimed at enterprise customers building secure and reliable generative AI applications. The models support multiple languages, are trained on "responsible datasets," and offer performance comparable to larger models in enterprise deployments.

The Nemotron-3 8B models are ready to use and proficient in over 50 different languages, including English, German, Russian, Spanish, French, Japanese, Chinese, Korean, Italian, and Dutch.

Erik Pounds, Senior Director of Nvidia AI Software, said, "The new Nvidia Nemotron-3 8B model series also includes models that support the creation of state-of-the-art enterprise chat and question-answering applications, applicable to various industries including healthcare, telecommunications, and financial services."
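For a sense of how such a chat application might consume one of these models, here is a hypothetical sketch of calling a hosted Nemotron-3 8B chat endpoint over HTTP. The URL, payload schema, and auth header are illustrative assumptions; the announcement does not specify the endpoint's actual API.

```python
# Hypothetical sketch: POST a chat request to a hosted Nemotron-3 8B endpoint.
# The endpoint URL, request schema, and auth scheme below are assumptions.
import os
import requests

ENDPOINT = "https://example-endpoint.invalid/nemotron-3-8b-chat"  # placeholder URL
API_KEY = os.environ["ENDPOINT_API_KEY"]                          # placeholder credential

payload = {
    "messages": [
        {"role": "user", "content": "Summarize our Q3 support-ticket trends."}
    ],
    "max_tokens": 256,
}

resp = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```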

The company also offers a curated collection of optimized models, including popular community models such as Meta Platforms Inc.'s Llama 2, Mistral from the Paris-based startup Mistral AI, and Stability AI's Stable Diffusion XL for image generation.

These foundation models can be customized with an enterprise customer's proprietary data to specialize them for particular use cases. Once tuned, they can be deployed almost anywhere to power AI-based applications.
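The announcement does not name specific customization tooling, but one common pattern (an assumption here) is parameter-efficient fine-tuning of a curated community model such as Llama 2 with Hugging Face transformers and peft, then training the small adapters on the proprietary dataset. Model access, licensing, and GPU sizing are likewise assumed.

```python
# Sketch of one common customization pattern (LoRA fine-tuning); this is an
# assumption about tooling, not the method named in the announcement.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model_id = "meta-llama/Llama-2-7b-hf"   # requires accepting Meta's license terms

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")

# Attach small low-rank adapters instead of updating all base weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model's parameters

# From here, train on the enterprise's proprietary dataset with a standard
# trainer (e.g., transformers.Trainer), then deploy the adapter-augmented
# model wherever it is needed.
```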