Nvidia has launched a new microservice designed to help AI engineers build generative AI applications that can store and retrieve data across multiple languages, making it easier for those applications to operate across national boundaries.
To improve the accuracy of data retrieval for generative AI in multilingual environments, Nvidia has added multilingual support to its NeMo Retriever software, available through the company's API catalog for developers. The software can understand and process data in a range of languages and formats, converting it into text to provide context-aware search results.
NeMo Retriever lets developers build information ingestion and retrieval pipelines for AI models, extracting structured and unstructured data from text, documents, and tables while avoiding duplicate content. It uses embeddings to convert data into numerical representations that AI models can work with and stores them in a vector database.
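As a rough illustration of how such a pipeline fits together (this is not NeMo Retriever's actual API; the toy_embed function below is a stand-in for a real multilingual embedding model, and an in-memory list stands in for a vector database), the flow is: embed each document at ingestion time, then embed the query and rank documents by vector similarity.

```python
import hashlib
import numpy as np

def toy_embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedder: hashes words into a fixed-size vector.
    It only captures literal word overlap; a real multilingual embedding
    model would place semantically similar sentences close together
    regardless of language."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# "Ingestion": embed documents (in any language) and keep the vectors.
documents = [
    "Der Vertrag wurde im März unterzeichnet.",   # German
    "The contract was signed in March.",          # English
    "請求書は4月に送付されました。",                  # Japanese
]
index = [(doc, toy_embed(doc)) for doc in documents]

# "Retrieval": embed the query and rank documents by similarity.
query_vec = toy_embed("When was the contract signed?")
ranked = sorted(index, key=lambda item: float(item[1] @ query_vec), reverse=True)
for doc, vec in ranked:
    print(f"{float(vec @ query_vec):.3f}  {doc}")
```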
An embedding is a mathematical representation that captures the attributes of, and relationships between, pieces of data such as words and phrases. It lets a system measure how "close" two words or sentences are in meaning during search or reasoning. For example, "cat" and "dog" are considered close because both are animals and pets, while "toaster" and "dog" are far apart: both are common household items, but they belong to entirely different categories.
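That notion of closeness is usually measured with cosine similarity between embedding vectors. The sketch below uses hand-made three-dimensional vectors purely for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors along invented dimensions (animal-ness, pet-ness, appliance-ness).
cat     = np.array([0.9, 0.8, 0.0])
dog     = np.array([0.9, 0.9, 0.0])
toaster = np.array([0.0, 0.0, 1.0])

print(cosine_similarity(cat, dog))      # high: close in meaning
print(cosine_similarity(dog, toaster))  # near zero: unrelated concepts
```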
Kari Briski, Vice President of Generative AI Software at Nvidia, said in an interview that using Retriever to embed and retrieve data in its native language improves accuracy, chiefly because most AI training datasets are predominantly English. Translation introduces "information loss": each conversion can strip away context or accuracy.
Briski noted that when Retriever was first released, customers urgently requested multilingual support because translation software was hurting accuracy. Businesses often operate in multiple languages, embedding English documents, German texts, Japanese materials, or Russian research reports. All of that information needs to be searchable by the same model, but the more tools sit in between, the lower the accuracy becomes.
In addition to ingestion, NeMo Retriever can "evaluate and re-rank" results to ensure the accuracy of answers. When a query passes through Retriever, it examines the candidates returned from the vector database and ranks the retrieved information by its relevance to the query.
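A generic re-ranking stage can be sketched as follows: the vector search returns a shortlist by embedding similarity, and a second scoring pass reorders it before the answer is generated. The rerank_score function here is a placeholder (crude word overlap) standing in for a real re-ranking model that scores each query-passage pair directly.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    vector_score: float  # similarity from the first-pass vector search

def rerank_score(query: str, passage: str) -> float:
    """Placeholder scorer: fraction of query words found in the passage.
    A real system would use a re-ranking model here."""
    q_words, p_words = set(query.lower().split()), set(passage.lower().split())
    return len(q_words & p_words) / max(len(q_words), 1)

def rerank(query: str, candidates: list[Candidate], top_k: int = 3) -> list[Candidate]:
    """Re-order the vector-search shortlist by the second-pass score."""
    return sorted(candidates, key=lambda c: rerank_score(query, c.text), reverse=True)[:top_k]

shortlist = [
    Candidate("Quarterly revenue grew 12 percent year over year.", 0.81),
    Candidate("The revenue figures for the quarter were published Tuesday.", 0.79),
    Candidate("Office relocation is planned for next quarter.", 0.77),
]
for c in rerank("quarterly revenue figures", shortlist):
    print(c.text)
```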
Nvidia collaborated with DataStax to use NeMo Retriever to convert 10 million Wikipedia entries into an AI-ready format in less than three days, a process that typically takes 30 days.
Furthermore, Nvidia partners including Cohesity, Cloudera, SAP, and VAST Data are integrating support for the new microservices to handle large multilingual data sources. This includes retrieval-augmented generation, which lets a pre-trained generative AI model draw on richer, more relevant information from real-time data sources. Multilingual support means businesses can put far more of their data to use.
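In a retrieval-augmented generation setup, the retrieved passages are simply folded into the prompt that the generative model sees. The sketch below shows only that prompt-assembly step; build_rag_prompt is a hypothetical helper, and the passages would come from a retriever such as the pipeline sketched earlier.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that grounds the model's answer in retrieved text."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "When was the contract signed?",
    ["Der Vertrag wurde im März unterzeichnet.",
     "The invoice was sent in April."],
)
print(prompt)  # this string would then be passed to the generative model
```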
For now, the multilingual version of NeMo Retriever supports only text retrieval and responses. Briski said the company is exploring future support for multimodal data such as images, PDFs, and video. "We are currently focusing on text because if you can do well with text, you can achieve good results with other forms of data as well," she said.