Microsoft Utilizes LASER Technology to Mitigate Inaccuracies in Large Language Models

2024-02-01

At the Microsoft Research Forum in January, Dipendra Misra, a senior researcher at Microsoft Research Lab New York and AI Frontiers, explained how LASER (LAyer-SElective Rank reduction) can make large language models more accurate. With LASER, researchers can "intervene" in a trained model and replace one of its weight matrices with a smaller, approximate one. Weights encode the associations the model has learned from its training data; the larger a weight, the more the model relies on the connection it represents.

So wouldn't swapping out a matrix full of learned associations and context for a mere approximation make the model less accurate? According to the team's test results, the answer is surprisingly no. "We are intervening in the LLM with LASER, so you would expect the model's loss to increase as we do more approximation, meaning its performance should get worse, right? Because we are discarding information from an LLM trained on a large amount of data," Misra said. "But to our surprise, we found that if the right type of LASER intervention is performed, the loss does not increase; in fact, it decreases."

Misra said his team successfully applied LASER to three different open-source models: RoBERTa, Llama 2, and EleutherAI's GPT-J. Improvements sometimes reached 20 to 30 percentage points. For example, GPT-J's accuracy at predicting gender from biographies rose from 70.9% to 97.5% after a LASER intervention.

The accuracy of large language models remains a concern, because AI models often make factual errors. And hallucination is not the only thing to fear: hallucination is more a matter of misunderstanding things than of fabricating them. Both hallucinations and inaccurate AI models can cause real harm.
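The core operation behind LASER, replacing a weight matrix with a lower-rank approximation, can be sketched with a truncated singular value decomposition (SVD). This is a minimal NumPy illustration of the matrix math only, not the paper's full procedure, which also selects which layer and which rank to intervene on; the function name and toy sizes here are illustrative assumptions.

```python
import numpy as np

def low_rank_approximation(W: np.ndarray, rank: int) -> np.ndarray:
    """Return the best rank-`rank` approximation of W (in the
    least-squares sense) via truncated SVD. The result has the
    same shape as W but can be stored with far fewer parameters."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

# Toy example: approximate a random 8x8 "weight matrix" with rank 2.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
W_approx = low_rank_approximation(W, rank=2)

print(W_approx.shape)                      # same shape as W
print(np.linalg.matrix_rank(W_approx))    # rank reduced to 2
```

A rank-2 copy of an 8x8 matrix needs only 2*(8+8) numbers instead of 64, which is why such an intervention discards information; the surprising finding reported above is that, done selectively, this can lower loss rather than raise it.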