Meta's System 2 Distillation Technique Enhances LLM Reasoning

2024-07-15

Large language models (LLMs) excel at answering simple everyday questions but struggle with tasks that require deep logical reasoning and complex planning. To address this, the research community has introduced a family of prompting strategies known as "System 2" techniques, which significantly improve LLM reasoning by requiring models to spell out the intermediate steps of problem solving. Effective as they are, these methods also bring slower responses and higher computational costs.

The inspiration comes from the "dual-system" theory in cognitive science: System 1 is fast, intuitive thinking, while System 2 handles slow, deliberate analysis. LLMs are often likened to System 1, excelling at immediate responses but weak at deep deliberation. AI researchers have therefore developed various System 2 prompting techniques, such as chain-of-thought prompting, which improve the logical accuracy of LLMs by eliciting step-by-step reasoning.
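As a rough illustration, the contrast between the two prompting styles can be sketched as below. The exact wording is a widely used zero-shot pattern, not Meta's specific template:

```python
def direct_prompt(question: str) -> str:
    # System 1 style: ask for the answer immediately, no reasoning shown.
    return f"Q: {question}\nA:"

def chain_of_thought_prompt(question: str) -> str:
    # System 2 style: elicit intermediate reasoning before the answer,
    # using the common "let's think step by step" zero-shot trigger.
    return f"Q: {question}\nA: Let's think step by step."
```

The System 2 variant tends to produce longer, more accurate completions on reasoning tasks, at the cost of generating many extra tokens per query.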

Now, the research team at Meta FAIR has proposed "System 2 distillation," a technique that bypasses the reliance on explicit intermediate steps at inference time, allowing LLMs to handle complex tasks directly without sacrificing accuracy. The idea is inspired by human learning: with repeated practice, tasks that initially demand deliberate System 2 effort, such as driving, eventually become automatic System 1 responses.

The core of System 2 distillation is to take the knowledge an LLM produces through System 2 reasoning and fold it back into the model's efficient System 1 generation, through a process akin to knowledge distillation. Concretely, the model first answers questions under System 2 prompting, with self-consistency checks (sampling multiple reasoning paths and keeping only answers that agree) used to verify correctness. The lengthy reasoning traces are then discarded, only the final answers are retained, and the model is fine-tuned on these input-answer pairs so that it learns to produce the answers directly and quickly.
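The data-collection step described above can be sketched as follows. This is a minimal illustration, not Meta's implementation: `generate` is a hypothetical sampling call that returns the final answer extracted from one chain-of-thought completion, and the agreement threshold is an assumed parameter:

```python
from collections import Counter

def distill_examples(inputs, generate, n_samples=8, min_agreement=0.75):
    """Build a System 2 distillation fine-tuning set from unlabeled inputs.

    `generate(prompt)` (hypothetical) samples one chain-of-thought
    completion from the model and returns only the extracted final answer.
    """
    dataset = []
    for x in inputs:
        # Sample several System 2 (chain-of-thought) answers per input.
        answers = [generate(f"{x}\nLet's think step by step.")
                   for _ in range(n_samples)]
        # Self-consistency check: keep the example only if enough
        # samples agree on the same final answer.
        answer, count = Counter(answers).most_common(1)[0]
        if count / n_samples >= min_agreement:
            # Discard the intermediate reasoning; pair the raw input
            # with the verified answer for System 1 fine-tuning.
            dataset.append({"prompt": x, "completion": answer})
    return dataset
```

Inputs whose sampled answers disagree are simply dropped, so the fine-tuning set contains only answers the model itself can reproduce consistently; the fine-tuned model then emits those answers without the intermediate steps.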

Experiments show that System 2 distillation not only significantly improves LLM performance on complex reasoning tasks but sometimes even surpasses the original System 2 methods, with faster responses and lower computational cost. Whether countering biased viewpoints, refining answer quality, or performing fine-grained evaluation, the technique proved remarkably effective. The team notes that this finding opens new possibilities for applying LLMs in complex scenarios, letting models handle a wide range of challenges more efficiently.

Of course, System 2 distillation is no panacea. The research also found that not all reasoning skills can be distilled into an LLM's fast inference mechanism; complex mathematical reasoning that relies heavily on chain-of-thought prompting is a notable exception. This suggests that some tasks may always require an explicit, deliberate reasoning process.

In addition, further research on System 2 distillation needs to explore its performance on smaller models and the impact of distillation on generalization to unseen tasks. At the same time, caution is warranted about potential biases in LLM benchmarks, to ensure fair and accurate evaluation.

Nevertheless, System 2 distillation opens a promising new path for optimizing and deploying LLMs. Looking ahead, the technique could let LLMs operate more efficiently on tasks they have mastered, freeing resources for the hard problems that remain, much as humans keep pushing their own limits.