Reasoning approaches built on large language models (LLMs) simulate logical and problem-solving abilities through structured methods. However, current methodologies remain largely confined to the linguistic domain, where the reasoning process is articulated as explicit textual chains. While this improves transparency, it also introduces inefficiency, since natural language is optimized for communication rather than for reasoning. Neuroscientific studies corroborate this view, showing that human reasoning often bypasses the brain's language networks. These insights point toward reasoning frameworks for LLMs that transcend linguistic constraints.
Language-based reasoning also limits computational efficiency. When LLMs generate reasoning chains, most tokens serve fluency rather than actual reasoning, wasting compute. Moreover, critical reasoning steps demand precise planning and decision-making, which existing architectures struggle to manage. These inefficiencies become especially pronounced as tasks grow complex or require exploring several solutions simultaneously. In addition, language-based models tend to commit prematurely to a single deterministic path, limiting their ability to backtrack or weigh alternatives and thus their effectiveness on dynamic or exploratory problems.
Chain-of-Thought (CoT) prompting has garnered significant attention as a way to strengthen LLM reasoning. By guiding models to generate intermediate steps in language, CoT improves the clarity and precision of problem-solving. Nevertheless, it inherits the limitations of natural language and performs poorly on tasks requiring complex planning or exploration. Recent work has therefore explored latent reasoning, which lets models perform computations outside of language. Although progress has been made, latent reasoning methods still need greater scalability and robustness to surpass language-based approaches across tasks.
To tackle these challenges, researchers from Meta's FAIR and UC San Diego introduced COCONUT (Chain of Continuous Thought). COCONUT is a paradigm that lets LLMs reason in an unrestricted latent space, circumventing the constraints of language. Unlike traditional CoT, which encodes each reasoning state as lexical tokens, COCONUT uses the LLM's last hidden state as a continuous representation of the reasoning state. This representation, termed a "continuous thought," is fed directly back into the model as the next input embedding, with no linguistic decoding in between. In this way, COCONUT processes reasoning steps efficiently while retaining the ability to explore multiple solution paths.
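The core mechanism is easy to see in code. Below is a minimal sketch of the latent-mode forward loop, assuming a Hugging Face-style causal LM; the function name and loop structure are our own illustration, not the authors' implementation.

```python
import torch

def continuous_thought_rollout(model, input_ids, num_latent_steps):
    """Latent-mode forward loop (sketch): instead of decoding a token at each
    step, append the last hidden state back onto the input embeddings as the
    next "continuous thought". Assumes a Hugging Face-style causal LM; this
    is an illustration, not the authors' code (KV caching omitted for clarity).
    """
    embed = model.get_input_embeddings()
    inputs_embeds = embed(input_ids)                    # (1, seq_len, hidden)
    for _ in range(num_latent_steps):
        out = model(inputs_embeds=inputs_embeds, output_hidden_states=True)
        last_hidden = out.hidden_states[-1][:, -1:, :]  # final layer, last position
        # The continuous thought is fed back directly: no argmax, no vocabulary
        inputs_embeds = torch.cat([inputs_embeds, last_hidden], dim=1)
    return inputs_embeds  # switch back to language mode to decode the answer
```

After the latent steps, the model returns to language mode and decodes the final answer normally; the paper delimits the latent segment with special beginning-of-thought and end-of-thought tokens.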
COCONUT employs a multi-stage training process to build up its latent reasoning ability. During training, the model alternates between language and latent modes, progressively replacing language-based reasoning steps with latent representations; in the final stage, all reasoning steps are replaced with continuous thoughts, and the model solves problems entirely in latent space. The resulting behavior resembles breadth-first search (BFS): the model keeps multiple reasoning paths alive concurrently before narrowing to the most promising ones. This flexibility lets COCONUT address tasks that demand extensive planning and decision-making.
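The staged curriculum can be viewed as a data transformation: at stage k, the first k language reasoning steps are swapped for latent-thought slots (with some fixed number of thoughts per replaced step). The helper below is a hypothetical sketch of that schedule; the token names are illustrative placeholders.

```python
def build_stage_example(question, reasoning_steps, answer, stage, c=1):
    """Stage-k training example (sketch): the first `stage` language reasoning
    steps are replaced by `stage * c` latent-thought slots, which the model
    fills with continuous thoughts during the forward pass. `<bot>`, `<eot>`,
    and `<thought>` are placeholder names, not the official tokenizer entries."""
    latent_slots = ["<thought>"] * (stage * c)
    remaining_steps = reasoning_steps[stage:]
    return [question, "<bot>", *latent_slots, "<eot>", *remaining_steps, answer]

# Stage 0 is plain CoT; the final stage replaces every step with latent thoughts.
steps = ["Step 1: ...", "Step 2: ...", "Step 3: ..."]
for k in range(len(steps) + 1):
    print(build_stage_example("Question: ...", steps, "Answer: ...", k))
```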
To validate the effectiveness of COCONUT, researchers conducted experiments on three datasets:
- GSM8k dataset for mathematical reasoning;
- ProntoQA dataset for logical reasoning;
- ProsQA, a newly constructed dataset whose questions require planning over graph structures.
The results show that COCONUT outperforms traditional CoT in both accuracy and efficiency. On the logical reasoning task, COCONUT achieved 99.9% accuracy versus CoT's 98.8%, while generating fewer reasoning tokens. On ProsQA, which demands extensive planning, COCONUT again outperformed CoT, reaching higher accuracy with less computation.
A key strength of COCONUT is its ability to encode multiple reasoning paths simultaneously. Because the reasoning state is a continuous thought rather than a committed token, the model avoids locking into a specific solution prematurely; instead, it maintains a distribution over potential next steps and gradually prunes erroneous paths. This proves effective even on open-domain reasoning tasks such as GSM8k, where COCONUT reached 42.9% accuracy against CoT's 42.0%. The ability to explore and backtrack within latent space gives COCONUT strong planning capabilities, suiting it to tasks with uncertainty or multiple solution pathways.
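One way to make this "distribution over next steps" concrete is to project a continuous thought through the LM head and inspect which candidate continuations it keeps alive, in the spirit of the paper's analysis of the implicit search tree. The snippet below is a hypothetical probe over the same Hugging Face-style model as above; `probe_thought` is our own helper, not part of any released codebase.

```python
import torch.nn.functional as F

def probe_thought(model, thought):
    """Decode a continuous thought (shape: 1 x 1 x hidden) through the LM head
    to view the token distribution it implicitly carries. Illustrative only."""
    logits = model.get_output_embeddings()(thought)  # (1, 1, vocab_size)
    probs = F.softmax(logits, dim=-1)
    top = probs.topk(5, dim=-1)                      # top-5 candidate continuations
    return top.values.squeeze(), top.indices.squeeze()
```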
The main highlights of the COCONUT study are as follows:
- COCONUT achieved 99.9% accuracy on logical reasoning (ProntoQA) and 42.9% on mathematical reasoning (GSM8k), outperforming traditional methods on both.
- The model generates fewer reasoning tokens, demonstrating computational efficiency.
- COCONUT's latent-space reasoning emulates BFS, enabling the model to explore multiple solutions and adapt to complex tasks.
- The multi-stage training process lets COCONUT tackle increasingly challenging problems while maintaining high performance.
- COCONUT performs well across reasoning tasks, from open-domain mathematical problems to logical reasoning over graph structures.
In conclusion, by introducing continuous latent thoughts, COCONUT sidesteps the inefficiencies of language-based methods and improves computational efficiency. Its ability to encode and explore multiple reasoning paths makes it well suited to complex problems. COCONUT thus sets a new benchmark for machine reasoning in both logical accuracy and efficient token usage.