Recently, the Tencent Research Institute officially launched its new DRT-o1 series of models. By incorporating long chain-of-thought (CoT) techniques, these models significantly improve the quality of literary translation, especially in handling complex rhetorical devices such as metaphors and similes.
Project Background
While neural machine translation (NMT) has made significant strides in everyday text translation, it still faces substantial challenges when dealing with rhetorical devices like metaphors and similes in literary works. These rhetorical devices often carry deep cultural and contextual meanings, which are difficult to convey accurately through simple literal translations. The DRT-o1 model, introduced by the Tencent Research Institute, aims to address this issue.
Project Introduction
To train the DRT-o1 model, researchers carefully selected 400 public-domain English books from Project Gutenberg and extracted 577,600 sentences from them. After rigorous screening, 63,000 sentences containing similes or metaphors were retained as training data, with the goal of teaching the model to "think deeply."
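The article does not detail how this screening was performed; as a rough illustration, a first-pass filter for simile-bearing sentences might look like the sketch below. The regex markers and length bounds are illustrative assumptions, not the project's actual criteria.

```python
import re

# Toy surface cues for similes; the real DRT-o1 screening is far more
# sophisticated, so this regex is only an illustrative stand-in.
SIMILE_MARKERS = re.compile(r"\b(like a|like an|as if|as though|as \w+ as)\b",
                            re.IGNORECASE)

def screen_sentences(sentences, min_words=8, max_words=80):
    """Keep sentences of reasonable length that carry simile-style cues."""
    return [s for s in sentences
            if min_words <= len(s.split()) <= max_words
            and SIMILE_MARKERS.search(s)]

corpus = [
    "The fog crept over the harbour like a thief, silent and grey and patient.",
    "He opened the door.",
]
print(screen_sentences(corpus))  # keeps only the simile-bearing sentence
```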
The DRT-o1 model employs an innovative multi-agent framework consisting of three roles: translator, advisor, and evaluator. The translator produces the initial translation, the advisor offers revision suggestions, and the evaluator scores translation quality against predefined metrics. Through the collaboration of these three roles, the translation is iteratively refined.
In terms of workflow, the DRT-o1 pipeline consists of three main steps: keyword translation, initial translation, and a refinement loop. In the loop, the advisor critiques the previous draft and provides feedback, while the evaluator assigns an overall score against predefined criteria; the translator then revises the draft in light of both until the score clears a predefined threshold or the iteration limit is reached.
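Taken together, the roles and workflow described above amount to a small control loop. The sketch below is a minimal, self-contained illustration assuming each role is a differently prompted call to a single LLM; the `llm` stub, the prompt wording, and the threshold and iteration cap are all placeholder assumptions rather than the paper's actual settings.

```python
def llm(prompt: str) -> str:
    # Stand-in for a real chat-model call; plug in an actual API client here.
    raise NotImplementedError

def refine(source: str, threshold: float = 85.0, max_iters: int = 5) -> str:
    # Step 1: keyword translation -- pin down idioms and key terms first.
    glossary = llm(f"Translate the key terms and idioms in:\n{source}")
    # Step 2: initial translation by the translator role.
    draft = llm(f"Using this glossary:\n{glossary}\n"
                f"Translate the sentence, preserving its imagery:\n{source}")
    # Step 3: refinement loop -- the advisor critiques, the evaluator scores,
    # and the translator revises until the score clears the threshold.
    for _ in range(max_iters):
        advice = llm(f"Source: {source}\nDraft: {draft}\n"
                     "Point out mistranslations and lost rhetorical nuance.")
        score = float(llm(f"Source: {source}\nDraft: {draft}\n"
                          "Score 0-100 for accuracy and fluency. Number only."))
        if score >= threshold:
            break
        draft = llm(f"Source: {source}\nPrevious draft: {draft}\n"
                    f"Advisor feedback: {advice}\n"
                    "Produce an improved translation.")
    return draft
```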
Finally, the translations are polished by GPT-4o to ensure fluency and readability. The resulting dataset contains 22,264 machine-translation samples, each paired with a long "deep thinking" chain.
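A polishing pass of this kind can be expressed as a single chat-completion call; the sketch below uses the OpenAI Python client, with a prompt that is an illustrative assumption rather than the authors' actual instruction.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def polish(translation: str) -> str:
    """Have GPT-4o smooth fluency without changing meaning (illustrative prompt)."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Polish this translation for fluency and readability "
                        "without altering its meaning."},
            {"role": "user", "content": translation},
        ],
    )
    return response.choices[0].message.content
```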
Example Demonstration
For example, the original sentence: "The mother, with her feet propped up on a stool, seemed to be trying to get to the bottom of that answer, whose feminine profundity had struck her all of a heap." The phrase "struck her all of a heap" is an idiom meaning that it utterly astonished or overwhelmed her. The DRT-o1 model, through its long chain-of-thought process, translates it as: "The mother, with her feet resting on a stool, seemed to be trying to understand that answer, the profound femininity of which struck her with a sudden and powerful impact." This translation not only conveys the meaning of the original text but also preserves its emotional tone.
In comparison, translations by Google Translate and DeepL, while conveying the basic meaning, fall short in emotional expression and subtlety.
Performance
The DRT-o1 series has two versions: DRT-o1-7B and DRT-o1-14B. Experimental results show that compared to the Qwen2.5 series models, the DRT-o1 models have significantly improved BLEU scores and CometScores. Specifically, the DRT-o1-7B model achieved an 8.26-point increase in BLEU score and a 3.36-point increase in CometScore, while the DRT-o1-14B model saw a 7.33-point increase in BLEU score and a 1.66-point increase in CometScore. Notably, the DRT-o1-7B model outperformed the larger QwQ-32B model, with a 7.82-point higher BLEU score and a 1.46-point higher CometScore, demonstrating its powerful capability in handling complex language structures.
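Both metrics can be computed with standard open-source tooling; the snippet below shows one common setup using `sacrebleu` for BLEU and Unbabel's `comet` package for CometScore. The checkpoint name and toy sentences are illustrative, not the paper's evaluation data.

```python
import sacrebleu                                          # pip install sacrebleu
from comet import download_model, load_from_checkpoint   # pip install unbabel-comet

sources    = ["The fog crept in like a thief."]           # English originals
hypotheses = ["雾气像小偷一样悄悄潜入。"]                      # model outputs
references = ["雾像贼一样悄无声息地溜了进来。"]                 # gold translations

# Corpus-level BLEU; sacrebleu takes a list of reference streams,
# and tokenize="zh" enables its built-in Chinese tokenizer.
bleu = sacrebleu.corpus_bleu(hypotheses, [references], tokenize="zh")
print(f"BLEU: {bleu.score:.2f}")

# COMET scores source/hypothesis/reference triples with a trained model;
# wmt22-comet-da is a widely used checkpoint.
model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
data = [{"src": s, "mt": h, "ref": r}
        for s, h, r in zip(sources, hypotheses, references)]
print("COMET:", model.predict(data, batch_size=8, gpus=0).system_score)
```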
The Tencent Research Institute stated that the launch of the DRT-o1 series marks a significant advancement in the field of literary translation. They will continue to optimize and refine the model to provide even better translation services for cross-cultural literary exchanges.