Recently, Chinese AI company DeepSeek launched a series of large language models designed specifically for reasoning tasks: the R1 series. The models have been made publicly available on the Hugging Face platform.
The R1 series consists primarily of two models: R1 and R1-Zero. According to DeepSeek, R1 has outperformed OpenAI's o1 model on multiple reasoning benchmarks. Although R1-Zero is somewhat less capable, it holds significant potential for machine learning research.
Both large language models employ a Mixture of Experts (MoE) architecture with 671 billion parameters. An MoE model comprises several expert neural networks, each optimized for a different set of tasks. When processing a prompt, a routing mechanism directs the query to the most suitable expert.
The key advantage of the MoE architecture is reduced inference cost. In an MoE model, a user's input activates only the specific expert networks needed to generate the response, rather than the entire model. As a result, R1 and R1-Zero activate less than one-tenth of their total parameters when responding to a prompt.
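As a rough illustration of how such sparse routing reduces compute, here is a minimal PyTorch sketch of a generic top-k MoE layer. The names (SparseMoELayer, num_experts, top_k, d_model) are illustrative placeholders, and the structure is a textbook simplification, not DeepSeek's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Illustrative sparse Mixture-of-Experts layer: for each token, a router
    scores all experts, but only the top-k experts are actually evaluated."""

    def __init__(self, d_model: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # routing mechanism
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = self.router(x)                               # (num_tokens, num_experts)
        weights, indices = torch.topk(scores, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                   # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                   # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out
```

Because each token runs through only top_k of the experts, the compute per forward pass scales with the active experts rather than with the full parameter count, which is the effect described above.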
When training R1-Zero, DeepSeek took an unconventional approach. Large language models optimized for reasoning are typically trained with both reinforcement learning and supervised fine-tuning: reinforcement learning teaches the model to perform tasks through trial and error, while supervised fine-tuning improves output quality by showing the model examples of how tasks should be carried out.
In contrast, DeepSeek skipped the supervised fine-tuning phase for R1-Zero. Despite this omission, the model still acquired reasoning skills such as breaking complex tasks down into simpler subtasks. According to DeepSeek, this is the first open research to validate that large language models can develop reasoning abilities through reinforcement learning alone, without supervised fine-tuning.
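To make the distinction concrete, the sketch below contrasts a supervised fine-tuning step (imitate a worked example) with a simple reward-weighted reinforcement-learning step (sample an answer, score it, reinforce it in proportion to the reward). It assumes a Hugging Face-style causal language model exposing .logits and .generate; the functions sft_step, rl_step, and reward_fn are hypothetical names, and the RL update is a simplified policy-gradient step, not DeepSeek's actual training recipe.

```python
import torch
import torch.nn.functional as F

def sft_step(model, optimizer, prompt_ids, target_ids):
    """Supervised fine-tuning: maximize the likelihood of a demonstrated answer."""
    input_ids = torch.cat([prompt_ids, target_ids], dim=1)
    logits = model(input_ids).logits
    # Logits at position i predict token i+1, so these positions predict the target tokens.
    pred = logits[:, prompt_ids.size(1) - 1:-1, :]
    loss = F.cross_entropy(pred.reshape(-1, pred.size(-1)), target_ids.reshape(-1))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

def rl_step(model, optimizer, prompt_ids, reward_fn, max_new_tokens=256):
    """Trial and error: sample an answer, score it (e.g. 1.0 if the final answer
    is correct), and reinforce the sampled tokens in proportion to the reward."""
    with torch.no_grad():
        sample = model.generate(prompt_ids, do_sample=True, max_new_tokens=max_new_tokens)
    completion = sample[:, prompt_ids.size(1):]
    reward = reward_fn(completion)                      # scalar score for this sample
    logits = model(sample).logits[:, prompt_ids.size(1) - 1:-1, :]
    logp = F.log_softmax(logits, dim=-1)
    token_logp = logp.gather(-1, completion.unsqueeze(-1)).squeeze(-1)
    loss = -(reward * token_logp.sum())                 # push up probability of rewarded samples
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The essential difference is the training signal: supervised fine-tuning needs curated example answers, whereas the reinforcement-learning step only needs a way to score the model's own attempts.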
Although R1-Zero has advanced reasoning capabilities, its output quality is limited, with issues such as "infinite repetition, poor readability, and language mixing." To address these limitations, DeepSeek developed R1, an enhanced version of R1-Zero whose modified training process includes the previously skipped supervised fine-tuning stage, significantly improving output quality.
DeepSeek ran roughly twenty benchmark tests comparing R1 against four popular large language models. R1 surpassed OpenAI's reasoning-optimized o1 model on several of them, and even on the benchmarks where o1 scored higher, R1 trailed by less than 5%.
Notably, R1 outperformed o1 on LiveCodeBench, a programming benchmark that is regularly updated with new exercises, which reduces the likelihood that AI models can find ready-made answers on the public internet.
Furthermore, DeepSeek released a range of smaller but more hardware-efficient models distilled from R1. These models are based on the open-source Llama and Qwen model families and range from 1.5 billion to 70 billion parameters. Among them, R1-Distill-Qwen-32B outperformed OpenAI's scaled-down o1-mini model on several benchmarks.
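Distillation, in general, trains a small "student" model to reproduce the behavior of a larger "teacher." The sketch below shows one common generic formulation (matching the teacher's per-token output distribution with a KL-divergence loss, softened by a temperature); it is an illustration of the technique only, not DeepSeek's published distillation procedure, and distillation_step, student, and teacher are placeholder names assuming Hugging Face-style models.

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, optimizer, input_ids, temperature=2.0):
    """One generic knowledge-distillation step: train the student to match the
    teacher's (temperature-softened) per-token output distribution."""
    with torch.no_grad():
        teacher_logits = teacher(input_ids).logits      # teacher is frozen
    student_logits = student(input_ids).logits

    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_logp = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between teacher and student distributions, scaled by T^2
    # to keep gradient magnitudes comparable across temperatures.
    loss = F.kl_div(student_logp, teacher_probs, reduction="batchmean") * temperature ** 2
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

The practical appeal is that the student keeps much of the teacher's task behavior at a fraction of the parameter count, which is why the distilled 1.5B–70B models can run on far more modest hardware than the full 671B-parameter R1.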