General Introduction
Hugging Face's Open R1 project is a fully open-source replication of DeepSeek-R1, designed to build the missing pieces of the R1 pipeline so that everyone can reproduce and build upon them. The project is intentionally simple and consists mainly of scripts for training and evaluating models and for generating synthetic data. Its goal is to demonstrate how to reproduce the complete R1 pipeline through multi-stage training, from the base model to the reinforcement-learning-tuned model. The project includes detailed installation and usage instructions, and welcomes community contributions and collaboration.
The starting point is the DeepSeek-R1 technical report, which serves as a guide and can be broadly broken down into three main steps:
Step 1: Replicate the R1-Distill model by extracting a high-quality corpus from DeepSeek-R1 (a rough sketch of this data-collection step follows this list).
Step 2: Replicate the pure reinforcement learning (RL) process DeepSeek used to create R1-Zero. This may require curating new large-scale datasets for math, reasoning, and code.
Step 3: Demonstrate that we can transition from a base model to an RL-tuned model through multi-stage training.
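To make Step 1 concrete, the sketch below shows one way such a distillation corpus could be collected: querying a reasoning model served behind an OpenAI-compatible endpoint and saving the traces for later supervised fine-tuning. The endpoint URL, model id, and prompt are illustrative assumptions, not values taken from the project:

```python
# Hypothetical sketch of Step 1's data collection, assuming a reasoning model
# (e.g. DeepSeek-R1) is served behind an OpenAI-compatible endpoint.
# The base_url, model id, and prompts below are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

prompts = ["What is 7 * 8 + 12?"]  # in practice: a large pool of math/code problems
records = []
for prompt in prompts:
    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1",  # placeholder model id
        messages=[{"role": "user", "content": prompt}],
        temperature=0.6,
    )
    records.append(
        {"prompt": prompt, "completion": response.choices[0].message.content}
    )

# The collected traces can then be filtered for quality and used as the SFT
# corpus for training the distilled model.
```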
Feature List
- Model training: provides scripts for training models, including the GRPO and SFT training methods (a minimal SFT sketch follows this list).
- Model evaluation: provides scripts for evaluating model performance and supports the R1 benchmarks.
- Data generation: provides scripts for generating synthetic data with Distilabel.
- Multi-stage training: demonstrates a multi-stage training process from the base model to reinforcement-learning tuning.
- Community contributions: supports community members in contributing datasets and model improvements.
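As a rough illustration of the SFT side of the training scripts, here is a minimal sketch built directly on TRL (the training library Open R1 uses); the model and dataset identifiers below are placeholders rather than project defaults:

```python
# Minimal SFT sketch using TRL's SFTTrainer.
# Model and dataset identifiers are illustrative placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any instruction/chat dataset in a format SFTTrainer understands works here.
dataset = load_dataset("trl-lib/Capybara", split="train")

training_args = SFTConfig(output_dir="qwen2.5-0.5b-sft", max_steps=100)
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # placeholder base model
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```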
Usage Guide
Installation process
- Create a Python virtual environment:
conda create -n openr1 python=3.11
conda activate openr1
- Install vLLM:
pip install vllm==0.6.6.post1
This also installs PyTorch v2.5.1; make sure to use this exact version, as it is required for compatibility with the vLLM binaries. (A quick sanity-check sketch follows this list.)
- Install the project dependencies:
pip install -e ".[dev]"
- Log in to your Hugging Face and Weights & Biases accounts:
huggingface-cli login
wandb login
- Install Git LFS:
sudo apt-get install git-lfs
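A quick way to confirm the pinned versions installed correctly is to import the packages and print their versions; this is an optional check, not part of the project's documented setup:

```python
# Optional sanity check: versions should match the pins above
# (torch may report a build suffix such as 2.5.1+cu124).
import torch
import vllm

print("torch:", torch.__version__)   # expected: 2.5.1
print("vllm:", vllm.__version__)     # expected: 0.6.6.post1
```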
Usage Instructions
- Training models:
- Train the model with GRPO (a minimal GRPO sketch follows this section):
python src/open_r1/grpo.py --dataset <dataset_name>
- Train the model with SFT:
python src/open_r1/sft.py --dataset <dataset_name>
- Evaluating models:
python src/open_r1/evaluate.py --model <model_name> --benchmark <benchmark_name>
- Generating synthetic data:
python src/open_r1/generate.py --model <model_name> --output <output_path>
- Multi-stage training:
- Step 1: Replicate the R1-Distill model:
python src/open_r1/distill.py --corpus <corpus_path>
- Step 2: Replicate the pure RL pipeline:
python src/open_r1/rl_pipeline.py --dataset <dataset_name>
- Step 3: From the base model to RL tuning:
python src/open_r1/multi_stage_training.py --model <model_name>
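For the GRPO command above, here is a hedged, minimal sketch of what a GRPO run looks like with TRL's GRPOTrainer, paired with a simple rule-based format reward in the spirit of R1's rule-based rewards. The model, dataset, and reward below are illustrative placeholders, not the project's actual configuration:

```python
# Hedged GRPO sketch using TRL's GRPOTrainer.
# Model, dataset, and reward are illustrative placeholders.
import re

from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# A prompt-only dataset; GRPOTrainer samples completions and scores them.
dataset = load_dataset("trl-lib/tldr", split="train")

def format_reward(completions, **kwargs):
    """Rule-based reward: 1.0 if the completion wraps its reasoning in
    <think>...</think> tags, 0.0 otherwise (in the spirit of R1's format rewards)."""
    pattern = re.compile(r"<think>.*?</think>", re.DOTALL)
    return [1.0 if pattern.search(c) else 0.0 for c in completions]

training_args = GRPOConfig(output_dir="qwen2.5-0.5b-grpo", max_steps=100)
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder base model
    reward_funcs=format_reward,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

In practice the format reward would be combined with task-specific accuracy rewards (e.g. checking a math answer), but the single rule-based function above is enough to show how rewards plug into the trainer.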
Contribution Guidelines
- Fork the project: fork the repository to your own GitHub account.
- Clone your fork:
git clone https://github.com/<your-username>/open-r1.git
- Create a new branch:
git checkout -b new-feature
- Commit and push your changes:
git add .
git commit -m "Add new feature"
git push origin new-feature
- Create a pull request: open a pull request on GitHub describing the changes you made.