
DeepResearcher: Driving AI to research complex problems with reinforcement learning

General Introduction

DeepResearcher is an open-source project developed by the GAIR-NLP team at Shanghai Jiao Tong University. It is an intelligent research tool based on Large Language Models (LLMs), trained end-to-end with Reinforcement Learning (RL) in a real web environment. The project aims to help users complete complex research tasks efficiently: it automatically searches for information, verifies the accuracy of what it finds, and generates detailed results. DeepResearcher provides a 7B-parameter model that has been open-sourced on Hugging Face, the code is available on GitHub, and it is suitable for researchers, students, and technology enthusiasts.
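For readers who just want to try the released checkpoint directly, outside the project's own training and evaluation scripts described below, a minimal loading sketch with the Hugging Face transformers library might look like the following. The model id here is a placeholder; check the GAIR-NLP page on Hugging Face for the actual repository name.

# Minimal sketch: load the open-sourced 7B checkpoint with transformers.
# The model id is a placeholder -- look it up on the GAIR-NLP Hugging Face page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GAIR-NLP/DeepResearcher-7b"  # placeholder, verify the real name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)

prompt = "Summarize recent progress in reinforcement learning for web research agents."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))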


Feature List

  • Automated research: Enter a question and DeepResearcher automatically searches the web and organizes the relevant information.
  • Cross-source verification: Checks data from multiple sources (e.g. Google, Bing) to ensure results are reliable.
  • Self-reflective adjustment: Evaluates its own search results and redirects the research as needed to improve accuracy.
  • Research planning: Automatically generates research steps when tackling complex problems.
  • Honest reporting: States its limitations directly when no clear answer can be found.
  • Open-source model support: A 7B-parameter model is available for users to download and customize.

 

Using Help

Installation and use of DeepResearcher requires a certain level of technical knowledge, but the official documentation provides clear guidelines. Below are detailed steps to help users get started quickly.

Installation process

  1. Clone Code Repository
    Run the following command in the terminal to download the project locally:
git clone https://github.com/GAIR-NLP/DeepResearcher.git

Go to the project directory:

cd DeepResearcher
  2. Creating a Virtual Environment
    Use conda to create a separate Python environment and avoid dependency conflicts:
conda create -n deepresearcher python=3.10

Activate the environment:

conda activate deepresearcher
  3. Installing core dependencies
    Install PyTorch and other necessary libraries by running the following commands in sequence in the project root directory:
pip3 install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124
pip3 install flash-attn --no-build-isolation
cd verl
pip3 install -e .
cd ../
pip3 install -r requirements.txt

These steps ensure that the base environment required for the model to run is in place.

  4. Verify Installation
    Enter the following command to check if PyTorch is installed properly:
python -c "import torch; print(torch.__version__)"

If the version number is displayed (e.g. 2.4.0), the installation was successful.
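Optionally, a slightly broader sanity check can confirm that CUDA is visible and that flash-attn imports correctly (the package installs as the flash_attn module); if any line fails, revisit the corresponding install step:

# Sanity check: torch version, CUDA availability, flash-attn import.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
except ImportError:
    print("flash-attn is not installed correctly")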

Configuration and Startup

DeepResearcher uses the Ray framework for training and inference, and also requires configuration of the search service. Here's how to do it.

Starting the Ray Service

  1. Set the Node Rank
    Run the following commands in the terminal to set the node rank (required even if there is only one machine) and start the Ray head node; an optional connectivity check follows:
export PET_NODE_RANK=0
ray start --head
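To confirm the head node is actually up, a quick check with the Ray Python API (assuming the ray package was pulled in by the dependency install above) is:

# Connect to the Ray head node started above and list its resources.
import ray

ray.init(address="auto")        # attach to the running cluster
print(ray.cluster_resources())  # should show this machine's CPUs/GPUs
ray.shutdown()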
  2. Configuring the Search Services
  • Open ./scrl/handler/config.yaml and fill in the search API key (a sketch of this edit follows this step):
    • Using the Serper API: fill in serper_api_key.
    • Using Azure Bing: fill in azure_bing_search_subscription_key and set search_engine to bing.
  • Edit ./scrl/handler/server_handler.py and fill in the Qwen-Plus API key:
    from openai import OpenAI

    client = OpenAI(
        api_key="sk-xxx",  # your Qwen-Plus API key
        base_url="xxxx",   # the Qwen-Plus API endpoint
    )
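
As a rough illustration of the previous step, the search keys could also be filled in programmatically with PyYAML. The key names below are taken from this article; the file's full schema may differ, so treat this as a sketch rather than the project's official configuration method (rewriting the file this way also drops any comments it contains):

# Sketch: fill in the search API key in ./scrl/handler/config.yaml.
# Key names follow this article; the actual schema may differ.
import yaml

path = "./scrl/handler/config.yaml"
with open(path) as f:
    config = yaml.safe_load(f)

config["serper_api_key"] = "your-serper-key"             # if using the Serper API
# config["azure_bing_search_subscription_key"] = "..."   # if using Azure Bing instead
# config["search_engine"] = "bing"
# config["server_url_list"] is filled in later, after server_handler.py starts

with open(path, "w") as f:
    yaml.safe_dump(config, f)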
    
  3. Starting the Server Handler
    Run in the terminal:
python ./scrl/handler/server_handler.py

After the service starts, record the service address and add it to server_url_list in ./scrl/handler/config.yaml.

  4. Running the Main Handler
    Run on the training host:
python ./scrl/handler/handler.py

Training the Model

  1. Execute the Training Script
    Run in the project root directory:
bash train_grpo.sh

The training process will optimize the model based on reinforcement learning and requires patience.

Usage and Inference

  1. Generating Research Results
    Run the evaluation script:
bash evaluate.sh

The output file is saved to ./outputs/{project_name}/{experiment_name}/rollout/rollout_step_0.json.

  2. View the Results
    Rename the output file to {experiment_name}_result.json, move it into the ./evaluate/ folder, and run:
python ./evaluate/cacluate_metrics.py {experiment_name}

The score is saved to ./evaluate/{experiment_name}_score.json.
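The rename-and-move step above can also be scripted. A minimal sketch, with project_name and experiment_name as placeholders to replace with your own values:

# Sketch: copy the rollout output into ./evaluate/ and run the metrics script.
# project_name and experiment_name are placeholders -- use your own values.
import shutil
import subprocess

project_name = "my_project"        # placeholder
experiment_name = "my_experiment"  # placeholder

src = f"./outputs/{project_name}/{experiment_name}/rollout/rollout_step_0.json"
dst = f"./evaluate/{experiment_name}_result.json"
shutil.copy(src, dst)

# The score is written to ./evaluate/{experiment_name}_score.json.
subprocess.run(["python", "./evaluate/cacluate_metrics.py", experiment_name], check=True)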

Using the Key Features

  • Automated research and cross-source validation
    After the user enters a question, DeepResearcher collects data from the configured search engines (e.g. Google, Bing) and cross-validates the results. The validation process is recorded in the log file ./outputs/research_log.txt.
  • Self-reflective adjustment
    If the initial results are unsatisfactory, the system automatically adjusts its keywords or search strategy. For example, the query "AI applications in medical treatment" may be refined to "latest AI medical technology", yielding more accurate results.
  • Honest reporting
    When a question has no clear answer, DeepResearcher returns something like "there is not enough information to give a definite conclusion" instead of guessing.

Caveats

  • Ensure a stable internet connection; the search function relies on real-time data.
  • Training and inference require substantial computational resources; a GPU is recommended.
  • The project is still under active development; follow the GitHub repository for updates.

With these steps, users can easily install and use DeepResearcher to experience its intelligent research capabilities.

 

Application Scenarios

  1. Academic Research
    Researchers can use it to search for paper material, verify sources, and generate first drafts of research reports.
  2. Student Learning
    Students can use it to organize course-related knowledge and quickly complete assignments or project research.
  3. Technology Development
    Developers can use it to explore technology trends and get industry updates and solutions.

 

QA

  1. Does DeepResearcher support Chinese?
    Yes. Users can ask questions in Chinese; it will prioritize Chinese-language sources and can also handle English material.
  2. Is a GPU required?
    Not mandatory, but a GPU accelerates training and inference. A CPU can run it as well, just more slowly.
  3. How do I get the latest version?
    Run git pull in the project directory, then reinstall the dependencies to update.