General Introduction
Shandu is an open-source, AI-based research system hosted on GitHub and created by developer jolovicdev. Built on LangChain and LangGraph, it is designed to provide automated, comprehensive, and efficient research on a given subject. Unlike traditional single-search tools, Shandu digs deeper into a topic through recursive exploration and parallel processing, and it can be operated from the command line (CLI) or through a Python interface. Whether for academic research, technology exploration, or market analysis, it helps users organize complex information quickly. Shandu also has built-in web crawling capabilities intended to access diverse sources of information ethically. The project positions itself as an alternative to OpenAI's Deep Research, with an emphasis on being lightweight, free to use, and runnable locally by developers and researchers.
Function List
- Automated in-depth research: Based on the user's query, the system automatically performs multi-level information mining and generates a comprehensive research report.
- Recursive exploration: Progressively extends the research by iteratively searching and analyzing to uncover hidden, related information.
- Parallel processing: Supports multi-threaded operation to fetch data from multiple sources at the same time, improving efficiency.
- Web crawler: Built-in crawler that extracts page content and supports dynamically rendered, JavaScript-heavy websites.
- Multi-engine search: Integrates Google, DuckDuckGo, and other search engines to obtain diverse results.
- AI-powered search: Provides a lightweight AI search function (aisearch) for quickly answering simple questions.
- Report generation: Organizes research findings into Markdown files for easy reading and sharing.
- Flexible configuration: Supports adjusting search depth, breadth, and number of results to meet different needs.
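To make the depth and breadth settings concrete, here is a minimal sketch of recursive exploration with parallel fetching. It is not Shandu's actual implementation: `fake_search` is a hypothetical stand-in for a real search call, and the names `explore` and `count_nodes` are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_search(query):
    """Hypothetical stand-in for a search call: returns related subtopics."""
    return [f"{query} / subtopic {i}" for i in range(1, 6)]


def explore(query, depth, breadth, search=fake_search):
    """Recursively expand `query`: `depth` levels, `breadth` subtopics per level."""
    if depth == 0:
        return {"query": query, "children": []}
    subtopics = search(query)[:breadth]
    # Explore sibling subtopics in parallel threads; map() preserves input order.
    with ThreadPoolExecutor(max_workers=breadth) as pool:
        children = list(
            pool.map(lambda q: explore(q, depth - 1, breadth, search), subtopics)
        )
    return {"query": query, "children": children}


def count_nodes(tree):
    """Count every query visited, including the root."""
    return 1 + sum(count_nodes(child) for child in tree["children"])


tree = explore("Trends in Cloud Computing", depth=2, breadth=4)
print(count_nodes(tree))  # 1 + 4 + 16 = 21 queries
```

The node count grows roughly as breadth raised to the power of depth, which is why small increases in either setting can noticeably lengthen a research run.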
Usage Guide
Installation process
Shandu is a Python-based open source project that needs to be installed and configured in the local environment to be used. The following are the detailed installation steps:
- Environment preparation
- Make sure Python 3.8 or later is installed on your device. You can check the version with:
python --version
- Install Git, which is needed to clone the project code from GitHub. Windows users can download it from the Git website; Linux/Mac users can install it through a package manager (for example, on Debian/Ubuntu):
sudo apt install git
- Clone the project
- Open a terminal (CMD or PowerShell on Windows, Terminal on Mac/Linux).
- Enter the following command to clone the Shandu repository:
git clone https://github.com/jolovicdev/shandu.git
- Change into the project directory:
cd shandu
- Install dependencies
- Use pip to install the Python libraries required by the project:
pip install -e .
- If you encounter dependency problems, try upgrading pip:
pip install --upgrade pip
- Alternatively, install inside a virtual environment:
python -m venv venv
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
pip install -e .
- Configure the API
- Shandu needs an API key to call external services (such as a search engine). Run the following command to enter configuration mode:
shandu configure
- Enter the API key (e.g. Google API, DuckDuckGo API) when prompted. Developers can get a free key for testing from Nebius Studio.
- Verify the installation
- Run the following command; if the command's help message is displayed, the installation was successful:
shandu --help
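The prerequisite and verification checks above can also be scripted. The following is a small sketch using only the Python standard library; the helper names `python_ok` and `tool_on_path` are hypothetical, and the script only looks for executables on PATH, it does not confirm that Shandu is configured correctly.

```python
import shutil
import sys


def python_ok(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the interpreter meets the minimum version from this guide."""
    return tuple(version_info[:2]) >= minimum


def tool_on_path(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


print("Python 3.8+:", python_ok())
print("git found:", tool_on_path("git"))
print("shandu found:", tool_on_path("shandu"))
```

If `shandu found` prints `False` after installation, the virtual environment that contains Shandu is probably not activated in the current terminal.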
Feature walkthrough
1. Run in-depth research (research command)
This is Shandu's core feature for automating research on complex topics.
- Procedure:
- Enter the research command in the terminal, for example:
shandu research "Trends in Cloud Computing" --depth 2 --breadth 4 --output report.md
- --depth 2: Set the research depth to 2 levels (explore recursively 2 times).
- --breadth 4: Expand 4 related topics per exploration.
- --output report.md: Save the result as a Markdown file.
- The system automatically runs the search and analysis, which may take several minutes depending on the network and the complexity of the topic.
- When finished, open report.md to view the research report, which includes an overview of the topic, key findings, and reference links.
- Usage scenarios: Academic or technical research that requires comprehensive information, such as "AI in Healthcare".
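For scripted or repeated runs, the same command can be assembled from Python. The sketch below only builds the argument list from the flags shown above (`--depth`, `--breadth`, `--output`). It does not use Shandu's Python interface, and the helper name `build_research_command` is hypothetical.

```python
import shlex


def build_research_command(topic, depth=2, breadth=4, output="report.md"):
    """Build the argument list for the `shandu research` command shown above."""
    return [
        "shandu", "research", topic,
        "--depth", str(depth),
        "--breadth", str(breadth),
        "--output", output,
    ]


cmd = build_research_command("Trends in Cloud Computing")
print(shlex.join(cmd))  # shell-quoted form, handy for logging

# To actually run it (requires Shandu to be installed and configured):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one string avoids quoting problems when the topic contains spaces or punctuation.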
2. Quick AI search (aisearch command)
For answering simple questions or getting instant answers.
- Procedure:
- Enter a quick search command, for example:
shandu aisearch "Who is the current president of the United States?" --detailed
- --detailed: Return a detailed answer rather than a short reply.
- The system calls the AI model and returns a result such as: "As of March 3, 2025, the President of the United States is Donald Trump, who begins his second term on January 20, 2025."
- Usage scenarios: Quick access to facts, such as historical events or information about people.
3. Web page scraping (scrape command)
Used to extract content from a specific web page.
- Procedure:
- Enter the scrape command, for example:
shandu scrape "https://example.com" --dynamic
- --dynamic: Enable dynamic rendering for JavaScript-driven sites.
- The system returns the extracted text content, which can be saved to a file by redirecting the output:
shandu scrape "https://example.com" --dynamic > output.txt
- Usage scenarios: Analyzing the content of news pages, technical blogs, or official product websites.
4. Configuring the search engine
Users can customize search sources to optimize results.
- Procedure:
- Enter the command to specify the search engines:
shandu search "artificial intelligence ethics" --engines "google,duckduckgo" --max-results 15
- --engines: Search with Google and DuckDuckGo.
- --max-results 15: Limit the number of results returned to 15.
- Review the returned list of search results, which can be used for subsequent in-depth research.
- Usage scenarios: Accessing diverse sources of information and avoiding single-engine bias.
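Results from several engines usually overlap in part. As an illustration of how merged results could be de-duplicated (an assumption for illustration, not a description of Shandu's internals), the sketch below interleaves hypothetical result lists, skips URLs already seen, and caps the total in the spirit of `--max-results`. The function names and sample data are hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Comparison key: lowercase host plus path, ignoring scheme, fragment and trailing slash."""
    parts = urlsplit(url)
    key = parts.netloc.lower() + parts.path.rstrip("/")
    return key + ("?" + parts.query if parts.query else "")


def merge_results(results_by_engine, max_results=15):
    """Interleave per-engine result lists by rank, skipping URLs already seen."""
    seen, merged = set(), []
    longest = max((len(r) for r in results_by_engine.values()), default=0)
    for rank in range(longest):
        for engine, results in results_by_engine.items():
            if rank < len(results):
                key = normalize(results[rank])
                if key not in seen:
                    seen.add(key)
                    merged.append({"url": results[rank], "engine": engine})
    return merged[:max_results]


# Hypothetical sample data, not real search output.
sample = {
    "google": ["https://example.com/a", "https://example.org/b"],
    "duckduckgo": ["http://example.com/a/", "https://example.net/c"],
}
for item in merge_results(sample):
    print(item["engine"], item["url"])
```

Interleaving by rank keeps the top hit from each engine near the front, so no single engine dominates the merged list.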
Caveats
- Network requirement: Shandu relies on an internet connection, so make sure the network is stable at runtime.
- Ethical compliance: Comply with the robots.txt rules of the target site when crawling web pages, and avoid frequent requests that could lead to IP blocking.
- Performance: Complex research tasks may use more memory; running them on a device with a higher configuration (e.g. 8GB+ RAM) is recommended.
- Debugging: If you encounter an error, check the log (saved by default as shandu.log) or submit an issue on GitHub.
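To illustrate the robots.txt caveat, the following sketch uses Python's standard `urllib.robotparser` module to test whether a URL may be fetched. It shows the general technique only and is not Shandu's own compliance mechanism. The robots.txt content and the `my-crawler` user agent are hypothetical; a real check would download the file from the target site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed locally so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def make_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)
print(parser.can_fetch("my-crawler", "https://example.com/blog/post"))     # allowed path
print(parser.can_fetch("my-crawler", "https://example.com/private/data"))  # disallowed path
print(parser.crawl_delay("my-crawler"))  # requested delay between requests, in seconds
```

Honoring the crawl delay between requests, for example with `time.sleep`, is one simple way to reduce the risk of IP blocking mentioned above.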