General Introduction
Deep Searcher combines powerful large language models (such as DeepSeek and OpenAI models) with vector databases (e.g., Milvus) to search, evaluate, and reason over private data, producing highly accurate answers and comprehensive reports. The project is applicable to enterprise knowledge management, intelligent Q&A systems, and information retrieval scenarios. Deep Searcher supports a wide range of embedding models and large language models, and manages vector databases to ensure efficient retrieval and secure use of data.
Function List
- Private Data Search: Maximize the use of internal enterprise data and ensure data security.
- Vector Database Management: Supports vector databases such as Milvus, which allows data partitioning for more efficient retrieval.
- Flexible embedding options: Compatible with multiple embedding models for easy selection of the best solution.
- Multiple large language model support: Supports large models such as DeepSeek and OpenAI for intelligent Q&A and content generation.
- Document Loader: Local file loading is supported and web crawling will be added in the future.
Usage Guide
Installation process
- Clone the repository:
git clone https://github.com/zilliztech/deep-searcher.git
- Create a Python virtual environment (recommended):
python3 -m venv .venv
source .venv/bin/activate
- Install the dependencies:
cd deep-searcher
pip install -e .
- Configure LLM or Milvus: edit the examples/example1.py file to configure the LLM or Milvus as needed (a minimal sketch of such a configuration follows these steps).
- Prepare the data and run the example:
python examples/example1.py
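The exact contents of examples/example1.py depend on the repository version; the following is a minimal sketch of the kind of configuration it performs, assuming the Configuration/init_config pattern used in the project's examples. The model name, local Milvus Lite URI, and empty token are illustrative assumptions.
from deepsearcher.configuration import Configuration, init_config

config = Configuration()

# Choose the LLM provider and model (the model name is an assumption; adjust to your setup).
config.set_provider_config("llm", "OpenAI", {"model": "gpt-4o"})

# Point the vector database at a local Milvus Lite file; a remote Milvus server
# would use its own uri/token instead (both values here are illustrative).
config.set_provider_config("vector_db", "Milvus", {"uri": "./milvus.db", "token": ""})

# Apply the configuration before loading or querying data.
init_config(config=config)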
Instructions for use
- Configure the LLM: in the deepsearcher.configuration module, use the set_provider_config method to configure the LLM. For example, to configure the OpenAI model:
config.set_provider_config("llm", "OpenAI", {"model": "gpt-4o"})
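In the project's examples this call is typically made on a Configuration instance and then applied with init_config. The sketch below follows that pattern; the DeepSeek and embedding model names are illustrative assumptions, and other supported providers are selected the same way.
from deepsearcher.configuration import Configuration, init_config

config = Configuration()

# Select a different LLM provider with the same method (model name is an assumption).
config.set_provider_config("llm", "DeepSeek", {"model": "deepseek-reasoner"})

# Embedding models are configured the same way (provider and model names are assumptions).
config.set_provider_config("embedding", "OpenAIEmbedding", {"model": "text-embedding-ada-002"})

# Apply the configuration so subsequent loading and querying use these providers.
init_config(config=config)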
- Load local data: use the load_from_local_files method in the deepsearcher.offline_loading module to load local data:
load_from_local_files(paths_or_directory="your_local_path")
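For example, a whole folder of documents can be indexed in one call. Whether the function also accepts an optional collection name is version-dependent, so treat the extra keyword below as an assumption and check your installed version's signature; the path is illustrative.
from deepsearcher.offline_loading import load_from_local_files

# Index every supported document under a local folder into the vector database.
load_from_local_files(
    paths_or_directory="./company_docs",   # illustrative path
    collection_name="company_docs",        # assumed optional argument; may not exist in all versions
)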
- Query data: use the query method in the deepsearcher.online_query module to run a query (a combined end-to-end sketch follows this list):
result = query("Write a report about xxx.")
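Putting the three steps together, a complete run looks roughly like the sketch below. The model name and data path are assumptions, and what query returns (a plain answer string or a structure that also includes retrieval details) varies by version, so inspect the result in your environment.
from deepsearcher.configuration import Configuration, init_config
from deepsearcher.offline_loading import load_from_local_files
from deepsearcher.online_query import query

# 1. Configure providers and apply the configuration.
config = Configuration()
config.set_provider_config("llm", "OpenAI", {"model": "gpt-4o"})  # model name is an assumption
init_config(config=config)

# 2. Index the private documents (path is illustrative).
load_from_local_files(paths_or_directory="./company_docs")

# 3. Ask a question against the indexed data.
result = query("Write a report about xxx.")
print(result)  # may be a string or include retrieval details, depending on the version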
Detailed feature workflow
- Private Data Search:
- Maximize the use of data within the enterprise while ensuring data security.
- Online content can be integrated when more accurate answers are needed.
- Vector Database Management:
- Supports vector databases such as Milvus, which allows data partitioning for more efficient retrieval.
- Support for more vector databases (e.g. FAISS) is planned for the future.
- Flexible embedding options:
- Compatible with multiple embedding models for easy selection of the best solution.
- Multiple large language model support:
- Supports large models such as DeepSeek and OpenAI for intelligent Q&A and content generation.
- Document Loader:
- Local file loading is supported and web crawling will be added in the future.