AI Personal Learning
and practical guidance

PrivateGPT: Document Q&A System with Fully Localized RAG Processing Flow

General Introduction

PrivateGPT is an AI project available for production environments that allows users to quiz documents using Large Language Models (LLMs) without an internet connection. The project ensures data privacy for 100%, and all data is processed in the user's execution environment without disclosure.PrivateGPT was developed by the Zylon team to provide an API to support building private, context-aware AI applications. The project follows and extends the OpenAI API standard to support both normal and streaming responses.PrivateGPT is suitable for domains that require a high degree of data privacy, such as healthcare and law.

Similar projects:Kotaemon: simple to deploy open source multimodal document quiz tool


PrivateGPT: Document interaction using GPT technology to ensure data privacy-1

 

Function List

  • Document ingestion: Manage document parsing, splitting, metadata extraction, embedding generation and storage.
  • Chat & Finish: Use the context of the ingested document for conversation and task completion.
  • Embedding Generation: Generate embedding based on text.
  • Context Block Search: ingests the most relevant blocks of text in a document based on the query returns.
  • Gradio UI Client: Provides a working client for testing the API.
  • Tools for batch model download scripts, ingestion scripts, document folder monitoring, and more.

 

Using Help

Installation process

  1. clone warehouse: First, clone PrivateGPT's GitHub repository.
   git clone https://github.com/zylon-ai/private-gpt.git
cd private-gpt
  1. Installation of dependencies: UsepipInstall the required Python dependencies.
   pip install -r requirements.txt
  1. Configuration environment: Configure environment variables and setup files as needed.
   cp settings-example.yaml settings.yaml
# Edit the settings.yaml file to configure the relevant parameters
  1. Starting services: Start the service using Docker.
   docker-compose up -d

Using the Documentation Q&A Function

  1. document ingestion: Place the documents to be processed in the specified folder and run the ingestion script.
   python scripts/ingest.py --input-folder path/to/documents
  1. Q&A Interaction: Use the Gradio UI client for Q&A interactions.
   python app.py
# Open your browser to http://localhost:7860

High-Level API Usage

  1. Document parsing and embedding generation: Document parsing and embedding generation using high-level APIs.
   from private_gpt import HighLevelAPI
api = HighLevelAPI()
api.ingest_documents("path/to/documents")
  1. Context search and answer generation: Context retrieval and answer generation using high-level APIs.
   response = api.chat("your question")
print(response)

Low-Level API Usage

  1. Embedding Generation: Generate text embedding using the low-level API.
   from private_gpt import LowLevelAPI
api = LowLevelAPI()
embedding = api.generate_embedding("your text")
  1. context block search: Context block retrieval using low-level APIs.
   chunks = api.retrieve_chunks("your query")
print(chunks)

Toolset Usage

  1. Batch Model Download: Use the Bulk Model Download script to download the required models.
   python scripts/download_models.py
  1. Documents folder monitoring: Automatically ingest new documents using the Document Folder Monitor tool.
   python scripts/watch_folder.py --folder path/to/documents
May not be reproduced without permission:Chief AI Sharing Circle " PrivateGPT: Document Q&A System with Fully Localized RAG Processing Flow

Chief AI Sharing Circle

Chief AI Sharing Circle specializes in AI learning, providing comprehensive AI learning content, AI tools and hands-on guidance. Our goal is to help users master AI technology and explore the unlimited potential of AI together through high-quality content and practical experience sharing. Whether you are an AI beginner or a senior expert, this is the ideal place for you to gain knowledge, improve your skills and realize innovation.

Contact Us
en_USEnglish