General Introduction
FastAPI DocGPT is a FastAPI-based document question-answering system that lets users upload PDF files and ask questions about their content. The system uses OpenAI's embedding technology to embed document content and stores the embeddings in the Qdrant vector database, which enables question answering over the documents. Users upload documents and ask questions through the API, and the system returns answers based on the document content.
Function List
- PDF Upload: Users can upload PDF files, which are processed and stored in the vector database.
- Question answering: Users can ask questions about the uploaded PDF content, and the system returns answers based on it.
- API Documentation: Provides auto-generated API documentation through Swagger for developers' convenience.
- Cross-origin resource sharing: Supports CORS, allowing front-end requests from different origins.
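The upload-then-ask flow listed above relies on embedding similarity: document chunks and the question are both turned into vectors, and the closest chunks are used to answer. The following is a minimal sketch of that retrieval idea only. It assumes a toy bag-of-words "embedding" and an in-memory list in place of OpenAI embeddings and Qdrant; `toy_embed`, `cosine`, and `top_chunks` are hypothetical names, not functions from the project.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


# Illustrative chunks; a real system would take these from the uploaded PDF.
chunks = [
    "Invoices must be paid within thirty days.",
    "The warranty covers parts and labour for two years.",
    "Support is available by email on weekdays.",
]
best = top_chunks("How long is the warranty?", chunks)
print(best)
```

In a real deployment the similarity search would be delegated to the vector database, and the retrieved chunks would be passed to a language model to phrase the answer.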
Usage Guide
Installation and Setup
- Clone the repository
git clone https://github.com/shaheryaryousaf/fastapi-docgpt
cd fastapi-docgpt
- Set up a virtual environment
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Configure environment variables
In the project root directory, create the .env file and add the API keys for OpenAI and Qdrant:
OPENAI_API_KEY=your-openai-api-key
QDRANT_URL=your-qdrant-url
QDRANT_API_KEY=your-qdrant-api-key
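A quick way to catch a missing key before starting the server is to check the .env contents up front. The sketch below is an assumption-laden stand-in: it uses a simplified standard-library parser rather than the dotenv package the project lists as a dependency, and `parse_env` and `missing_keys` are hypothetical helper names. Only the three variable names come from the setup step above.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required: tuple[str, ...]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [k for k in required if not env.get(k)]


REQUIRED = ("OPENAI_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")

# Placeholder content; in practice, read the text from the .env file.
sample = """
# placeholder values only
OPENAI_API_KEY=your-openai-api-key
QDRANT_URL=your-qdrant-url
"""
env = parse_env(sample)
print(missing_keys(env, REQUIRED))
```

With the dotenv package installed, calling `load_dotenv()` and reading `os.environ` would replace the hand-written parser.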
Usage
- Launch the FastAPI application
uvicorn app:app --reload
This starts the FastAPI application and serves it at http://127.0.0.1:8000.
- Upload a PDF file
Upload a PDF file through the /upload-pdf/ endpoint:
curl -X POST "http://127.0.0.1:8000/upload-pdf/" -F "file=@yourfile.pdf"
The system processes the PDF file and embeds its contents into the vector database.
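The same upload can be made from Python instead of curl. The sketch below builds the multipart/form-data request by hand with the standard library, using the form field name `file` and the URL from the curl command. `build_upload_request` is a hypothetical helper, and the PDF bytes are a placeholder rather than a valid document.

```python
import uuid
import urllib.request


def build_upload_request(
    filename: str,
    content: bytes,
    url: str = "http://127.0.0.1:8000/upload-pdf/",
) -> urllib.request.Request:
    """Build a multipart/form-data POST with a single form field named 'file'."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_upload_request("yourfile.pdf", b"%PDF-1.4 placeholder bytes")

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

Building the request separately from sending it makes it possible to inspect the headers and body without a running server.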
- Ask a question
Ask a question through the /ask-question/ endpoint:
curl -X POST "http://127.0.0.1:8000/ask-question/" -H "Content-Type: application/json" -d '{"question": "your question"}'
The system returns intelligent answers based on the content of the document.
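For scripted use, the question request can also be issued from Python. This sketch mirrors the curl call: a JSON body with a single `question` field, posted to the same URL. `build_question_request` is a hypothetical helper name, and because the response format is not documented here, the example only prints the raw response text.

```python
import json
import urllib.request


def build_question_request(
    question: str,
    url: str = "http://127.0.0.1:8000/ask-question/",
) -> urllib.request.Request:
    """Build a JSON POST carrying the question, as in the curl example."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_question_request("your question")

# To actually send it (requires the server to be running and a PDF uploaded):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```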
Project structure
- app.py: The main FastAPI application file, containing the API endpoints for PDF upload and question answering.
- utils.py: Contains utility functions for processing PDF files, sending embeddings to the vector database, and retrieving answers from the embeddings.
- .env: Holds the API keys for OpenAI and Qdrant.
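Processing a PDF for embedding usually involves splitting the extracted text into overlapping chunks so that each piece fits the embedding model and context is not lost at the boundaries. How utils.py does this is not described here, so the following is only an illustrative sketch with a hypothetical `chunk_text` function and arbitrary default sizes. LangChain provides its own text splitters that a real pipeline would more likely use.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap  # how far each window advances
    return [text[i:i + size] for i in range(0, len(text), step)]


# Tiny example with small sizes so the overlap is easy to see.
pieces = chunk_text("abcdefghij", size=4, overlap=2)
print(pieces)
```

Each chunk would then be embedded and stored alongside its text, so that the original passage can be returned when its vector matches a question.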
Dependencies
- FastAPI: Used to build Web APIs.
- Qdrant Client: Used to store and retrieve document embeddings.
- LangChain: Used for PDF processing and embedding.
- OpenAI: Used to generate embeddings and AI model responses.
- PyPDFLoader: Used to extract text from PDF files.
- CORS Middleware: Handles cross-origin resource sharing (CORS), allowing front-end requests from different origins.
- dotenv: Manages environment variables (such as API keys).