General Introduction
MegaParse is a versatile document parser designed to prepare documents for ingestion by large language models (LLMs). It handles text, PDF, PowerPoint, Excel, CSV and Word files, and aims to lose no information during parsing. Developed by QuivrHQ, the tool is open source and free to use, with a focus on fast and efficient parsing across all of these formats.
Function List
- Versatile parser: Supports multiple file types, including text, PDF, PowerPoint, Excel, CSV and Word documents.
- No information lost: Aims to preserve all information during parsing.
- Fast and efficient: Designed with speed and efficiency at its core.
- Open source and free: Open source project, free to use.
- Broad content support: Parses tables, tables of contents, headers, footers and images.
Three parsing modes:
- UnstructuredParser
- Visual parser (MegaParseVision) - supports multimodal models such as GPT-4V and Claude 3
- LlamaParser - Enhanced parsing capabilities via Llama Cloud
Performance:
According to the project's benchmark, MegaParseVision reaches a similarity ratio of 0.87, the highest of the three parsing modes.
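The source does not say how the similarity ratio is computed. As a rough illustration of what a score such as 0.87 can mean, the sketch below compares a parsed output with a reference text using Python's standard `difflib.SequenceMatcher`. The function name `similarity_ratio` and the sample strings are made up for this example, and the actual benchmark may use a different metric.

```python
from difflib import SequenceMatcher


def similarity_ratio(parsed: str, reference: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two texts are identical."""
    # autojunk=False stops difflib from ignoring very frequent characters,
    # which matters for long documents.
    return SequenceMatcher(None, parsed, reference, autojunk=False).ratio()


if __name__ == "__main__":
    # Hypothetical reference text and a parser output with one OCR-style error.
    reference = "# Report\n\n| Year | Revenue |\n| 2023 | 10 |\n"
    parsed = "# Report\n\n| Year | Revenue |\n| 2023 | 1O |\n"
    print(f"similarity: {similarity_ratio(parsed, reference):.2f}")
```

A score close to 1.0 means the parsed text is nearly identical to the reference, so a higher ratio suggests less content was lost or altered during parsing.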
Main application scenarios:
- Importing a variety of documents into an LLM system for processing
- Scenarios that require document formatting and content integrity to be maintained
- Batch document processing tasks
The project is under active development, with plans to add more features, such as:
- Improvements to the table inspector
- Add modular post-processing
- Add structured output support
Usage Guide
Installation process
- Install MegaParse:
    pip install megaparse
- Configure API keys: Add your OpenAI or Anthropic API key to the .env file.
- Install dependencies:
    - For images and PDF files, install poppler and tesseract.
    - If you are using a Mac, you will also need to install libmagic:
        brew install libmagic
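To make the API key step more concrete, here is a minimal sketch that reads simple KEY=VALUE lines from a .env file and copies them into the process environment using only the standard library. The helpers `read_env_file` and `load_api_keys` are hypothetical and not part of MegaParse, which may load the file differently. `ANTHROPIC_API_KEY` is assumed to be the variable name for the Anthropic key.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_keys(path: str = ".env") -> list[str]:
    """Copy values into os.environ and report which expected keys are set."""
    for key, value in read_env_file(path).items():
        os.environ.setdefault(key, value)  # existing environment variables win
    # Assumed variable names; adjust to the providers you actually use.
    wanted = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "LLAMA_CLOUD_API_KEY")
    return [name for name in wanted if os.environ.get(name)]


if __name__ == "__main__":
    print("keys available:", load_api_keys())
```

Running the check before constructing a parser makes a missing key show up as an empty list rather than as an authentication error in the middle of a parse.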
Using MegaParse
- Import MegaParse and parse a file:
    from megaparse import MegaParse
    from megaparse.parser.unstructured_parser import UnstructuredParser

    # Use the default Unstructured-based parser
    parser = UnstructuredParser()
    megaparse = MegaParse(parser)

    # Parse the document and print the result
    response = megaparse.load("./test.pdf")
    print(response)

    # Save the parsed output as Markdown
    megaparse.save("./test.md")
- Using MegaParse Vision:
    import os

    from langchain_openai import ChatOpenAI
    from megaparse import MegaParse
    from megaparse.parser.megaparse_vision import MegaParseVision

    # A multimodal model is required; the API key is read from the environment
    model = ChatOpenAI(model="gpt-4o", api_key=os.getenv("OPENAI_API_KEY"))
    parser = MegaParseVision(model=model)
    megaparse = MegaParse(parser)

    # Parse the document, print the result and save it as Markdown
    response = megaparse.load("./test.pdf")
    print(response)
    megaparse.save("./test.md")
Boosting results with LlamaParse
- Create a Llama Cloud account and get an API key.
- Change the parser to LlamaParser:
    import os

    from megaparse import MegaParse
    from megaparse.parser.llama_parser import LlamaParser

    # The Llama Cloud API key is read from the environment
    parser = LlamaParser(api_key=os.getenv("LLAMA_CLOUD_API_KEY"))
    megaparse = MegaParse(parser)

    # Parse the document, print the result and save it as Markdown
    response = megaparse.load("./test.pdf")
    print(response)
    megaparse.save("./test.md")
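For the batch processing scenario, the loop below sketches one way to run a parser over a whole folder. It is written against a plain callable so that it stays independent of any one parser. `convert_folder`, the `SUPPORTED` extension set and the output naming are all assumptions made for this example, not part of MegaParse.

```python
from pathlib import Path
from typing import Callable

# Assumed extensions for the formats the article lists.
SUPPORTED = {".txt", ".pdf", ".pptx", ".xlsx", ".csv", ".docx"}


def convert_folder(src: str, dst: str, parse: Callable[[str], str]) -> list[Path]:
    """Parse every supported file in src and write one .md file per input."""
    out_dir = Path(dst)
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    for path in sorted(Path(src).iterdir()):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue
        try:
            text = parse(str(path))
        except Exception as exc:  # keep going if a single document fails
            print(f"skipped {path.name}: {exc}")
            continue
        # Keep the original extension in the name so a.pdf and a.docx do not collide.
        target = out_dir / f"{path.name}.md"
        target.write_text(str(text), encoding="utf-8")
        written.append(target)
    return written
```

With any of the parsers shown earlier, a call might look like `convert_folder("./docs", "./out", megaparse.load)`, assuming `load` returns the parsed text for the given path.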
Use as an API
- Using the Makefile: Run the following in the project root directory:
    make dev
- Accessing the documentation: Open localhost:8000/docs in your browser to view information on the available endpoints.
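The article does not list the endpoints, so the sketch below does not call any of them. It only shows one way to list them programmatically, assuming the server publishes an OpenAPI schema at /openapi.json next to the /docs page (as FastAPI applications do by default). Both that URL and the helper names `list_endpoints` and `fetch_schema` are assumptions.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_endpoints(schema: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs found in an OpenAPI schema dictionary."""
    pairs = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() in HTTP_METHODS:
                pairs.append((method.upper(), path))
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema; the /openapi.json location is an assumption."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)
```

With the development server running, `for method, path in list_endpoints(fetch_schema()): print(method, path)` would print one line per endpoint.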