MarkPDFDown: Converting PDF Files to Markdown with a Multimodal Model

Latest AI Resources | Published 5 months ago by AI Sharing Circle

General Introduction

MarkPDFDown is an open source tool that uses a multimodal large language model to convert PDF files into Markdown. It is developed by GitHub user jorben with a simple goal: to make PDF documents easier to edit and share. The tool recognizes headings, lists, tables, and other structures in a document and produces a neatly formatted Markdown file. The project is written in Python and suits anyone who needs to turn PDF files into text. The current version relies on OpenAI's API, so users must supply their own API key. The source code is available on GitHub, and contributions are welcome.

Function List

Convert PDF files to Markdown format, preserving document structure.
Recognizes headings, paragraphs, lists, tables, and other elements.
Uses a multimodal large model to understand PDF content, which helps keep the conversion accurate.
Runs from the command line and supports batch processing of PDF files.
Open source and free of charge, users can customize the code.

Usage Guide

MarkPDFDown is a command-line tool, so you need to install it and configure its environment on your computer before using it. The detailed installation and operation steps below are written so that novice users can follow them too.

Installation process

Preparing the environment
You need a computer with Python 3.9. If you do not have it, download and install Python first.
Open a terminal and enter the following command to create a virtual environment with conda:

conda create -n markpdfdown python=3.9

Then activate the environment:

conda activate markpdfdown

Download Code
Clone MarkPDFDown's GitHub repository by typing the command in the terminal:

git clone https://github.com/jorben/markpdfdown.git

Go to the project folder:

cd markpdfdown

Installing dependencies
The project depends on several Python libraries. Run the following command to install them:

pip install -r requirements.txt

Configuring API Keys
MarkPDFDown uses OpenAI's multimodal model and requires an API key. First go to the OpenAI website to register an account and get the key.
Set the key in the terminal:

export OPENAI_API_KEY="<your-api-key>"

If you want to change the model or the API address, you can also set:

export OPENAI_DEFAULT_MODEL="<your-model-name>"
export OPENAI_API_BASE="<your-api-address>"
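
Before running a conversion, it can help to confirm that these variables are actually visible to the process. The following is a minimal sketch, not part of MarkPDFDown itself: the helper names `missing_settings` and `describe_settings` are hypothetical, and the split into required and optional variables is an assumption based on the description above (the key is needed, the model and API address are optional overrides).

```python
import os

# Variable names are the ones used in the export commands above.
REQUIRED = ("OPENAI_API_KEY",)
OPTIONAL = ("OPENAI_DEFAULT_MODEL", "OPENAI_API_BASE")


def missing_settings(environ=os.environ):
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED if not environ.get(name, "").strip()]


def describe_settings(environ=os.environ):
    """Return one status line per variable, never echoing the key itself."""
    lines = []
    for name in REQUIRED + OPTIONAL:
        value = environ.get(name, "").strip()
        if not value:
            shown = "(not set)"
        elif name == "OPENAI_API_KEY":
            shown = "(set, hidden)"
        else:
            shown = value
        lines.append(f"{name}: {shown}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe_settings()))
    for name in missing_settings():
        print(f"Missing required variable: {name}")
```

Run it in the same terminal session where you exported the variables; a new terminal window does not inherit them.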

Verify Installation
Enter the following command:

python main.py --help

If a help message is displayed, the installation was successful.

How to use

Once installed, MarkPDFDown is simple to operate and is driven mainly from the command line. The specific steps follow.

Convert entire PDF files

Suppose you have a PDF file such as tests/input.pdf and want to convert it to a Markdown file named output.md. Type in the terminal:

python main.py < tests/input.pdf > output.md

After the command finishes, output.md appears in the current folder with the converted Markdown content.

Convert specific pages of a PDF

If you want to convert only certain pages, such as pages 2 through 5, enter:

python main.py 2 5 < tests/input.pdf > output.md

The first number is the start page and the second is the end page. Page numbers are counted from 1.
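
If you call the tool from your own Python script, the same invocation can be wrapped with `subprocess`. This is a sketch under stated assumptions: `build_command` and `convert` are hypothetical helper names, the script is run from the project folder so that `main.py` resolves, and the range checks are done by the wrapper itself because the article does not say how `main.py` reacts to a missing or invalid page number.

```python
import subprocess
import sys


def build_command(start=None, end=None):
    """Build the argument list for main.py, optionally with a page range."""
    if start is None and end is None:
        return [sys.executable, "main.py"]
    if start is None or end is None:
        raise ValueError("give both a start page and an end page, or neither")
    if start < 1 or end < start:
        raise ValueError("pages are counted from 1 and start must not exceed end")
    return [sys.executable, "main.py", str(start), str(end)]


def convert(pdf_path, md_path, start=None, end=None):
    """Run the converter with the PDF on stdin and the Markdown on stdout."""
    command = build_command(start, end)
    with open(pdf_path, "rb") as src, open(md_path, "wb") as dst:
        # Mirrors: python main.py START END < pdf_path > md_path
        subprocess.run(command, stdin=src, stdout=dst, check=True)
```

For example, `convert("tests/input.pdf", "output.md", 2, 5)` corresponds to the page-range command shown above, and `convert("tests/input.pdf", "output.md")` to the whole-file command.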

Running with Docker

Don't want to install a Python environment? Make sure Docker is installed on your computer and run:

docker run -i -e OPENAI_API_KEY="<your-api-key>" jorben/markpdfdown < tests/input.pdf > output.md

This converts the file directly through the Docker container.

Functional operation details

Core Functions: PDF to Markdown
Drag the PDF file into the terminal window to fill in its path, or type the file path directly, and the tool analyzes the content automatically. Headings become #, ##, and so on; lists are written with -; tables are output in Markdown table format.
For example, a PDF with the title "Introduction" and the body "This is the content" may be converted to:

# Introduction
This is the content
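
A quick way to sanity-check a converted file is to count the structural lines it contains and compare them with what you expect from the PDF. The sketch below is an illustration only: `summarize_markdown` is a hypothetical helper, and it relies on the conventions described above (headings start with #, list items with -, table rows with |). It does not parse Markdown fully, so lines inside code blocks are counted as well, and a table's separator row counts as a table row.

```python
import re


def summarize_markdown(text):
    """Count heading, list-item, and table-row lines in Markdown text."""
    counts = {"headings": 0, "list_items": 0, "table_rows": 0}
    for line in text.splitlines():
        stripped = line.lstrip()
        if re.match(r"#{1,6}\s", stripped):
            counts["headings"] += 1
        elif stripped.startswith("- "):
            counts["list_items"] += 1
        elif stripped.startswith("|"):
            counts["table_rows"] += 1
    return counts


if __name__ == "__main__":
    sample = "# Introduction\nThis is the content\n"
    print(summarize_markdown(sample))
```

A result with zero headings for a document that clearly has chapter titles is a hint that the conversion deserves a closer look.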

Batch Processing
If there are a lot of PDF files, you can write a script to call the command in a loop. For example, on Linux:

for file in *.pdf; do python main.py < "$file" > "${file%.pdf}.md"; done
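
The shell loop above keeps going when one file fails and leaves a partial .md file behind. A Python version can record failures instead. This is a minimal sketch, assuming that `main.py` exits with a non-zero status when a conversion fails (the article does not confirm this); `plan_batch` and `run_batch` are hypothetical names, and the script is expected to run from the project folder.

```python
import subprocess
import sys
from pathlib import Path


def plan_batch(folder):
    """Pair every .pdf in folder with a .md path of the same base name."""
    pdfs = sorted(Path(folder).glob("*.pdf"))
    return [(pdf, pdf.with_suffix(".md")) for pdf in pdfs]


def run_batch(folder, main_script="main.py"):
    """Convert each PDF in turn; return the files whose conversion failed."""
    failed = []
    for pdf, md in plan_batch(folder):
        with open(pdf, "rb") as src, open(md, "wb") as dst:
            # Same as: python main.py < file.pdf > file.md
            result = subprocess.run(
                [sys.executable, main_script], stdin=src, stdout=dst
            )
        if result.returncode != 0:
            md.unlink()  # drop the partial output so a rerun starts clean
            failed.append(pdf)
    return failed
```

Calling `run_batch(".")` processes the current folder one file at a time and returns the list of PDFs that need another attempt.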

Debugging and Improvement
Not satisfied with the conversion results? Open an issue on GitHub or modify the code yourself. The project is written in Python, and the logic is all in the main.py file.

Caveats

The file path must not contain Chinese characters; otherwise the tool may report an error.
Keep your API key secret and do not disclose it to others.
Large files may take longer to process, so make sure your network connection is stable.
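
The path caveat can be checked before a conversion starts. The sketch below is an assumption-laden illustration: `non_ascii_parts` is a hypothetical helper, and it flags every non-ASCII character rather than only Chinese ones, which is stricter than the caveat requires. It inspects the path exactly as given, so pass an absolute path if the parent folders matter.

```python
from pathlib import Path


def non_ascii_parts(path):
    """Return the components of the given path that contain non-ASCII text."""
    return [part for part in Path(path).parts if not part.isascii()]


if __name__ == "__main__":
    for candidate in ["tests/input.pdf", "文档/input.pdf"]:
        flagged = non_ascii_parts(candidate)
        status = "ok" if not flagged else f"rename: {flagged}"
        print(f"{candidate} -> {status}")
```

If a component is flagged, renaming or copying the file to an ASCII-only location before converting avoids the problem.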

Application Scenarios

Academic Research
Students and researchers often need to convert papers from PDF to Markdown for easy note-taking or sharing. MarkPDFDown preserves the structure of a paper, such as headings and tables, so it can be edited directly in Markdown.
Documentation
Companies often have many PDF manuals or reports that they want to archive as Markdown. This tool can convert them in batches, after which they can be uploaded to GitHub or Notion.
Technical Writing
Technical blog posts often quote material from PDFs. Convert the PDF and paste the result into a Markdown editor to avoid reorganizing it by hand.

FAQ

Do I need an internet connection?
Yes. The tool relies on OpenAI's API and must be online to work.
Does it support Chinese PDFs?
Yes. As long as the PDF is in text format (not a scanned image), Chinese content is converted properly.
What if there is a conversion error?
Check whether the API key is correct and whether the PDF file is corrupt. If that doesn't help, open an issue on GitHub.
Can I use it offline?
Not at the moment. Local models may be supported in the future, but for now the tool depends on OpenAI's service.
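
For the conversion-error question above, one inexpensive check is whether the input is a PDF at all. PDF files carry a "%PDF-" marker at the start, so a file without it was probably mislabeled or truncated. The sketch below is a heuristic only: `looks_like_pdf` is a hypothetical helper, the 1024-byte search window is an assumption that tolerates a little leading junk, and a file that passes can still be damaged further in.

```python
def looks_like_pdf(path, window=1024):
    """Return True if the "%PDF-" marker appears near the start of the file."""
    with open(path, "rb") as handle:
        return b"%PDF-" in handle.read(window)
```

For example, `looks_like_pdf("tests/input.pdf")` returning False suggests fixing the input file before spending API calls on it.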