General Introduction
PPTX2MD is an open source tool that converts PowerPoint PPTX files to Markdown. Developed by GitHub user ssine, it preserves headings, lists, text formatting (bold, italic, color, and hyperlinks), images, and tables. PPTX2MD also supports a custom table of contents, fuzzy matching, and several output formats: Markdown, TiddlyWiki's wikitext, Madoko, and Quarto. To use it, install Python 3.10 or later, install pptx2md via pip, and convert your PPTX files to Markdown for use in any Markdown editor.
Function List
- Converts PPTX files to Markdown format
- Retains headings, lists, and text formatting (bold, italic, color, and hyperlinks)
- Extracts images and inserts them using relative paths
- Converts tables, including merged cells
- Supports a custom table of contents and fuzzy matching
- Supports multiple output formats: Markdown, TiddlyWiki's wikitext, Madoko, Quarto
- Provides command line parameters for customizing the output file path, image directory, image width, and more
Usage Guide
Installation process
- Ensure that Python 3.10 or later is installed on your system.
- Open a terminal or command prompt and run the following command to install pptx2md:
pip install pptx2md
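The two prerequisites above can be checked from Python before converting anything. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `pptx2md_on_path` are illustrative and are not part of pptx2md.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the installation steps


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter meets the minimum version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def pptx2md_on_path():
    """Return True if a `pptx2md` executable can be found on PATH."""
    return shutil.which("pptx2md") is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("pptx2md found:", pptx2md_on_path())
```

If the second check fails right after installation, the directory where pip places scripts may not be on PATH in the current shell.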
Usage
- After the installation is complete, run the following command in the terminal or command prompt to convert PPTX files to Markdown format:
pptx2md [pptx filename]
The default output file name is out.md.
The extracted images will be saved in the /img/ folder.
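The basic conversion can also be scripted by calling the command from Python. This is a sketch under a few assumptions: the input name `slides.pptx` is a placeholder, the function names are illustrative, and the default output `out.md` is assumed to be written relative to the current working directory.

```python
import shutil
import subprocess
from pathlib import Path


def basic_command(pptx_path):
    """Build the argument list for the basic `pptx2md [pptx filename]` call."""
    return ["pptx2md", str(pptx_path)]


def convert_with_defaults(pptx_path):
    """Run the conversion and return the default output path, or None if skipped."""
    if shutil.which("pptx2md") is None or not Path(pptx_path).is_file():
        return None  # tool not installed or input file missing; nothing is run
    subprocess.run(basic_command(pptx_path), check=True)
    return Path("out.md")  # default output name described above (assumed relative to cwd)


if __name__ == "__main__":
    # "slides.pptx" is a placeholder file name.
    print(convert_with_defaults("slides.pptx"))
```

Passing the arguments as a list avoids shell quoting problems when the file name contains spaces.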
Detailed Feature Usage
- Custom titles: By default, the tool converts every PPTX title into a first-level Markdown heading. To get a hierarchical table of contents, list the headings in a file and pass that file with the -t parameter. Example:
pptx2md [filename] -t titles.txt
Sample title file (titles.txt), where leading spaces set the heading level:
Heading 1
  Heading 1.1
    Heading 1.1.1
  Heading 1.2
- Custom output file path: Use the -o parameter to specify the output file path:
pptx2md [filename] -o [output file path]
- Custom image directory: Use the -i parameter to specify the directory where extracted images are saved:
pptx2md [filename] -i [image directory]
- Set the image width: Use the --image-width parameter to set the maximum width of images (in pixels):
pptx2md [filename] --image-width [width]
- Disable image extraction: Use the --disable-image parameter to skip image extraction:
pptx2md [filename] --disable-image
- Disable special character escaping: Use the --disable-escaping parameter to leave special characters unescaped:
pptx2md [filename] --disable-escaping
- Disable presenter notes: Use the --disable-notes parameter to omit presenter notes from the output:
pptx2md [filename] --disable-notes
- Disable WMF image processing: Use the --disable-wmf parameter to skip processing of WMF format images (to avoid exceptions under Linux):
pptx2md [filename] --disable-wmf
- Disable color tags: Use the --disable-color parameter to omit HTML color tags:
pptx2md [filename] --disable-color
- Enable slide separators: Use the --enable-slides parameter to insert slide separators (for converting PPTX slides to Markdown slides):
pptx2md [filename] --enable-slides
- Detect multi-column slides: Use the --try-multi-column parameter to attempt detection of multi-column slides (slower):
pptx2md [filename] --try-multi-column
- Set the minimum text block size: Use the --min-block-size parameter to set the minimum number of characters a text block needs in order to be output:
pptx2md [filename] --min-block-size [size]
- Export to TiddlyWiki or Madoko format: Use the --wiki or --mdk parameter to output the corresponding markup language:
pptx2md [filename] --wiki
pptx2md [filename] --mdk
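The parameters listed above can be combined in a single call. The sketch below assembles such a command line from keyword options, using only the flags named in this section. The Python option names (`output`, `image_dir`, and so on) and the function `build_command` are illustrative assumptions, not part of pptx2md; the resulting list can be passed to `subprocess.run`.

```python
# Flags that take a value, keyed by illustrative Python names (the names are assumptions).
VALUE_FLAGS = {
    "titles": "-t",
    "output": "-o",
    "image_dir": "-i",
    "image_width": "--image-width",
    "min_block_size": "--min-block-size",
}

# Flags that act as simple on/off switches.
SWITCH_FLAGS = {
    "disable_image": "--disable-image",
    "disable_escaping": "--disable-escaping",
    "disable_notes": "--disable-notes",
    "disable_wmf": "--disable-wmf",
    "disable_color": "--disable-color",
    "enable_slides": "--enable-slides",
    "try_multi_column": "--try-multi-column",
    "wiki": "--wiki",
    "mdk": "--mdk",
}


def build_command(pptx_path, **options):
    """Return the pptx2md argument list for the given file and options."""
    command = ["pptx2md", str(pptx_path)]
    for name, value in options.items():
        if name in VALUE_FLAGS:
            if value is not None:
                command += [VALUE_FLAGS[name], str(value)]
        elif name in SWITCH_FLAGS:
            if value:  # only truthy switches are emitted
                command.append(SWITCH_FLAGS[name])
        else:
            raise ValueError(f"unknown option: {name}")
    return command


if __name__ == "__main__":
    # "slides.pptx" and "notes.md" are placeholder file names.
    print(build_command("slides.pptx", output="notes.md",
                        image_width=600, disable_notes=True))
```

Rejecting unknown option names keeps a typo from silently dropping a setting.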