General Introduction
ebook2audiobook is an open-source tool that converts e-books into audiobooks. It accepts many e-book formats and produces audiobooks with full chapter markers and metadata. The tool uses Calibre for e-book format conversion and Coqui's XTTSv2 and Fairseq for high-quality text-to-speech. It supports 1124 languages, including Chinese, and provides voice cloning. It has an intuitive Web GUI, runs on either CPU or GPU, and needs only 4 GB of RAM. It is suited to both personal use and batch conversion.
Function List
- Converts a variety of e-book formats, including epub, pdf, and mobi (more than 20 formats in total)
- Automatically recognizes and retains the e-book's chapter structure
- High-quality text-to-speech using the XTTSv2 engine
- Supports text-to-speech in 1124 languages
- Provides voice cloning for a customizable reading voice
- Outputs m4b files with full chapter information and metadata
- Provides a web graphical interface for simple, intuitive operation
- Supports Docker container deployment for cross-platform compatibility
- Optional GPU acceleration for faster processing
- Supports batch conversion
Usage Guide
1. Installation methods
1.1 Using Docker (recommended)
Docker is the easiest installation method and provides a consistent, stable runtime environment.
Run command for the CPU version:
docker run -it --rm -p 7860:7860 --platform=linux/amd64 athomasson2/ebook2audiobook python app.py
Run command for the GPU version (requires an NVIDIA graphics card):
docker run -it --rm --gpus all -p 7860:7860 --platform=linux/amd64 athomasson2/ebook2audiobook python app.py
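The two commands differ only in the --gpus all flag. As a minimal sketch, a small Python helper can assemble either variant from the flags shown above. The function name docker_command and its port parameter are hypothetical and not part of ebook2audiobook.

```python
import shlex
from typing import List


def docker_command(use_gpu: bool = False, port: int = 7860) -> List[str]:
    """Build the docker run argument list shown above (illustrative helper)."""
    cmd = ["docker", "run", "-it", "--rm"]
    if use_gpu:
        # Requires an NVIDIA graphics card, as noted above.
        cmd += ["--gpus", "all"]
    cmd += [
        "-p", f"{port}:7860",  # host port : container port
        "--platform=linux/amd64",
        "athomasson2/ebook2audiobook",
        "python", "app.py",
    ]
    return cmd


# Render the command as a string so it can be inspected before running it.
cpu_command_line = shlex.join(docker_command(use_gpu=False))
```

Building the command as a list rather than a string keeps each argument separate, so it can later be passed to subprocess.run without shell quoting issues.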
1.2 Local installation
- Clone the code repository:
git clone https://github.com/DrewThomasson/ebook2audiobook.git
- Install the dependencies:
  - Python 3.x
  - Calibre (e-book conversion tool)
  - FFmpeg (audio processing tool)
  - Python packages: tts, pydub, nltk, beautifulsoup4, ebooklib, tqdm
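Before running a local installation, it can help to confirm that the external tools are available on the PATH. The sketch below is not part of the project. It assumes Calibre is reachable through its ebook-convert command-line program and FFmpeg through its ffmpeg binary; the helper name missing_tools is hypothetical.

```python
import shutil
from typing import Iterable, List

# Assumed executable names: Calibre's CLI is ebook-convert, FFmpeg's is ffmpeg.
REQUIRED_TOOLS = ("ebook-convert", "ffmpeg")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from missing_tools() suggests both programs are installed; any names it returns still need to be installed or added to the PATH.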
2. Usage methods
2.1 Using the graphical interface
- After launching the program, open http://localhost:7860 in your browser
- Upload the e-book file in the web interface
- Select the target language and, optionally, a voice sample file
- Click to start the conversion
2.2 Command line usage
Basic command format:
python app.py --headless --ebook <ebook_file_path> --language <language_code> --voice <voice_file_path>
3. Description of important parameters
- --ebook: path to the e-book file (required)
- --language: target language code (optional, defaults to English)
- --voice: path to a voice sample file (optional, used for voice cloning)
- --device: whether to run on the CPU or the GPU
- --speed: speech speed adjustment (default 1.0)
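Putting the parameters together, a headless run can be scripted. The following is a sketch under stated assumptions: the helper build_command is hypothetical, the flag names are the ones listed above, and the values accepted by --device (written here as cpu or gpu) and the exact language code format are assumptions to verify against the tool's own help output.

```python
from typing import List, Optional


def build_command(
    ebook: str,
    language: Optional[str] = None,
    voice: Optional[str] = None,
    device: Optional[str] = None,
    speed: Optional[float] = None,
) -> List[str]:
    """Assemble a headless app.py invocation; omitted options keep the tool's defaults."""
    cmd = ["python", "app.py", "--headless", "--ebook", ebook]
    if language is not None:
        cmd += ["--language", language]  # language code format is an assumption
    if voice is not None:
        cmd += ["--voice", voice]  # voice sample used for cloning
    if device is not None:
        cmd += ["--device", device]  # assumed values: "cpu" or "gpu"
    if speed is not None:
        cmd += ["--speed", str(speed)]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository directory. Only --ebook is always included, which matches its status as the one required parameter.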
4. Supported file formats
Input formats:
- epub (recommended, best supported)
- mobi
- txt
- Other formats: html, rtf, chm, lit, pdb, fb2, odt, etc.
Output format:
- m4b (audio format with chapter markers and metadata)
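A script that feeds files to the tool can pre-filter them by extension. This sketch covers only the input formats named in this document, not the full set the tool accepts; the names LISTED_INPUT_FORMATS and is_listed_format are hypothetical.

```python
from pathlib import Path

# Only the formats named explicitly in this document; the tool accepts others too.
LISTED_INPUT_FORMATS = {
    "epub", "pdf", "mobi", "txt", "html", "rtf", "chm", "lit", "pdb", "fb2", "odt",
}


def is_listed_format(path: str) -> bool:
    """Return True if the file extension is one of the formats listed above."""
    extension = Path(path).suffix.lower().lstrip(".")
    return extension in LISTED_INPUT_FORMATS
```

The check is case-insensitive, so book.EPUB and book.epub are treated the same. A file that fails the check may still be convertible, because the list above ends with "etc."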
5. Advanced functions
5.1 Voice cloning
- Prepare a sample file of the target voice at 16 kHz or 24 kHz
- Specify the path to the sample file during conversion
- The tool will read the book aloud in the target voice
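The sample rate of a voice sample can be checked before conversion. This is a minimal sketch that assumes the sample is an uncompressed WAV file (the required container format is not stated here) and uses Python's standard wave module; has_supported_rate is a hypothetical helper.

```python
import wave

# Sample rates named above for voice cloning samples.
SUPPORTED_RATES_HZ = (16000, 24000)


def has_supported_rate(wav_path: str) -> bool:
    """Return True if the WAV file's sample rate is 16 kHz or 24 kHz."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate() in SUPPORTED_RATES_HZ
```

A sample at another rate, such as 44.1 kHz, would need to be resampled first, for example with FFmpeg.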
5.2 Batch conversion
- Create the input-folder directory and put the e-book files in it
- Create the audiobooks output directory
- Process multiple files with the batch conversion command
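The batch conversion command itself is not shown here. As a rough stand-in, the sketch below walks the input-folder directory and builds one headless command per file using the --ebook flag described earlier. It illustrates the workflow and is not the tool's own batch mode; the helper name batch_commands is hypothetical.

```python
from pathlib import Path
from typing import List


def batch_commands(input_dir: str = "input-folder") -> List[List[str]]:
    """Build one headless conversion command per file in `input_dir`."""
    folder = Path(input_dir)
    if not folder.is_dir():
        return []  # nothing to convert if the directory does not exist
    commands = []
    for ebook in sorted(folder.iterdir()):
        if ebook.is_file():  # skip subdirectories
            commands.append(
                ["python", "app.py", "--headless", "--ebook", str(ebook)]
            )
    return commands
```

Each command can then be run in turn with subprocess.run(cmd, check=True). Running the files one at a time keeps memory use close to that of a single conversion.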
6. Troubleshooting common problems
- Slow conversion on CPU
  - Use GPU acceleration or a cloud service
  - Recommended: Hugging Face Space or Google Colab
- Dependency installation issues
  - Use the Docker version to avoid dependency issues
  - Check system compatibility and dependency versions
- Audio truncation issues
  - Check the formatting of the input text
  - Adjust the text segmentation parameters
  - Report language-specific issues to help improve support
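How the tool segments text internally is not described here. To illustrate what a segmentation parameter typically controls, the generic sketch below groups sentences into chunks under a character limit. The names split_text and max_chars are hypothetical, and the regex-based sentence split is a simplification that will mishandle abbreviations.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 200) -> List[str]:
    """Group sentences into chunks of up to `max_chars` characters.

    A single sentence longer than `max_chars` is kept whole as its own chunk.
    """
    # Split after ., ! or ? when followed by whitespace (a simplification).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Overly long segments are one possible cause of truncated audio, so a smaller limit may help, at the cost of more pauses between segments.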