ebook2audiobook: convert e-books to audiobooks, open-source tool that supports multilingualism and voice cloning

Latest AI Resources, published 7 months ago by AI Sharing Circle

General Introduction

ebook2audiobook is an open-source tool that converts e-books into audiobooks. It accepts multiple e-book formats and produces audiobooks with full chapter markers and metadata. The tool uses Calibre for e-book format conversion and Coqui's XTTSv2 and Fairseq for high-quality text-to-speech. It supports 1124 languages, including Chinese, and provides voice cloning. It comes with a Web GUI, runs on either CPU or GPU, and has low resource requirements, needing only 4GB of RAM. It is suited to both personal use and batch conversion.

Online demo: https://huggingface.co/spaces/drewThomasson/ebook2audiobook

Function List

Converts more than 20 e-book formats, including epub, pdf and mobi
Automatically detects and preserves the e-book's chapter structure
High-quality text-to-speech using the XTTSv2 engine
Text-to-speech in 1124 languages
Voice cloning, so the reading voice can be customized
Outputs m4b with full chapter information and metadata
Web graphical interface for simple operation
Docker container deployment for cross-platform compatibility
Optional GPU acceleration for faster processing
Batch conversion

Usage Guide

The tool can also be run for free on Google Colab.

1. Installation methods

1.1 Using Docker (recommended)

Docker is the easiest way to install the tool and gives a consistent, stable runtime environment.

CPU version run command:

docker run -it --rm -p 7860:7860 --platform=linux/amd64 athomasson2/ebook2audiobook python app.py

GPU version of the run command (requires an NVIDIA graphics card):

docker run -it --rm --gpus all -p 7860:7860 --platform=linux/amd64 athomasson2/ebook2audiobook python app.py
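The two commands above differ only in the --gpus all flag. As a minimal sketch (the function name docker_command and the host_port parameter are illustrative, not part of ebook2audiobook), the following Python helper assembles either variant from the flags shown in this article:

```python
import shlex

# Image name and flags are copied from the commands above.
IMAGE = "athomasson2/ebook2audiobook"
CONTAINER_PORT = 7860


def docker_command(use_gpu: bool = False, host_port: int = 7860) -> list[str]:
    """Build the docker run argument list for the CPU or GPU variant."""
    cmd = ["docker", "run", "-it", "--rm"]
    if use_gpu:
        # Only useful on a machine with an NVIDIA graphics card.
        cmd += ["--gpus", "all"]
    cmd += [
        "-p", f"{host_port}:{CONTAINER_PORT}",
        "--platform=linux/amd64",
        IMAGE,
        "python", "app.py",
    ]
    return cmd


if __name__ == "__main__":
    # Print both variants so they can be compared or pasted into a shell.
    print(shlex.join(docker_command(use_gpu=False)))
    print(shlex.join(docker_command(use_gpu=True)))
```

Passing a different host_port changes only the host side of the port mapping; the web interface would then be reached on that port instead of 7860.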

1.2 Local installation

Clone the code repository:

git clone https://github.com/DrewThomasson/ebook2audiobook.git

Install the dependencies:

Python 3.x
Calibre (e-book conversion tool)
FFmpeg (audio processing tool)
Python packages: tts, pydub, nltk, beautifulsoup4, ebooklib, tqdm
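Before a local run it can help to check which of these dependencies are already present. The sketch below is an assumption-laden helper, not part of the project: it assumes Calibre exposes its ebook-convert command and FFmpeg its ffmpeg command on PATH, and that the listed packages use their usual import names (TTS for tts, bs4 for beautifulsoup4).

```python
import importlib.util
import shutil

# Command-line tools: Calibre provides ebook-convert, FFmpeg provides ffmpeg.
REQUIRED_TOOLS = ["ebook-convert", "ffmpeg"]

# Package name from the list above -> import name (assumed usual names).
REQUIRED_MODULES = {
    "tts": "TTS",
    "pydub": "pydub",
    "nltk": "nltk",
    "beautifulsoup4": "bs4",
    "ebooklib": "ebooklib",
    "tqdm": "tqdm",
}


def missing_dependencies() -> dict[str, list[str]]:
    """Return the tools and packages that cannot be found on this machine."""
    missing_tools = [t for t in REQUIRED_TOOLS if shutil.which(t) is None]
    missing_packages = [
        pkg for pkg, module in REQUIRED_MODULES.items()
        if importlib.util.find_spec(module) is None
    ]
    return {"tools": missing_tools, "packages": missing_packages}


if __name__ == "__main__":
    report = missing_dependencies()
    print("Missing tools:", report["tools"] or "none")
    print("Missing packages:", report["packages"] or "none")
```

The check only reports whether each item can be located; it does not verify versions.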

2. Usage

2.1 Using the graphical interface

After launching the program, open http://localhost:7860 in your browser
Upload an e-book file in the web interface
Select the target language and, optionally, a voice sample file
Click to start the conversion

2.2 Command line usage

Basic command format:

python app.py --headless --ebook <path to e-book file> --language <language code> --voice <path to voice file>

3. Description of important parameters

--ebook: ebook file path (required)
--language: target language code (optional, default English)
--voice: voice file path (optional, for voice cloning)
--device: choose whether to use CPU or GPU
--speed: voice speed adjustment (default 1.0)
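The parameters above can be combined into a single headless invocation. The following is a minimal sketch of a command builder; build_headless_command is a hypothetical helper, and the example values ("book.epub", "en", "cpu") are assumptions, since this article does not list the exact values that --language and --device accept.

```python
import shlex
from typing import Optional


def build_headless_command(
    ebook: str,
    language: Optional[str] = None,
    voice: Optional[str] = None,
    device: Optional[str] = None,
    speed: Optional[float] = None,
) -> list[str]:
    """Assemble the argument list; only --ebook is required."""
    cmd = ["python", "app.py", "--headless", "--ebook", ebook]
    if language is not None:
        cmd += ["--language", language]
    if voice is not None:
        cmd += ["--voice", voice]  # sample file used for voice cloning
    if device is not None:
        cmd += ["--device", device]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file name and values, for illustration only.
    print(shlex.join(build_headless_command("book.epub", language="en", device="cpu")))
```

Omitted options are left out of the command entirely, so the tool's own defaults (English, speed 1.0) apply.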

4. Supported file formats

Input Format:

epub (recommended, best supported)
pdf
mobi
txt
Other formats: html, rtf, chm, lit, pdb, fb2, odt, etc.

Output format:

m4b (audio format with chapter markers and metadata)

5. Advanced functions

5.1 Voice cloning

Prepare a sample file of the target voice at 16 kHz or 24 kHz
Specify the sample file path during conversion (the --voice parameter)
The system reads the book aloud in the target voice
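A quick way to confirm that a sample meets the sample-rate requirement is to read its header. This sketch uses Python's standard wave module, so it assumes the sample is an uncompressed WAV file; the function names and the file name in the usage line are illustrative.

```python
import wave

# 16 kHz or 24 kHz, as stated in the steps above.
ACCEPTED_RATES = (16000, 24000)


def sample_rate(path: str) -> int:
    """Return the sample rate in Hz of an uncompressed WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def is_suitable_sample(path: str) -> bool:
    """True if the file's sample rate is one of the accepted rates."""
    return sample_rate(path) in ACCEPTED_RATES


if __name__ == "__main__":
    import sys

    # Usage: python check_sample.py my_voice.wav  (hypothetical file names)
    for wav_path in sys.argv[1:]:
        rate = sample_rate(wav_path)
        verdict = "ok" if rate in ACCEPTED_RATES else "resample needed"
        print(f"{wav_path}: {rate} Hz ({verdict})")
```

A file at another rate would need to be resampled first, for example with FFmpeg.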

5.2 Batch conversion

Create the input-folder directory and put the e-book files in it
Create the audiobooks output directory
Process the files with the batch conversion command
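This article does not spell out the batch conversion command itself. As a stand-in, and not the tool's own batch mode, the sketch below walks the input-folder directory and builds one headless command per e-book using only the flags documented in section 3. The suffix filter and the function name batch_commands are assumptions for illustration.

```python
import shlex
from pathlib import Path
from typing import Optional

INPUT_DIR = Path("input-folder")  # directory name from the steps above
# A subset of the input formats listed in section 4 (illustrative filter).
EBOOK_SUFFIXES = {".epub", ".pdf", ".mobi", ".txt"}


def batch_commands(input_dir: Path, language: Optional[str] = None) -> list[list[str]]:
    """Build one headless command per e-book found in input_dir."""
    commands = []
    for ebook in sorted(input_dir.iterdir()):
        if ebook.is_file() and ebook.suffix.lower() in EBOOK_SUFFIXES:
            cmd = ["python", "app.py", "--headless", "--ebook", str(ebook)]
            if language is not None:
                cmd += ["--language", language]
            commands.append(cmd)
    return commands


if __name__ == "__main__":
    if INPUT_DIR.is_dir():
        # Print the commands; run each with subprocess.run(cmd, check=True) if desired.
        for cmd in batch_commands(INPUT_DIR):
            print(shlex.join(cmd))
```

The sketch passes no output location because the article does not name a flag for one; where the resulting m4b files land depends on the tool.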

6. Troubleshooting

Slow conversion on CPU

Solution: use GPU acceleration or a cloud service
Recommended: Hugging Face Space or Google Colab

Dependency installation problems

Use the Docker version to avoid dependency issues
Check system compatibility and dependency versions

Truncated audio

Check the formatting of the input text
Adjust the text segmentation parameters
Report language-specific issues to help improve support

Article copyright AI Sharing Circle. All rights reserved; please do not reproduce without permission.
