Kokoro TTS API: Dockerized FastAPI wrapper for fast text-to-speech (Kokoro-82M model)

Latest AI Resources · published 7 months ago by AI Sharing Circle


General Introduction

Kokoro-FastAPI is a Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model. It supports NVIDIA GPU acceleration and provides request queuing and automatic stitching of audio segments, which makes speech generated from long text more efficient and coherent. The project is developed by GitHub user remsky and is publicly available on GitHub. Users send text-to-speech requests through the API and receive high-quality speech output suitable for a range of applications that need speech generation.


Function List

Provides an API wrapper for the Kokoro-82M text-to-speech model
Supports NVIDIA GPU acceleration to speed up speech generation
Queues requests so that concurrent requests can be handled
Automatically stitches audio segments so that long texts produce coherent speech output
Deploys with Docker for simplified installation and configuration
Provides sample code and documentation to help developers get started
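
The automatic stitching feature implies that long input is broken into smaller pieces before synthesis. The sketch below illustrates one way to do the text-splitting half of that idea. It is not taken from the Kokoro-FastAPI source; the function name `split_into_chunks` and the `max_chars` limit are assumptions made for illustration.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Group sentences into chunks of at most max_chars characters.

    Hypothetical helper, not the project's implementation. A single
    sentence longer than max_chars is kept whole rather than cut mid-word.
    """
    # Split after Latin sentence punctuation followed by whitespace,
    # or directly after CJK sentence punctuation (no whitespace needed).
    pattern = r"(?<=[.!?])\s+|(?<=[。！？])"
    sentences = [s.strip() for s in re.split(pattern, text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("One. Two! Three?", max_chars=10):
        print(chunk)
```

Each chunk could then be synthesized separately and the resulting audio joined in order. Breaking at sentence boundaries keeps pauses natural at the joins.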

Usage Guide

Installation process

Ensure that Docker and the NVIDIA Container Toolkit (needed for GPU access from containers) are installed.

Clone the Kokoro-FastAPI project repository:

git clone https://github.com/remsky/Kokoro-FastAPI.git

Go to the project directory and build the Docker image:
```
cd Kokoro-FastAPI
docker build -t kokoro-fastapi .
```

Start the Docker container:

docker run --gpus all -d -p 8000:8000 kokoro-fastapi
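
Because the container starts in detached mode, the API may not answer immediately while the model loads. The following sketch polls the documentation page until it responds. It assumes the service listens on http://localhost:8000 as in the command above; `wait_until_ready`, `is_up`, and the timeout values are hypothetical names and defaults, not part of the project.

```python
import time
import urllib.request
from typing import Callable


def is_up(url: str) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, as are
        # connection-refused and timeout errors.
        return False


def wait_until_ready(
    url: str = "http://localhost:8000/docs",
    timeout: float = 120.0,
    interval: float = 2.0,
    probe: Callable[[str], bool] = is_up,
) -> bool:
    """Poll url until probe succeeds or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("service ready" if wait_until_ready() else "service did not start in time")
```

The `probe` parameter exists only so the loop can be exercised without a running container.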

Using the API interface

Access the API documentation:
Open your browser and visit http://localhost:8000/docs to view the API documentation and test the interface.

Send a text-to-speech request:
Send the text in a POST request to the /generate endpoint (confirm the exact path and request fields on the /docs page for the version you are running), for example:

curl -X POST "http://localhost:8000/generate" -H "accept: application/json" -H "Content-Type: application/json" -d '{"text": "Hello, world!"}'

Get speech output:
On success, the response contains the URL of the generated audio file, which can then be downloaded or played.
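
The same request can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl command: the /generate path and the `text` field come from the example above, while the function names are hypothetical. The response is returned as parsed JSON without assuming any field names, since the article does not specify them.

```python
import json
import urllib.request


def build_tts_request(
    text: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build the POST request that mirrors the curl example."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",  # path as given in the article; verify via /docs
        data=payload,
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def request_speech(
    text: str, base_url: str = "http://localhost:8000", timeout: float = 60.0
) -> dict:
    """Send the request and return the parsed JSON response body."""
    request = build_tts_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the container from the installation steps to be running.
    print(request_speech("Hello, world!"))
```

Separating request construction from sending makes it easy to inspect the request before anything goes over the network.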

Sample code

The project provides sample code to help developers get started quickly:

The test_openai_tts.py example shows how to make a text-to-speech request using the API.

Detailed Operation Procedure

Ensure that the system meets hardware and software requirements, especially NVIDIA GPU and CUDA drivers.
Follow the installation procedure to install and start the Kokoro-FastAPI service.
Refer to the API documentation and sample code to send a text-to-speech request.
Retrieve the speech output file for further processing and use.
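
As one example of subsequent processing, several downloaded clips can be joined into a single file. The sketch below uses Python's standard `wave` module and assumes the clips are uncompressed WAV files with identical channel count, sample width, and sample rate. The article does not state the output format, so treat that as an assumption; `concat_wav` and `make_silence` are hypothetical helper names.

```python
import io
import wave


def concat_wav(clips: list[bytes]) -> bytes:
    """Join WAV clips that share the same format into one WAV file."""
    if not clips:
        raise ValueError("no clips to join")
    params = None
    frames = []
    for clip in clips:
        with wave.open(io.BytesIO(clip), "rb") as reader:
            fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                raise ValueError("clips differ in channels, sample width, or rate")
            frames.append(reader.readframes(reader.getnframes()))

    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setnchannels(params[0])
        writer.setsampwidth(params[1])
        writer.setframerate(params[2])
        writer.writeframes(b"".join(frames))
    return out.getvalue()


def make_silence(n_frames: int, rate: int = 24000) -> bytes:
    """Create a mono 16-bit silent WAV clip, used here only as demo input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


if __name__ == "__main__":
    joined = concat_wav([make_silence(100), make_silence(50)])
    with wave.open(io.BytesIO(joined), "rb") as reader:
        print(reader.getnframes())
```

If the service returns a compressed format such as MP3, this approach does not apply and a dedicated audio library would be needed instead.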

With the steps above, users can deploy Kokoro-FastAPI and use it to provide efficient, high-quality text-to-speech for a variety of applications.