General Introduction
Kokoro-FastAPI is a Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model. It supports NVIDIA GPU acceleration and provides queued request processing and automatic splicing of audio segments, so that long text is converted to speech more efficiently and coherently. The project is developed by GitHub user remsky and is publicly available on GitHub. Users send text-to-speech requests through the API and receive the generated speech, which suits applications that need speech generation.
Function List
- Provides an API wrapper for the Kokoro-82M text-to-speech model
- Supports NVIDIA GPU acceleration to improve speech generation efficiency
- Queue processing feature to support concurrent requests
- Automatic splicing function to generate coherent speech output of long texts
- Dockerized deployment for simplified installation and configuration
- Sample code and documentation to help developers get started
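To make the automatic splicing idea concrete, here is a minimal sketch of how long text might be split into sentence-sized chunks before each chunk is synthesized and the audio joined. This is an illustration only, not the project's actual implementation; the function name split_into_chunks and the max_chars parameter are hypothetical.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of roughly max_chars characters.

    Hypothetical helper: a single sentence longer than max_chars is kept
    intact rather than cut mid-sentence.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries keeps each chunk a natural unit of speech, which is one plausible way to keep the joined audio coherent.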
Usage Guide
Installation process
- Ensure that Docker and the NVIDIA Container Toolkit (GPU support for Docker) are installed.
- Clone the Kokoro-FastAPI project repository:
git clone https://github.com/remsky/Kokoro-FastAPI.git
- Go to the project directory and build the Docker image:
cd Kokoro-FastAPI
docker build -t kokoro-fastapi .
- Start the Docker container:
docker run --gpus all -d -p 8000:8000 kokoro-fastapi
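Because the container starts in detached mode, the service may take a moment before it accepts requests. The sketch below polls a URL until it responds; it assumes the port mapping from the command above, and the helper names is_up and wait_for_service are hypothetical.

```python
import time
import urllib.request


def is_up(url: str) -> bool:
    """Return True if the URL answers with HTTP 200 within two seconds."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def wait_for_service(url: str, attempts: int = 30, delay: float = 1.0,
                     probe=is_up) -> bool:
    """Call probe(url) up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

For example, wait_for_service("http://localhost:8000/docs") returns True once the documentation page is reachable, and False if it never responds within the allotted attempts.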
Using the API interface
- Access the API documentation:
Open your browser and visit http://localhost:8000/docs to view the API documentation and test the interface.
- Send a text-to-speech request:
Send the text data in a POST request to the /generate endpoint, for example:
curl -X POST "http://localhost:8000/generate" -H "accept: application/json" -H "Content-Type: application/json" -d '{"text": "Hello, world!"}'
- Get speech output:
On a successful request, the response contains the URL of the generated audio file, which can then be downloaded or played.
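The same request can be made from Python using only the standard library. This is a sketch that assumes the /generate route and the {"text": ...} body shown in the curl example, and that the response is a JSON object; the route and response schema of a given release should be confirmed on the /docs page. The function names build_tts_request and request_speech are hypothetical.

```python
import json
import urllib.request


def build_tts_request(text: str,
                      base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",  # route taken from the curl example
        data=body,
        headers={"accept": "application/json",
                 "Content-Type": "application/json"},
        method="POST",
    )


def request_speech(text: str, base_url: str = "http://localhost:8000") -> dict:
    """Send the request and return the parsed JSON response.

    Assumption: the service replies with a JSON object; its exact fields
    are not specified here.
    """
    with urllib.request.urlopen(build_tts_request(text, base_url), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the request without a running server.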
Sample Code
The project provides sample code to help developers get started quickly:
- The test_openai_tts.py example shows how to make a text-to-speech request using the API.
Detailed Operation Procedure
- Ensure that the system meets the hardware and software requirements, in particular an NVIDIA GPU and CUDA drivers.
- Follow the installation procedure to install and start the Kokoro-FastAPI service.
- Refer to the API documentation and sample code to send a text-to-speech request.
- Retrieve the speech output file for further processing and use.
With the steps above, users can deploy Kokoro-FastAPI and use it to provide text-to-speech generation for a range of applications.