General Introduction
MockingBird is an open-source project that uses AI for fast voice cloning and text-to-speech. Given a voice sample of only about 5 seconds, it can generate arbitrary speech content in that voice. The project supports a variety of Chinese datasets and runs well on Windows and Linux. MockingBird is built on the PyTorch framework and provides easy-to-use tools and detailed installation guides for developers and researchers.
Function List
- Speech cloning: generates arbitrary speech content from a 5-second speech sample
- Text-to-speech: generates the corresponding speech from input text
- Multi-language support: supports Mandarin and multiple Chinese datasets
- Cross-platform operation: compatible with Windows and Linux systems
- Real-time processing: provides real-time speech generation
- Open source code: the code is open to facilitate secondary development and research
Usage Help
Installation Process
- Environment preparation:
- Install Python 3.7 or later.
- Install PyTorch (version 1.9.0 recommended).
- Install ffmpeg.
- Download the project:
- Open the MockingBird project page, click the green "Code" button, and select "Download ZIP" to download the project files.
- Or use the git command to download it:
git clone https://github.com/babysor/MockingBird.git
- Install dependencies:
- Go to the project directory and run
pip install -r requirements.txt
to install the necessary Python packages.
- If desired, you can use conda to create a virtual environment and install the dependencies:
conda env create -n env_name -f env.yml
and then activate the environment:
conda activate env_name
- Sound-to-sound model:
To keep the main download small, the sound-to-sound model is not included. If you need it, download it separately via the "Download model (3G)" link.
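Before moving on, the prerequisites from the environment preparation step can be verified with a short script. The following is a minimal sketch, not part of MockingBird: the function name check_environment is hypothetical, and it only checks that the Python version is at least 3.7, that a torch package is installed, and that an ffmpeg executable is on the PATH. It does not verify the PyTorch version.

```python
import importlib.util
import shutil
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the installation steps


def check_environment(min_python=MIN_PYTHON):
    """Return a dict mapping each prerequisite name to True or False.

    Hypothetical helper for illustration; not part of MockingBird.
    """
    return {
        # The running interpreter is at least the required version.
        "python": sys.version_info[:2] >= min_python,
        # PyTorch is installed (find_spec looks it up without importing it).
        "torch": importlib.util.find_spec("torch") is not None,
        # An ffmpeg executable is reachable on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

If any entry is reported as MISSING, revisit the corresponding installation step before running the toolbox.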
Usage Process
- Run the toolbox:
- Run
demo_toolbox.py
to open the toolbox interface.
- Select the speech sample file in the toolbox, enter the text, and click the Generate button to produce the corresponding speech file.
- Train a model:
- If you need to train your own model, follow the training tutorial included in the project.
- Download and prepare the training dataset, then run
train.py
to start training.
- See also the Chinese-language help file for training models.
- Remote invocation:
- MockingBird includes a web server, so speech generation can be invoked remotely.
- Configure and start the web server, then call it through its API.
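As a rough illustration of the remote invocation step, the sketch below builds an HTTP POST request with Python's standard library. Everything specific in it is an assumption: the base URL, the /api/synthesize path, the JSON "text" field, and the audio response are placeholders that this guide does not confirm, so check the web server's own code or documentation for the real route and parameters. A real cloning request would presumably also need to supply the voice sample, which is omitted here.

```python
import json
import urllib.request

# Hypothetical values: adjust them to match your own web server setup.
BASE_URL = "http://localhost:8080"
SYNTHESIZE_PATH = "/api/synthesize"  # assumed path, not confirmed by this guide


def build_synthesis_request(text, base_url=BASE_URL, path=SYNTHESIZE_PATH):
    """Build (but do not send) a POST request carrying the text to synthesize."""
    # The JSON body with a "text" field is an assumed request format.
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="output.wav", timeout=60):
    """Send the request and save the response body, assumed to be audio data."""
    request = build_synthesis_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before any network call is made.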
Common Problems
- Installation failure: Make sure your Python version meets the requirements, and check version compatibility when installing PyTorch.
- Voice quality: The quality of the speech sample and the variety of the training dataset both affect the generated speech. Use high-quality speech samples and diverse datasets for training.
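Since the speech sample matters for the result, it can help to inspect a recording before using it. The following is a minimal sketch using Python's standard wave module, which reads only uncompressed PCM WAV files. The helper names and the 5-second threshold (taken from the sample length mentioned in the introduction) are illustrative assumptions, and duration is only a basic sanity check, not a measure of audio quality.

```python
import wave

MIN_SECONDS = 5.0  # sample length mentioned in the project description


def inspect_wav(path):
    """Return (duration_seconds, sample_rate_hz, channels) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()
    return frames / rate, rate, channels


def is_long_enough(path, min_seconds=MIN_SECONDS):
    """Return True if the recording lasts at least min_seconds."""
    duration, _, _ = inspect_wav(path)
    return duration >= min_seconds
```

For example, calling inspect_wav on a recording reports its length, sample rate, and channel count, which makes it easy to spot a clip that is too short before loading it into the toolbox.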