
MockingBird: Fast Voice Cloning and Model Training, Text-to-Speech based on xtts v2 Implementation

General Introduction

MockingBird is an open-source project for fast voice cloning and text-to-speech. Given a voice sample of about 5 seconds, it can synthesize arbitrary speech in that voice. The project supports several Chinese (Mandarin) datasets and runs on Windows and Linux. It is built on the PyTorch framework and provides easy-to-use tools and detailed installation guides for developers and researchers.


Function List

  • Speech Cloning: Generate arbitrary speech content from 5-second speech samples
  • Text-to-speech: Input text to generate corresponding speech
  • Language support: Mandarin, with support for multiple Chinese datasets
  • Cross-platform operation: compatible with Windows and Linux systems
  • Real-time processing: provides real-time speech generation
  • Open source code: the code is open to facilitate secondary development and research

 

Usage Guide

Installation process

  1. Environment preparation:
    • Install Python 3.7 or later.
    • Install PyTorch (version 1.9.0 recommended).
    • Install ffmpeg.
  2. Download the project:
    • Open the MockingBird project page, click the green "Code" button, and select "Download ZIP" to download the project files.
    • Or clone it with git: git clone https://github.com/babysor/MockingBird.git
  3. Install dependencies:
    • Go to the project directory and run pip install -r requirements.txt to install the required Python packages.
    • Optionally, use conda to create a virtual environment and install the dependencies: conda env create -n env_name -f env.yml, then activate it: conda activate env_name
  4. Download the text-to-speech model:
    • To keep the main download small, the project files do not include the text-to-speech model. If you need it, download it separately via the "Download model (3G)" link.
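The prerequisites listed in the installation steps can be checked before installing the project's dependencies. The following is a minimal sketch, not part of MockingBird itself: the function name check_environment is hypothetical, and it only tests that Python is at least 3.7, that a package named torch can be found, and that an ffmpeg executable is on the PATH. It does not verify the PyTorch version.

```python
import importlib.util
import shutil
import sys

# Minimum Python version named in the installation steps.
MIN_PYTHON = (3, 7)


def check_environment(version_info=None):
    """Return a dict mapping each prerequisite name to True or False.

    This helper is hypothetical and is not shipped with MockingBird.
    """
    if version_info is None:
        version_info = sys.version_info
    return {
        # Compare only (major, minor) against the required minimum.
        "python": tuple(version_info[:2]) >= MIN_PYTHON,
        # find_spec returns None when the package is not installed.
        "torch": importlib.util.find_spec("torch") is not None,
        # shutil.which returns None when ffmpeg is not on the PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Running the script prints one line per prerequisite, so a missing item can be fixed before pip install -r requirements.txt is attempted.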

 

Usage Process

  1. Run the Toolbox:
    • Run demo_toolbox.py to open the Toolbox interface.
    • In the Toolbox, select a voice sample file, enter the text, and click the Generate button to produce the corresponding speech file.
  2. Train a model:
    • To train your own model, follow the training tutorial included with the project.
    • Download and prepare the training dataset, then run train.py to start training.
    • A Chinese-language help document for model training is also available.
  3. Remote invocation:
    • MockingBird provides a web server so that speech generation can be invoked remotely.
    • Configure and start the web server, then call it through its API.

Common Problems

  • Installation failure: Make sure your Python version meets the requirements, and check that the PyTorch version you install is compatible with it.
  • Voice quality: The quality of the voice samples and the variety of the training data both affect the generated speech. Use high-quality voice samples and diverse datasets for training.
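Because cloning works from a voice sample of about 5 seconds, it can help to confirm a recording's length before loading it into the Toolbox. The sketch below is an illustration only and is not part of MockingBird: the function names are hypothetical, it uses Python's standard wave module, and it therefore assumes the sample is an uncompressed WAV file. It measures duration only and says nothing about noise or recording quality.

```python
import wave

# Sample length mentioned in the article, in seconds.
MIN_SECONDS = 5.0


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of audio frames / frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path, min_seconds=MIN_SECONDS):
    """Return True if the WAV file lasts at least min_seconds."""
    return wav_duration_seconds(path) >= min_seconds
```

For example, is_long_enough("sample.wav") returns False for a 3-second clip, which signals that a longer recording should be used. The file name here is a placeholder.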

 

Windows pre-packaged download (3.7G/with text-to-sound model)

