General Introduction
Retrieval-based Voice Conversion WebUI (RVC) is a simple, easy-to-use voice conversion framework based on VITS. It can convert between arbitrary speakers, which covers uses such as song covers and real-time voice changing. It features low latency, high conversion quality, and training on a small amount of data. It supports acceleration on NVIDIA, AMD, and Intel GPUs, and provides both a web interface and a real-time voice conversion interface. It can also call the UVR5 model to quickly separate vocals from accompaniment, and it uses the RMVPE vocal pitch extraction algorithm to eliminate the muted-voice problem.
The base (pre-trained) model is trained on close to 50 hours of the open-source, high-quality VCTK dataset, so it carries no copyright concerns.
Stay tuned for the RVCv3 base model, which is planned to have more parameters, more training data, better results, essentially the same inference speed, and a smaller training-data requirement.
Feature List
- Train your own voice conversion model with as little as 10 minutes of speech data
- Use pre-trained voice conversion models, with support for multiple sample rates and voice tones
- Convert voices through the web interface or the real-time voice-changing interface, with end-to-end low latency
- Use the UVR5 model to separate vocals from backing tracks, with support for multiple audio file formats
- Use the RMVPE algorithm to extract vocal pitch, with support for PyTorch, ONNX, and DirectML
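Before training, it can help to confirm that a folder of recordings reaches the 10-minute figure mentioned above. The following is a minimal sketch using only the Python standard library. It is not part of the project: the function names are hypothetical, it assumes the recordings are uncompressed PCM WAV files in a single folder, and it treats 10 minutes as a rough lower bound rather than a guarantee of quality.

```python
import wave
from pathlib import Path

# The "as little as 10 minutes" figure from the feature list, in seconds.
MIN_SECONDS = 10 * 60


def wav_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_seconds(folder):
    """Sum the durations of all .wav files directly inside `folder`."""
    return sum(wav_seconds(p) for p in sorted(Path(folder).glob("*.wav")))


def has_enough_audio(folder, minimum=MIN_SECONDS):
    """Return True if the folder holds at least `minimum` seconds of audio."""
    return dataset_seconds(folder) >= minimum
```

Calling `has_enough_audio("my_recordings")` on a hypothetical folder returns `True` once the recordings add up to 10 minutes or more. Other formats such as MP3 or FLAC would need a different reader.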
Usage
- Download or clone this repository, then install the required dependencies and pre-trained models
- Run go-web.bat or go-realtime-gui.bat and select the action you want to perform
- Following the interface prompts, select the input and output audio files or devices, then adjust the parameters and options
- Click start or stop and enjoy voice conversion!
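The launch step above can also be scripted. The sketch below is an illustration, not something the project ships. The two script names come from the steps above, while the mode labels `"web"` and `"realtime"`, the function names, and the assumption that the scripts sit in the repository root are all hypothetical. The `launch` helper assumes a Windows machine, since `.bat` files are run through `cmd`.

```python
import subprocess
from pathlib import Path

# Script names are taken from the usage steps; the mode labels are hypothetical.
LAUNCHERS = {"web": "go-web.bat", "realtime": "go-realtime-gui.bat"}


def find_launcher(repo_dir, mode):
    """Return the path of the launcher script for `mode` inside `repo_dir`."""
    if mode not in LAUNCHERS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(LAUNCHERS)}")
    script = Path(repo_dir) / LAUNCHERS[mode]
    if not script.is_file():
        raise FileNotFoundError(f"{script} not found; is this the repository root?")
    return script


def launch(repo_dir, mode):
    """Start the chosen interface (Windows only) and return the process handle."""
    script = find_launcher(repo_dir, mode)
    # Run from the script's own folder so its relative paths resolve.
    return subprocess.Popen(["cmd", "/c", script.name], cwd=str(script.parent))
```

For example, `launch(r"C:\path\to\repo", "web")` would start the web interface if the repository was unpacked at that (placeholder) location. Checking for the script first gives a clearer error than a failed `cmd` call when the path is wrong.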