Wav2Lip: open source high-precision mouth synchronization generation tool (recommended)



General Introduction

Wav2Lip is an open-source, high-precision lip sync tool that accurately synchronizes the lip movements in a video with arbitrary audio. Presented at ACM Multimedia 2020 by Rudrabha Mukhopadhyay et al., it uses AI techniques to produce high-quality lip synchronization across a wide range of real-world videos. Suitable for research, academic, and personal use, Wav2Lip provides complete training code, inference code, and pre-trained models.

The project has not been updated in a long time. A more recently optimized version is Easy-Wav2Lip, a tool for high-quality video lip sync built on Wav2Lip. For an example of how Wav2Lip is integrated into another tool, see Translation Starter, an open-source tool for video content translation, language conversion, and lip synchronization.

A hosted version of Wav2Lip is offered for free by Sync Labs.

Colab notebooks:
https://colab.research.google.com/drive/1IjFW1cLevs6Ouyu4Yht4mnR4yeuMqO7Y#scrollTo=Qgo-oaI3JU2u
https://colab.research.google.com/drive/1tZpDWXz49W6wDcTprANRGLo2D_EbD5J8?usp=sharing

Function List

High-precision lip sync: Accurately synchronizes the lip movements in a video with any audio.
Multi-language support: Works with a variety of languages and voices, including CGI faces and synthesized voices.
Open source and free: The code is fully public, and users are free to use and modify it.
Interactive demo: Provides an online demo where users can upload video and audio files to try it out.
Pre-trained models: Provides several pre-trained models that users can use directly or fine-tune further.
Complete training code: Includes training code for both the lip-sync discriminator and the Wav2Lip model.

Usage Guide

Installation process

Clone the repository:

git clone https://github.com/Rudrabha/Wav2Lip

Install dependencies:

pip install -r requirements.txt

Download the pre-trained models: Place each model in its expected location, e.g. the face detection model at face_detection/detection/sfd/s3fd.pth.

Run the inference code:

python inference.py --checkpoint_path <ckpt> --face <video.mp4> --audio <an-audio-source>
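
Before starting a long inference run, it can help to confirm that the downloaded weights and the input files are actually in place. The following is a minimal sketch, not part of Wav2Lip itself: the helper names `missing_inputs` and `build_inference_command` are hypothetical, and the sketch assumes the face detection weights live at the path mentioned above, relative to the repository root. It uses only the three flags shown in the command above.

```python
import sys
from pathlib import Path

# Path taken from the installation steps above, relative to the repository root.
FACE_DETECTOR_WEIGHTS = Path("face_detection/detection/sfd/s3fd.pth")


def missing_inputs(repo_root, checkpoint, face, audio):
    """Return the required files that do not exist (hypothetical helper)."""
    required = [
        Path(repo_root) / FACE_DETECTOR_WEIGHTS,
        Path(checkpoint),
        Path(face),
        Path(audio),
    ]
    return [p for p in required if not p.is_file()]


def build_inference_command(checkpoint, face, audio):
    """Build the argument list for inference.py (hypothetical helper).

    Only the flags shown in the command above are used.
    """
    return [
        sys.executable, "inference.py",
        "--checkpoint_path", str(checkpoint),
        "--face", str(face),
        "--audio", str(audio),
    ]
```

If `missing_inputs(...)` returns an empty list, the command from `build_inference_command(...)` can be passed to `subprocess.run(cmd, cwd=repo_root, check=True)` so that it runs from the repository root.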

Usage Process

Access the local server: Open http://localhost:3000.
Enter a prompt: Type a description of the image you want to generate into the input box; the image is generated in real time.
View and download images: The generated images are displayed on the page; a download button will be added in a future version.
Use Consistency Mode: Enable Consistency Mode to generate consistent images, keeping the background or main objects the same.
View image history: Use the Image History feature to view all generated images and navigate between them.

Advanced Features

Enhanced prompts: Refine the generated results with the enhanced prompt options.
Select model: Choose a different AI model according to your needs.
Custom development: Because Wav2Lip is open source, users can extend or adapt it to their own needs.