SadTalker: Make Photos Talk | Audio-Driven Lip Sync | Lip-Synced Video Synthesis | Free Digital People

General Introduction

SadTalker is an open source tool that combines a single still portrait photo with an audio file to create realistic talking head videos, for uses such as personalized messages and educational content. It relies on two learned components, ExpNet and PoseVAE, which generate facial expressions and head movements from the audio as 3D motion coefficients, and this helps it capture subtle changes in expression and pose. SadTalker can be used in both personal and commercial projects such as messaging, teaching, or marketing.

Recommended enhancement: SVLS (SadTalker Enhanced to Generate Digital People Using Portrait Video). This newer addition upgrades the digital person from photo-generated to video-generated and makes the digital person's speech smoother through frame interpolation.

Function List

Synchronize facial movements and expressions using audio

Convert Still Portrait Photos to Motion Video
Generate lip-synced animation from audio files

Supports full-body mode and an expression enhancer

Provides a configurable WebUI interface

Can be used through Discord integration

Provides detailed development and usage documentation

Support for Windows, Linux/Unix and macOS

Using Help

Install the required Anaconda, Python and git
Follow the documentation to install the environment and download the model
Generate animations using the local WebUI or the command line interface

Notes:

Choose a clear, front-facing portrait photo for best results
Use clear audio files to ensure accurate lip syncing

Based on resources available on the web, here are the basic steps for using SadTalker:

Environment Preparation:
- If you don't have a Python environment, install Anaconda.
- Install the NVIDIA CUDA Toolkit to use GPU acceleration on computers with NVIDIA graphics cards. Processing will be slower if only the CPU is used.
Model and Library Installation:
- Download and install the required model and library files. These files usually need to be placed in specific directories, for example ./checkpoints/ or ./gfpgan/weights/.
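A quick way to confirm that the model files landed where expected is to check the directories before launching anything. The following is a minimal sketch, not part of SadTalker: the helper name `missing_model_dirs` is hypothetical, and the directory list simply mirrors the two example paths above, which may differ between releases.

```python
from pathlib import Path

# Directory names taken from the step above; the exact layout may differ
# between SadTalker releases, so treat this list as an assumption.
EXPECTED_DIRS = ("checkpoints", "gfpgan/weights")


def missing_model_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected directories that are absent or contain no files."""
    root = Path(repo_root)
    missing = []
    for rel in expected:
        path = root / rel
        # A directory counts as populated once it holds at least one file.
        has_files = path.is_dir() and any(p.is_file() for p in path.rglob("*"))
        if not has_files:
            missing.append(rel)
    return missing
```

Calling `missing_model_dirs(".")` from the repository root returns an empty list when both directories contain files. It does not verify that the files are the correct models.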
FFmpeg Installation:
- Install FFmpeg, which is required to generate videos.
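Before moving on, it can help to confirm that the command line tools installed so far are actually on PATH. This is a small sketch under stated assumptions: the helper `find_missing` is hypothetical, and using `nvidia-smi` as a hint that an NVIDIA driver is present is a heuristic, not something SadTalker itself requires.

```python
import shutil

# Executable names are assumptions: "nvidia-smi" ships with the NVIDIA driver
# and is used here only as a rough hint that a GPU is usable.
REQUIRED_TOOLS = ("git", "ffmpeg")
OPTIONAL_TOOLS = ("conda", "nvidia-smi")


def find_missing(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


# Example call:
# print("missing required:", find_missing(REQUIRED_TOOLS))
# print("missing optional:", find_missing(OPTIONAL_TOOLS))
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.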
TTS Library Installation:
- Install the edge-tts library to convert text to speech.
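Once edge-tts is installed (typically with `pip install edge-tts`), text can be turned into a driving audio file in a few lines. The sketch below uses the library's `Communicate` class and its `save` coroutine; the function name `text_to_speech`, the output file name, and the voice `en-US-AriaNeural` are illustrative choices, and the call needs network access to the speech service.

```python
import asyncio


async def text_to_speech(text, out_path, voice="en-US-AriaNeural"):
    """Synthesize `text` to an audio file with edge-tts and return the path."""
    # Imported lazily so this module loads even before edge-tts is installed.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(out_path)  # requires network access
    return out_path


# Example call (not run on import):
# asyncio.run(text_to_speech("Hello from SadTalker.", "speech.mp3"))
```

As far as I know edge-tts writes MP3 data by default; if another format is needed, FFmpeg can convert the file.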
Using the Web UI:
- Launch SadTalker's Web UI by double-clicking webui.bat.
- In the Web UI, upload the image to the specified area and set the parameters for generating the digital person.
- After the digital person video is generated, you can view the result in the interface.
Command Line Usage:
- For more control over the options, SadTalker can be run from the command line.
- When using the command line, you can run the task.sh file to generate tasks easily.
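The contents of task.sh are not shown here, so the following sketch instead builds a direct call to the repository's inference script. Everything specific in it is an assumption recalled from the SadTalker README rather than taken from this article: the script name `inference.py` and the flags `--source_image`, `--driven_audio`, `--result_dir`, `--still`, `--preprocess full`, and `--enhancer gfpgan`. Check `python inference.py --help` in your checkout before relying on them. The helper `build_inference_command` is hypothetical.

```python
import subprocess
import sys


def build_inference_command(image, audio, result_dir="results",
                            full_body=False, enhance=False):
    """Build an argument list for SadTalker's inference script (assumed flags)."""
    cmd = [sys.executable, "inference.py",
           "--source_image", str(image),
           "--driven_audio", str(audio),
           "--result_dir", str(result_dir)]
    if full_body:
        # Assumed flags for full-body mode: keep the original framing.
        cmd += ["--still", "--preprocess", "full"]
    if enhance:
        # Assumed flag for the GFPGAN-based enhancer.
        cmd += ["--enhancer", "gfpgan"]
    return cmd


# Example (run from the SadTalker repository root):
# subprocess.run(build_inference_command("face.png", "speech.wav"), check=True)
```

Building the command as a list rather than a shell string avoids quoting problems when file paths contain spaces.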
Caveats:
- Make sure the image is of good quality for best results.
- If you encounter an error such as a libiomp5md.dll conflict, try setting the environment variable KMP_DUPLICATE_LIB_OK=TRUE in app.py to resolve it.
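The environment variable workaround can be sketched as follows. The lines are meant to sit at the very top of app.py (or whichever script you launch), before any library that loads the OpenMP runtime is imported. Note that the OpenMP error message itself describes this setting as an unsupported workaround that may hide real problems, so treat it as a stopgap.

```python
import os

# Must run before torch, numpy, or any other OpenMP-linked library is imported,
# otherwise the duplicate-runtime check has already happened.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
```

A cleaner long-term fix is usually to remove the duplicate OpenMP runtime from the environment, for example by recreating the conda environment.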

The above steps are summarized based on tutorials and user experience on the web, and the exact operation may vary. It is recommended that you refer to the official SadTalker documentation and community tutorials for the most up-to-date and detailed instructions.