General Introduction
DH_live is a real-time live digital human project based on few-shot learning, aiming to provide users with a smooth, interactive live streaming experience. The project supports NVIDIA 30 and 40 series graphics cards and runs in real time at 25+ fps. Users can create and use digital humans in a few simple steps, enabling audio-driven video generation and real-time interaction.
Function List
- Real-time performance: Supports NVIDIA 30 and 40 series graphics cards for a smooth real-time interactive experience.
- Few-shot learning: The system is able to learn from a small number of examples to generate realistic responses.
- Video Preparation: Prepare the video data using the data_preparation script.
- Audio-driven generation: Supports driving digital humans with audio files to generate synchronized video.
- Real-time microphone input: Supports real-time operation via microphone.
Usage Guide
Environment creation and model file decompression
- Create a virtual environment and activate it:
conda create -n dh_live python=3.12
conda activate dh_live
- Install the dependencies:
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
- Unzip the model file:
- Linux:
cd checkpoint
cat render.pth.gz.001 render.pth.gz.002 > render.pth.gz
gzip -d -c render.pth.gz > render.pth
- Windows: Extract the checkpoint file using 7zip or WinRAR.
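The Linux commands above join the split volumes and then decompress the result. For readers without `cat` and `gzip` at hand, the same two steps can be sketched in Python using only the standard library. The function name `join_and_decompress` is hypothetical and not part of DH_live; the file names are the ones used in the commands above.

```python
import gzip
import shutil
from pathlib import Path


def join_and_decompress(parts, output_path):
    """Concatenate split gzip volumes in order, then decompress to output_path.

    Hypothetical helper, not part of DH_live.
    """
    parts = [Path(p) for p in parts]
    missing = [str(p) for p in parts if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing volume files: {missing}")

    output_path = Path(output_path)
    joined = output_path.with_name(output_path.name + ".gz")

    # Equivalent of: cat render.pth.gz.001 render.pth.gz.002 > render.pth.gz
    with open(joined, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, dst)

    # Equivalent of: gzip -d -c render.pth.gz > render.pth
    with gzip.open(joined, "rb") as src, open(output_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    return output_path


# Example call, assuming the volumes sit in ./checkpoint:
# join_and_decompress(
#     ["checkpoint/render.pth.gz.001", "checkpoint/render.pth.gz.002"],
#     "checkpoint/render.pth",
# )
```

The volumes must be passed in order, since the gzip stream is only valid once the pieces are joined in their original sequence.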
Video preparation
- Use the data_preparation.py script to prepare the video:
python data_preparation.py YOUR_VIDEO_PATH
The results will be stored in the ./video_data directory.
Running with audio files
- Make sure the audio file is a 16-bit mono .wav file with a sample rate of 16 kHz.
- Run the demo script:
python demo.py video_data/test video_data/audio0.wav 1.mp4
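The audio requirements listed above (16 kHz, 16-bit, mono) can be checked before running the demo. The following is a minimal sketch using Python's standard `wave` module; `check_wav_format` is a hypothetical helper, not part of DH_live, and it only handles uncompressed PCM .wav files.

```python
import wave


def check_wav_format(path, rate=16000, sample_width_bytes=2, channels=1):
    """Return a list of format problems; an empty list means the file conforms.

    Hypothetical helper, not part of DH_live.
    """
    problems = []
    try:
        with wave.open(str(path), "rb") as wav:
            if wav.getframerate() != rate:
                problems.append(
                    f"sample rate is {wav.getframerate()} Hz, expected {rate} Hz"
                )
            if wav.getsampwidth() != sample_width_bytes:
                problems.append(
                    f"sample width is {wav.getsampwidth() * 8}-bit, "
                    f"expected {sample_width_bytes * 8}-bit"
                )
            if wav.getnchannels() != channels:
                problems.append(
                    f"channel count is {wav.getnchannels()}, expected {channels}"
                )
    except (wave.Error, EOFError) as exc:
        # Raised for files that are not readable PCM WAV data.
        problems.append(f"not a readable PCM .wav file: {exc}")
    return problems


# Example call, using the audio path from the demo command:
# for problem in check_wav_format("video_data/audio0.wav"):
#     print(problem)
```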
Real-time operation
- Use the microphone for real-time operation:
python demo_avatar.py
Common problems
- Failed to unzip the model file: Make sure that all split-volume files (render.pth.gz.001 and render.pth.gz.002) are present and complete before decompressing.
- Incorrect audio file format: Use a .wav file that is 16-bit mono with a 16 kHz sample rate.
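When diagnosing an audio format problem, a known-good file can help separate format errors from other failures. The sketch below writes a short sine tone as a 16 kHz, 16-bit mono .wav file using only the standard library. `write_test_tone` is a hypothetical helper, not part of DH_live, and a pure tone is only a format check: it is not speech, so it should not be expected to produce meaningful mouth motion.

```python
import math
import struct
import wave


def write_test_tone(path, seconds=1.0, frequency_hz=440.0, rate=16000):
    """Write a 16-bit mono sine tone at the given sample rate; return frame count.

    Hypothetical helper, not part of DH_live.
    """
    n_frames = int(seconds * rate)
    amplitude = 0.3 * 32767  # leave headroom below the 16-bit maximum
    samples = (
        int(amplitude * math.sin(2 * math.pi * frequency_hz * i / rate))
        for i in range(n_frames)
    )
    # Pack each sample as a little-endian signed 16-bit integer.
    frames = b"".join(struct.pack("<h", s) for s in samples)

    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16-bit
        wav.setframerate(rate)
        wav.writeframes(frames)
    return n_frames


# Example call (the output file name is an arbitrary choice):
# write_test_tone("test_tone.wav", seconds=2.0)
```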