
LiveTalking: An Open-Source Real-Time Interactive Digital Human Live-Streaming System for Synchronized Audio and Video Conversations

General Introduction

LiveTalking is an open-source real-time interactive digital human system dedicated to building high-quality digital human live-streaming solutions. The project is released under the Apache 2.0 license and integrates a number of cutting-edge technologies, including ER-NeRF rendering, real-time audio/video stream processing, and lip synchronization. The system supports real-time digital human rendering and interaction, and can be used in live streaming, online education, customer service, and many other scenarios. The project has earned more than 4,300 stars and 600 forks on GitHub, reflecting strong community interest. LiveTalking pays particular attention to real-time performance and the interactive experience, and provides a complete digital human development framework by integrating AIGC technology. The project is actively updated and maintained, and is backed by comprehensive documentation, making it a solid choice for building digital human applications.


Feature List

  • Supports multiple digital human models: ernerf, musetalk, wav2lip, Ultralight-Digital-Human
  • Synchronized audio and video conversations
  • Supports voice cloning
  • Supports interrupting the digital human while it is speaking
  • Supports full-body video stitching
  • Supports RTMP and WebRTC push streams
  • Supports video scheduling: plays a custom video while the digital human is not speaking
  • Supports multiple concurrent sessions

 

Usage Guide

1. Installation Process

  1. Environment requirements: Ubuntu 20.04, Python 3.10, PyTorch 1.12, CUDA 11.3
  2. Install the dependencies:
conda create -n nerfstream python=3.10
conda activate nerfstream
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
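
After the dependencies are installed, a quick sanity check confirms that PyTorch can see the GPU (the exact version strings will vary with your environment):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"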

If you are not training an ernerf model, the following libraries do not need to be installed:

pip install "git+https://github.com/facebookresearch/pytorch3d.git"
pip install tensorflow-gpu==2.8.0
pip install --upgrade "protobuf<=3.20.1"

2. Quick Start

  1. Run SRS:
export CANDIDATE=''
docker run --rm --env CANDIDATE=$CANDIDATE -p 1935:1935 -p 8080:8080 -p 1985:1985 -p 8000:8000/udp registry.cn-hangzhou.aliyuncs.com/ossrs/srs:5 objs/srs -c conf/rtc.conf

Note: The server needs to open ports tcp: 8000, 8010, 1985 and udp: 8000.
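
On Ubuntu this can be done with ufw, for example (assuming ufw is the firewall in use; adapt to your environment):

sudo ufw allow 8000/tcp
sudo ufw allow 8010/tcp
sudo ufw allow 1985/tcp
sudo ufw allow 8000/udp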

  2. Launch the digital human:
python app.py

If you cannot access Hugging Face, run the following before launching:

export HF_ENDPOINT=https://hf-mirror.com

Open http://serverip:8010/rtcpushapi.html in your browser, enter any text in the text box, and submit it; the digital human will read the passage aloud.
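
The web page drives the digital human through a plain HTTP interface, so the same text can be submitted from the command line. A minimal sketch (the /human endpoint and its JSON fields follow the project's demo pages; verify them against your version of the code):

curl -X POST http://serverip:8010/human \
  -H "Content-Type: application/json" \
  -d '{"sessionid": 0, "type": "echo", "text": "Hello from LiveTalking"}'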

More Usage Instructions

  • Running with Docker: the installation steps above are not needed; just run:
docker run --gpus all -it --network=host --rm registry.cn-beijing.aliyuncs.com/codewithgpu2/lipku-metahuman-stream:vjo1Y6NJ3N

The code lives in /root/metahuman-stream inside the container. Run git pull first to fetch the latest code, then execute the commands from steps 2 and 3 above.
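
Inside the container, that update-and-run sequence looks roughly like this (a minimal sketch based on the path given above):

cd /root/metahuman-stream
git pull
python app.py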

3. Configuration instructions

  1. System configuration
  • Edit the config.yaml file to set the basic parameters
  • Configure the camera and audio devices
  • Set the AI model parameters and paths
  • Configure the live push-streaming parameters
  2. Digital human model preparation
  • Custom 3D models can be imported
  • Pre-built example models can be used
  • MetaHuman model import is supported
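
Many of these options can also be passed as command-line flags when launching app.py rather than edited in a file. For example (flag names such as --model and --transport follow the project README at the time of writing; treat them as assumptions and check python app.py -h for your version):

# run the musetalk model with WebRTC transport
python app.py --model musetalk --transport webrtc
# run wav2lip and push the stream to SRS over WHIP
python app.py --model wav2lip --transport rtcpush --push_url 'http://localhost:1985/rtc/v1/whip/?app=live&stream=livestream'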

Main Functions

  • Real-time synchronized audio/video conversation:
    1. Select a digital human model: choose a suitable model (e.g. ernerf, musetalk) on the configuration page.
    2. Select the audio/video transport: choose a transport method (e.g. WebRTC, RTMP) according to your requirements.
    3. Start the conversation: start the audio/video stream to hold a real-time synchronized conversation.
  • Switching digital human models:
    1. Open the settings page: on the project's run page, click the settings button.
    2. Select a new model: choose a new digital human model on the settings page and save the settings.
    3. Restart the project: restart the project to apply the new model configuration.
  • Adjusting audio/video parameters:
    1. Open the parameter settings page: on the project's run page, click the parameter settings button.
    2. Adjust the parameters: adjust audio/video parameters (e.g. resolution, frame rate) as required.
    3. Save and apply: save the settings to apply the new parameter configuration.