General Introduction
Hallo2 is an open-source project jointly developed by Fudan University and Baidu for generating high-resolution, audio-driven portrait animations. The project combines latent diffusion models with temporal alignment techniques to reach 4K resolution and up to one hour of continuous video generation. Hallo2 also supports text prompts to enhance the diversity and controllability of the generated content.
Feature List
- Audio-Driven Animation Generation: Generates portrait animations that match an input audio file.
- High-Resolution Output: Generates video at up to 4K resolution for clear picture quality.
- Long Video Generation: Produces continuous video content up to one hour long.
- Text Prompt Enhancement: Controls the expressions and motions of the generated portrait with semantic text prompts.
- Open Source: The full source code and pre-trained models are provided to facilitate secondary development.
- Multi-Platform Support: Runs on multiple platforms such as Windows and Linux.
Usage Guide
Installation Process
- System requirements:
- Operating system: Ubuntu 20.04/22.04
- GPU: Graphics card supporting CUDA 11.8 (e.g. A100)
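Before installing anything, it helps to confirm that an NVIDIA driver is present and supports CUDA 11.8:
# The CUDA version shown here is the highest the installed driver supports
nvidia-smi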
- Create a virtual environment:
conda create -n hallo python=3.10
conda activate hallo
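If activation succeeded, the interpreter should come from the new environment and report Python 3.10:
# Both commands should point at the hallo environment
python --version
which python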
- Install dependencies:
pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
sudo apt-get install ffmpeg
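After the packages are installed, it is worth checking that PyTorch was built against CUDA 11.8 and can see the GPU (run inside the activated hallo environment):
# Should print the 2.2.2 build, CUDA 11.8, and True
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
If the last value is False, the NVIDIA driver or CUDA setup needs attention before running inference.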
- Download the pre-trained models:
git lfs install
git clone https://huggingface.co/fudan-generative-ai/hallo2 pretrained_models
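The model repository is tracked with Git LFS and the weights are several gigabytes, so an interrupted clone can leave LFS pointer files instead of the real weights. If that happens, the objects can be fetched explicitly (a small sketch; the exact size and layout of the repository may differ):
cd pretrained_models
# Fetch any LFS objects that were skipped during the clone
git lfs pull
cd ..
# The directory should be several gigabytes once the weights are present
du -sh pretrained_models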
Usage Process
- Prepare the input data:
- Download and prepare the required pre-trained models.
- Prepare the source image and the driving audio file.
- Run the inference script:
python scripts/inference.py --source_image path/to/image --driving_audio path/to/audio
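For example, with a portrait photo and a speech clip (the file names below are placeholders, not assets shipped with the repository):
python scripts/inference.py --source_image examples/portrait.jpg --driving_audio examples/speech.wav
Additional options, such as where the output is written, vary with the script version; running python scripts/inference.py --help should list them if the script exposes a standard argparse interface.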
- View the generated results:
- The generated video file will be saved in the specified output directory and can be viewed using any video player.
Detailed Steps
- Download the code:
git clone https://github.com/fudan-generative-vision/hallo2
cd hallo2
- Create and activate a virtual environment:
conda create -n hallo python=3.10
conda activate hallo
- Install the necessary Python packages:
pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
- Install ffmpeg:
sudo apt-get install ffmpeg
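A quick way to confirm that ffmpeg is on the PATH:
# Prints the installed ffmpeg version and build configuration
ffmpeg -version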
- Download the pre-trained models:
git lfs install
git clone https://huggingface.co/fudan-generative-ai/hallo2 pretrained_models
- Run the inference script:
python scripts/inference.py --source_image path/to/image --driving_audio path/to/audio
- View the generated results:
- The generated video file will be saved in the specified output directory and can be viewed using any video player.
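For a command-line check, ffprobe (installed alongside ffmpeg) reports the resolution, duration, and audio track of the result; the file name below is a placeholder for whatever the script writes to the output directory:
# Print container and stream information for the generated video
ffprobe -hide_banner path/to/output/result.mp4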