DisPose: generating videos with precise control of human posture, creating dancing ladies

Latest AI Resources | Published 8 months ago | AI Sharing Circle

General Introduction

DisPose is an open-source artificial intelligence project focused on controllable character image animation. Developed by a team of researchers and released on GitHub, it uses deep learning to control character animation precisely by decomposing skeletal pose information. Its core idea is to split the sparse skeleton pose into two components, motion field guidance and keypoint correspondence, which makes the generated animation more natural, smoother, and more controllable. The project provides a complete code implementation together with pre-trained models, so researchers and developers can deploy and use the technique quickly.

Function List

Human pose detection and keypoint extraction
Motion field generation and control
Character image animation synthesis
Precise control of multiple joints
Face and hand detail handling
Batch video processing
Pose transfer and motion retargeting
Real-time pose estimation and tracking
Customizable animation control parameters
High-quality animation output

Usage Guide

1. Environment configuration

DisPose requires the following basic environment:

Python 3.10 or higher
PyTorch 2.0.1 and above
TorchVision 0.15.2 and above
CUDA 12.4 (for GPU acceleration)

Installation Steps:

# Create the conda environment
conda create -n dispose python==3.10
conda activate dispose
# Install the dependencies
pip install -r requirements.txt
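
After installation, it can help to confirm that the installed packages meet the minimum versions listed above. The following is a minimal sketch, not part of DisPose: the helper names (version_tuple, meets_minimum, check_environment) are made up for illustration, and the version parsing is deliberately simple (it ignores local suffixes such as +cu118).

```python
import re
from importlib import metadata

# Minimum versions taken from the requirements list in this article.
MINIMUMS = {"torch": "2.0.1", "torchvision": "0.15.2"}


def version_tuple(version):
    """Turn '2.0.1+cu118' into (2, 0, 1); local suffixes are ignored."""
    numbers = re.findall(r"\d+", version.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def meets_minimum(installed, required):
    """True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def check_environment(minimums=MINIMUMS):
    """Return {package: status string} for each required package."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = meets_minimum(installed, required)
        report[name] = f"{installed} ({'ok' if ok else 'below ' + required})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Running the script inside the activated dispose environment prints one status line per package.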

2. Model preparation

Download the pre-trained model weights file from Hugging Face:
- Visit https://huggingface.co/lihxxx/DisPose
- Download the DisPose.pth file
- Place the file in the ./pretrained_weights/ directory
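
A quick way to catch a misplaced weights file before launching inference is to check the expected path up front. This is a small sketch under the assumption that the script runs from the repository root; find_weights is a hypothetical helper, not a DisPose function.

```python
from pathlib import Path


def find_weights(root=".", filename="DisPose.pth"):
    """Return the weights path, or raise with a hint if it is missing."""
    path = Path(root) / "pretrained_weights" / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; download {filename} from "
            "https://huggingface.co/lihxxx/DisPose first"
        )
    return path
```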

3. Core workflow

3.1 Pose Detection

The system uses the DWPose detector for human pose detection. It recognizes the following keypoints:

Body skeleton joints (18)
Facial landmarks (68)
Hand keypoints (21 per hand)
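
To make the counts above concrete, the sketch below splits a flat list of keypoints into the three groups. The counts come from the list above; the ordering (body, face, left hand, right hand) and the helper name split_keypoints are assumptions for illustration, and the real detector output may be laid out differently.

```python
# Counts come from the list above; the ordering is an assumption.
BODY, FACE, HAND = 18, 68, 21
LAYOUT = [
    ("body", BODY),
    ("face", FACE),
    ("left_hand", HAND),
    ("right_hand", HAND),
]
TOTAL = sum(count for _, count in LAYOUT)  # 18 + 68 + 2 * 21 = 128


def split_keypoints(points):
    """Split a flat list of (x, y) points into named groups."""
    if len(points) != TOTAL:
        raise ValueError(f"expected {TOTAL} points, got {len(points)}")
    groups, start = {}, 0
    for name, count in LAYOUT:
        groups[name] = points[start:start + count]
        start += count
    return groups
```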

3.2 Image Preprocessing

# Process the reference image
ref_image = load_image(image_path)
pose_img, ref_pose = get_image_pose(ref_image)

3.3 Video Processing

# Process the video sequence
video_pose, body_points, face_points = get_video_pose(
    video_path=video_path,
    ref_image=ref_image,
    sample_stride=1
)
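
The article does not define sample_stride. A common meaning for a parameter with this name is "keep every N-th frame", and the sketch below illustrates that interpretation only; sampled_frame_indices is a hypothetical helper and is not taken from the DisPose code.

```python
def sampled_frame_indices(num_frames, sample_stride=1):
    """Indices of the frames kept when every `sample_stride`-th frame is used."""
    if sample_stride < 1:
        raise ValueError("sample_stride must be at least 1")
    return list(range(0, num_frames, sample_stride))
```

Under this assumption, a stride of 2 halves the number of frames that need pose detection, which is why a larger stride reduces memory use and processing time on long or high-resolution videos.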

3.4 Animation Generation Control

The system provides several parameters for controlling animation generation:

Motion field strength
Keypoint correspondence weights
Degree of pose transfer
Temporal smoothness
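
Of the parameters above, the smoothness control is the easiest to illustrate in isolation. The sketch below applies an exponential moving average to per-frame keypoints. This is a generic smoothing technique chosen for illustration; the article does not say how DisPose implements smoothing, and smooth_sequence and alpha are hypothetical names.

```python
def smooth_sequence(frames, alpha=0.5):
    """Exponential moving average over per-frame keypoints.

    frames: list of frames, each a list of (x, y) tuples.
    alpha: weight of the current frame; 1.0 disables smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for frame in frames:
        if not smoothed:
            # The first frame has no history, so it is kept as-is.
            smoothed.append([tuple(p) for p in frame])
            continue
        previous = smoothed[-1]
        smoothed.append([
            (alpha * x + (1 - alpha) * px, alpha * y + (1 - alpha) * py)
            for (x, y), (px, py) in zip(frame, previous)
        ])
    return smoothed
```

A lower alpha gives smoother motion at the cost of lag behind fast movements.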

4. Advanced features

Pose Transfer:
- Transfers poses from a source video to a target character
- Preserves the character's identity
- Automatically adapts to differences in body size
Action Editing:
- Supports local action modification
- Provides keyframe editing
- Adjustable movement speed and amplitude
Batch Processing:
- Supports batch video processing
- Provides parallel processing options
- Automatically optimizes resource scheduling
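
The automatic adaptation to different body sizes listed above can be pictured as rescaling the driving skeleton to the reference character's proportions. The sketch below shows one simple way to do that: match the length of a single bone and align a root joint. It is an illustration under stated assumptions, not the DisPose algorithm; rescale_pose is a hypothetical helper, and the default joint indices assume an OpenPose-style 18-joint layout (1 = neck, 2 and 5 = shoulders).

```python
import math


def rescale_pose(driving, reference, a=2, b=5, root=1):
    """Scale `driving` so bone a-b matches `reference`, then align roots.

    Poses are lists of (x, y). The indices a, b, and root are hypothetical
    defaults (shoulders and neck in an OpenPose-style layout).
    """
    ref_len = math.dist(reference[a], reference[b])
    drv_len = math.dist(driving[a], driving[b])
    if drv_len == 0:
        raise ValueError("degenerate driving pose: bone length is zero")
    scale = ref_len / drv_len
    rx, ry = reference[root]
    dx, dy = driving[root]
    # Scale every joint about the driving root, then move it to the reference root.
    return [(rx + (x - dx) * scale, ry + (y - dy) * scale) for x, y in driving]
```

A single global scale ignores differences in limb proportions; a fuller version would rescale each bone separately along the skeleton tree.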

5. Precautions

Make sure the input image is sharp and the character's pose is fully visible
At least 8 GB of GPU memory is recommended
Adjust the sample_stride parameter when processing high-resolution video
Check and update dependency versions regularly
Run a small-scale test before processing large amounts of data
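
The 8 GB GPU memory recommendation above can be checked programmatically before starting a long job. The sketch below uses PyTorch's torch.cuda.get_device_properties to read the total device memory. The helper names are hypothetical, and the threshold is interpreted here as 8 GiB (8 * 1024**3 bytes), which is an assumption.

```python
def query_total_vram_bytes(device_index=0):
    """Return total memory of a CUDA device in bytes, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device_index).total_memory


def has_enough_vram(total_bytes, min_gib=8):
    """True if the reported memory meets the recommended minimum."""
    return total_bytes is not None and total_bytes >= min_gib * 1024**3


if __name__ == "__main__":
    total = query_total_vram_bytes()
    if has_enough_vram(total):
        print("GPU memory meets the recommendation")
    else:
        print("GPU memory is below the recommendation or no CUDA device was found")
```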

6. Common problems and solutions

Memory issues:
- Release unused resources with release_memory()
- Reduce the batch size as needed
- Test at a lower resolution first
Performance Optimization:
- Enable GPU acceleration
- Use an appropriate sampling stride
- Optimize the input image resolution
Quality Improvement:
- Use high-quality reference images
- Adjust model parameters
- Apply post-processing
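
The article mentions release_memory() without showing what it does. A helper with that name typically runs Python garbage collection and clears the PyTorch CUDA cache. The sketch below shows that common pattern under a different name (release_memory_sketch) to make clear it is an assumption and not the DisPose implementation.

```python
import gc


def release_memory_sketch():
    """Run garbage collection and, if CUDA is available, clear PyTorch's cache.

    Returns the number of unreachable objects found by the collector.
    """
    collected = gc.collect()
    try:
        import torch
    except ImportError:
        return collected
    if torch.cuda.is_available():
        # Returns cached, unused blocks to the driver; live tensors are unaffected.
        torch.cuda.empty_cache()
    return collected
```

Calling it between videos in a batch run can lower peak memory use, but it does not free tensors that are still referenced.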