PSHuman: Generate a realistic 3D human model from a single photo

General Introduction

PSHuman is a tool for reconstructing a 3D portrait from a single image, built on multi-view diffusion. From one photo of a clothed person it generates detailed geometry and a realistic 3D portrait model. Its core technique is cross-scale multi-view diffusion, which is designed to produce high-quality 3D portraits quickly. The project is developed by the pengHTYX team and aims to give users an efficient, easy-to-use 3D portrait modeling solution.

Function List

Single Image 3D Portrait Reconstruction: Generate detailed 3D models from single portrait photos.
Multi-view diffusion technique: Generate high-quality 3D portraits using cross-scale multi-view diffusion.
SMPL-free version: Generates multiple views without SMPL conditioning, which suits portraits in general poses.
Background Removal: Supports removing the background using the Clipdrop or rembg tool.
Structured Output: Generated 3D models and rendered videos are saved as structured files for easy viewing and sharing.

Usage Guide

Installation process

Create a virtual environment and install dependencies:

$ conda create -n pshuman python=3.10
$ conda activate pshuman
$ pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
$ pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.1.0_cu121.html
$ pip install -r requirements.txt
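
After installation it can help to confirm that the pinned packages actually landed in the active environment. The following is a minimal sketch, not part of PSHuman itself: the helper name `check_pins` is hypothetical, and the version numbers are simply copied from the pip commands above. It uses only the Python standard library, so it runs even when a package is missing.

```python
from importlib import metadata

# Versions pinned by the pip commands above. Local build tags such as
# "+cu121" are ignored when comparing.
PINNED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "kaolin": "0.17.0",
}


def check_pins(pins):
    """Return {package: status}; status is 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Strip a local version suffix like "+cu121" before comparing.
        base = found.split("+", 1)[0]
        report[name] = "ok" if base == wanted else f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINNED).items():
        print(f"{name:12s} {status}")
```

Run it inside the activated pshuman environment. Any line reporting "missing" or "mismatch" points at an install step worth repeating.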

Set up the required models:
- Download the ECON- and SIFU-related models and organize them inside the project.

Usage Process

Background Removal: Remove the background of a portrait photo using the Clipdrop or rembg tool. For the rembg tool, you can run the following script:
```
$ python utils/remove_bg.py --path $DATA_PATH
```
Place the generated RGBA images in the input directory (the $DATA_PATH directory that is passed to inference).
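
Because inference expects RGBA input, a quick check of the input directory can catch photos whose background was never removed. The sketch below is an illustration under stated assumptions, not part of the PSHuman code: it assumes the images are saved as PNG files, and the function names are hypothetical. It reads only the PNG header (the IHDR chunk), so it needs no image library.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
RGBA_COLOR_TYPE = 6  # PNG colour type 6 = truecolour with alpha (RGBA)


def png_is_rgba(path):
    """Return True if the file is a PNG whose IHDR colour type is RGBA."""
    with open(path, "rb") as fh:
        header = fh.read(26)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width (4), height (4), bit depth (1), colour type (1).
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return False
    _width, _height, _depth, color_type = struct.unpack(">IIBB", header[16:26])
    return color_type == RGBA_COLOR_TYPE


def files_not_rgba(data_dir):
    """List the .png files in data_dir that are not RGBA, sorted by name."""
    return sorted(p.name for p in Path(data_dir).glob("*.png") if not png_is_rgba(p))
```

Calling `files_not_rgba` on the input directory returns an empty list when every PNG there has an alpha channel. Anything it lists should go through background removal again before inference.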

Running inference: Generate the textured mesh and render the video by running the inference.py script:

$ CUDA_VISIBLE_DEVICES=$GPU python inference.py --config configs/inference-768-6view.yaml \
pretrained_model_name_or_path='pengHTYX/PSHuman_Unclip_768_6views' \
validation_dataset.crop_size=740 \
with_smpl=false \
validation_dataset.root_dir=$DATA_PATH \
seed=600 \
num_views=7 \
save_mode='rgb'

Parameter tuning: If the result for a given photo is unsatisfactory, try the other suggested values of crop_size (720 or 740) and seed (42 or 600).
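
Trying each combination of the suggested crop_size and seed values by hand is tedious, so the command lines can be generated instead. This is a rough sketch: the overrides are copied from the inference command above, while the function names and the `./data` path are hypothetical placeholders. It only prints the commands. Whether successive runs overwrite each other's output is not stated in the source, so check the output directory between runs.

```python
import itertools
import shlex

# Overrides that stay fixed, copied from the inference command above.
BASE_OVERRIDES = [
    "pretrained_model_name_or_path=pengHTYX/PSHuman_Unclip_768_6views",
    "with_smpl=false",
    "num_views=7",
    "save_mode=rgb",
]


def build_command(data_path, crop_size, seed):
    """Build one inference command as an argument list."""
    return [
        "python", "inference.py",
        "--config", "configs/inference-768-6view.yaml",
        *BASE_OVERRIDES,
        f"validation_dataset.crop_size={crop_size}",
        f"validation_dataset.root_dir={data_path}",
        f"seed={seed}",
    ]


def sweep(data_path, crop_sizes=(720, 740), seeds=(42, 600)):
    """Return one command per (crop_size, seed) combination."""
    return [
        build_command(data_path, crop, seed)
        for crop, seed in itertools.product(crop_sizes, seeds)
    ]


if __name__ == "__main__":
    # "./data" is a placeholder for the directory holding the RGBA images.
    for cmd in sweep("./data"):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, prefixed with CUDA_VISIBLE_DEVICES=$GPU as in the original command.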

Main Functions

Single Image 3D Portrait Reconstruction: The user provides a portrait photo and the system will automatically generate a detailed 3D model.
Multi-view diffusion technique: Generates high-quality 3D portraits using cross-scale multi-view diffusion.
Background Removal: Supports the removal of backgrounds using the Clipdrop or rembg tools to simplify subsequent processing.
Structured Output: Generated 3D models and rendered videos are saved as structured files for easy viewing and sharing.

Detailed Operation Procedure

Provide a portrait photo: Supply a portrait photo and remove its background with one of the background removal tools.
Run the inference script: Generate the 3D model and render the video by running the inference.py script.
Adjust parameters: Change the parameters in the inference command as needed to get the best results.
View and Share: The generated 3D models and rendered videos are saved as structured files that can be directly viewed and shared by users.