General Introduction
PSHuman is a single-image 3D portrait reconstruction tool based on multi-view diffusion. It generates detailed geometry and realistic 3D portrait models from a single photo of a clothed person. Its core technique is cross-scale multi-view diffusion, which allows it to produce high-quality 3D portraits in a short time. Developed by the pengHTYX team, the project aims to provide users with an efficient and easy-to-use 3D portrait modeling solution.
Function List
- Single Image 3D Portrait Reconstruction: Generate detailed 3D models from single portrait photos.
- Multi-view diffusion technique: Generate high-quality 3D portraits using cross-scale multi-view diffusion.
- SMPL-free version: Multi-view generation without SMPL conditioning, suitable for portraits in general poses.
- Background Removal: Supports removing the background using the Clipdrop or rembg tool.
- Structured Output: Generated 3D models and rendered videos are saved as structured files for easy viewing and sharing.
Usage Guide
Installation Process
- Create a virtual environment and install dependencies:
$ conda create -n pshuman python=3.10
$ conda activate pshuman
$ pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
$ pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.1.0_cu121.html
$ pip install -r requirements.txt
- Set up the required models:
- Download the ECON- and SIFU-related models and organize them inside the project.
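After installation, it can be useful to confirm that the pinned versions are actually the ones active in the environment. The following is a minimal sketch, not part of PSHuman itself: the helper name `check_pins` is hypothetical, and it assumes the packages are registered under the distribution names `torch`, `torchvision`, `torchaudio`, and `kaolin`.

```python
from importlib import metadata

# Pins copied from the install commands above.
EXPECTED_PINS = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "kaolin": "0.17.0",
}


def check_pins(expected):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # CUDA wheels may report a local suffix such as "2.1.0+cu121",
        # so compare only the part before "+".
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


for name, (wanted, installed, ok) in check_pins(EXPECTED_PINS).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: expected {wanted}, found {installed} [{status}]")
```

Run it inside the activated `pshuman` environment. A package that is not installed is reported with `found None`.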
Usage Process
- Background Removal: Remove the background of a portrait photo using the Clipdrop or rembg tool. For the rembg tool, you can run the following script:
$ python utils/remove_bg.py --path $DATA_PATH$
Place the generated RGBA images in the data directory ($DATA_PATH$).
- Running inference: Generate the textured mesh and the rendered video by running the inference.py script:
$ CUDA_VISIBLE_DEVICES=$GPU python inference.py --config configs/inference-768-6view.yaml \
    pretrained_model_name_or_path='pengHTYX/PSHuman_Unclip_768_6views' \
    validation_dataset.crop_size=740 \
    with_smpl=false \
    validation_dataset.root_dir=$DATA_PATH$ \
    seed=600 \
    num_views=7 \
    save_mode='rgb'
- Adjusting parameters: Try different values of crop_size (720 or 740) and seed (42 or 600) to get the best result for a given photo.
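Trying each combination of crop_size and seed by hand is tedious, so the sweep can be scripted. The sketch below only builds and prints the four command lines; it reuses the overrides from the inference command above, while the helper names and the `./data/portraits` path are hypothetical placeholders.

```python
import itertools
import shlex

CROP_SIZES = (720, 740)  # values suggested in the text above
SEEDS = (42, 600)


def build_inference_cmd(data_path, crop_size, seed):
    """Return the inference.py argument list for one crop_size/seed pair."""
    return [
        "python", "inference.py",
        "--config", "configs/inference-768-6view.yaml",
        "pretrained_model_name_or_path=pengHTYX/PSHuman_Unclip_768_6views",
        f"validation_dataset.crop_size={crop_size}",
        "with_smpl=false",
        f"validation_dataset.root_dir={data_path}",
        f"seed={seed}",
        "num_views=7",
        "save_mode=rgb",
    ]


def sweep_commands(data_path):
    """Return one command per (crop_size, seed) combination."""
    return [
        build_inference_cmd(data_path, crop, seed)
        for crop, seed in itertools.product(CROP_SIZES, SEEDS)
    ]


# Dry run: print the commands instead of executing them.
for cmd in sweep_commands("./data/portraits"):  # hypothetical data path
    print(shlex.join(cmd))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)` with `CUDA_VISIBLE_DEVICES` set in the environment. Whether successive runs overwrite each other's results is not stated here, so check the output location between runs.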
Main Functions
- Single Image 3D Portrait Reconstruction: The user provides a portrait photo and the system will automatically generate a detailed 3D model.
- Multi-view diffusion technique: Generates high-quality 3D portraits using cross-scale multi-view diffusion.
- Background Removal: Supports the removal of backgrounds using the Clipdrop or rembg tools to simplify subsequent processing.
- Structured Output: Generated 3D models and rendered videos are saved as structured files for easy viewing and sharing.
Detailed Operation Procedure
- Provide a portrait photo: Supply a portrait photo and remove its background with a background removal tool.
- Run the inference script: Generate the 3D model and the rendered video by running the inference.py script.
- Adjust parameters: Adjust the parameters passed to the inference script as needed to get the best results.
- View and Share: The generated 3D models and rendered videos are saved as structured files that can be directly viewed and shared by users.
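Before running inference in the procedure above, it can help to confirm that the background-removed inputs really carry an alpha channel. The sketch below is an illustrative check using only the standard library, under the assumption that the background removal step writes PNG files; the function name `png_has_alpha` is hypothetical and not part of PSHuman.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG color types that include an alpha channel:
# 4 = grayscale + alpha, 6 = truecolor + alpha (RGBA).
ALPHA_COLOR_TYPES = {4, 6}


def png_has_alpha(path):
    """Return True if the PNG at `path` declares an alpha channel in IHDR."""
    with open(path, "rb") as f:
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    # IHDR data: width (4 bytes), height (4), bit depth (1), color type (1), ...
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return color_type in ALPHA_COLOR_TYPES
```

Calling `png_has_alpha` on each file in the data directory flags images that were saved without transparency. Palette-based PNGs that store transparency in a separate tRNS chunk are not detected by this header-only check.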