
PSHuman: Generate a realistic 3D human model from a single photo

General Introduction

PSHuman is a single-image 3D portrait reconstruction tool based on multi-view diffusion. From a single photo of a clothed person, it generates a realistic 3D portrait model with detailed geometry. Its core technique is cross-scale multi-view diffusion, which produces high-quality 3D portraits in a short time. Developed by the pengHTYX team, the project aims to give users an efficient, easy-to-use solution for 3D portrait modeling.


Function List

  • Single Image 3D Portrait Reconstruction: Generates a detailed 3D model from a single portrait photo.
  • Multi-View Diffusion: Generates high-quality 3D portraits using cross-scale multi-view diffusion.
  • SMPL-Free Version: Generates multiple views without SMPL conditioning, which suits portraits in general poses.
  • Background Removal: Supports removing the background with Clipdrop or rembg.
  • Structured Output: Generated 3D models and rendered videos are saved as structured files for easy viewing and sharing.

 

Using Help

Installation Process

  1. Create a virtual environment and install dependencies:
    $ conda create -n pshuman python=3.10
    $ conda activate pshuman
    $ pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
    $ pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.1.0_cu121.html
    $ pip install -r requirements.txt
    
  2. Set up the required models:
    • Download the ECON- and SIFU-related model files and arrange them inside the project directory.
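
After installation, it can help to confirm that the active environment really contains the pinned versions. The following is a minimal sketch, not part of PSHuman itself: the helper names version_matches and module_version are hypothetical, and it assumes that python on your PATH is the interpreter of the pshuman environment and that each package exposes a __version__ attribute. A prefix match is used because CUDA builds of torch usually report a suffixed version such as 2.1.0+cu121.

```shell
#!/usr/bin/env bash
# Sketch: compare installed package versions against the pinned ones.
# Helper names are illustrative and do not come from the PSHuman repository.

# Succeeds when ACTUAL starts with EXPECTED (so "2.1.0+cu121" matches "2.1.0").
version_matches() {
  local name="$1" actual="$2" expected="$3"
  case "$actual" in
    "$expected"*) echo "ok: $name $actual"; return 0 ;;
    *) echo "mismatch: $name is '$actual', expected $expected"; return 1 ;;
  esac
}

# Ask the active Python for a module's __version__; prints nothing if the import fails.
module_version() {
  python -c "import $1; print($1.__version__)" 2>/dev/null
}

status=0
version_matches torch "$(module_version torch)" 2.1.0 || status=1
version_matches torchvision "$(module_version torchvision)" 0.16.0 || status=1
version_matches kaolin "$(module_version kaolin)" 0.17.0 || status=1
echo "environment check finished (status=$status)"
```

A status of 1 means at least one package is missing or has a different version than the install commands above request.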

Usage Process

  1. Background removal: Remove the background from the portrait photo with Clipdrop or rembg. For rembg, run the following script, where $DATA_PATH is a placeholder for the path to your input data:
    $ python utils/remove_bg.py --path "$DATA_PATH"
    

    Place the resulting RGBA images in the directory that you pass to the inference script as validation_dataset.root_dir.

  2. Run inference: Generate the textured mesh and the rendered video by running the inference.py script:
    $ CUDA_VISIBLE_DEVICES=$GPU python inference.py --config configs/inference-768-6view.yaml \
    pretrained_model_name_or_path='pengHTYX/PSHuman_Unclip_768_6views' \
    validation_dataset.crop_size=740 \
    with_smpl=false \
    validation_dataset.root_dir="$DATA_PATH" \
    seed=600 \
    num_views=7 \
    save_mode='rgb'
    
  3. Adjust parameters: Try different values of crop_size (720 or 740) and seed (42 or 600) as needed for the best results.
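
The parameter adjustment in step 3 can be sketched as a small sweep over the suggested values. This is an illustrative dry run under stated assumptions: the function name sweep_commands is hypothetical, the default values for DATA_PATH and GPU are placeholders, and the script only prints the commands instead of running them. The options themselves are copied from the inference command above.

```shell
#!/usr/bin/env bash
# Sketch: print one inference command per (crop_size, seed) combination.
# DATA_PATH and GPU defaults are placeholders; set them for your machine.
DATA_PATH="${DATA_PATH:-./examples}"   # hypothetical input directory
GPU="${GPU:-0}"                        # hypothetical GPU index

sweep_commands() {
  local crop seed
  for crop in 720 740; do
    for seed in 42 600; do
      printf 'CUDA_VISIBLE_DEVICES=%s python inference.py --config configs/inference-768-6view.yaml pretrained_model_name_or_path=pengHTYX/PSHuman_Unclip_768_6views validation_dataset.crop_size=%s with_smpl=false validation_dataset.root_dir=%s seed=%s num_views=7 save_mode=rgb\n' \
        "$GPU" "$crop" "$DATA_PATH" "$seed"
    done
  done
}

# Dry run: list the four candidate commands for review.
sweep_commands
```

Printing first makes it easy to review the four combinations before spending GPU time. Whether successive runs overwrite each other's output is not stated here, so it may be safer to move the results aside between runs.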

Main Functions

  • Single Image 3D Portrait Reconstruction: The user provides a portrait photo, and the system automatically generates a detailed 3D model.
  • Multi-View Diffusion: Generates high-quality 3D portraits using cross-scale multi-view diffusion.
  • Background Removal: Supports removing the background with Clipdrop or rembg, which simplifies subsequent processing.
  • Structured Output: Generated 3D models and rendered videos are saved as structured files for easy viewing and sharing.

Detailed Operation Procedure

  1. Provide a portrait photo: The user supplies a portrait photo and processes it with a background removal tool.
  2. Run the inference script: Run inference.py to generate the 3D model and the rendered video.
  3. Adjust parameters: Tune the parameters of the inference script as needed to get the best results.
  4. View and Share: The generated 3D models and rendered videos are saved as structured files that can be directly viewed and shared by users.
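
Before launching inference, a quick pre-flight check on the input directory can catch a skipped background removal step. The sketch below makes several assumptions: the function names count_pngs and preflight are hypothetical, the background-removed RGBA images are assumed to be saved as PNG files, and the check only counts files without verifying that each one has an alpha channel.

```shell
#!/usr/bin/env bash
# Sketch: verify that the input directory holds PNG images before inference.
# Assumes background-removed images are stored as .png (not confirmed by the source).

# Count PNG files directly inside the given directory.
count_pngs() {
  find "$1" -maxdepth 1 -type f -name '*.png' | wc -l | tr -d ' '
}

# Print a short report; fail if the directory is missing or has no PNG files.
preflight() {
  local dir="$1" n
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir"
    return 1
  fi
  n="$(count_pngs "$dir")"
  if [ "$n" -eq 0 ]; then
    echo "no PNG images in $dir; run background removal first"
    return 1
  fi
  echo "found $n PNG image(s) in $dir"
}

# Example use: preflight "$DATA_PATH" && echo "ready to run inference.py"
```

Chaining the check with && means the inference command only starts when at least one image is present.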
May not be reproduced without permission: Chief AI Sharing Circle