General Introduction
FitDiT is a high-fidelity virtual fitting system based on Diffusion Transformers. Developed by Tencent AI Lab, the project aims to address the limitations of traditional virtual fitting systems in rendering clothing details. FitDiT proposes a new architecture that better preserves the authentic details of garments, making the virtual fitting results more realistic. The project is fully open source and provides an online demo, pre-trained models, and a full code implementation to support academic research and pre-commercial evaluation. The paper was released in November 2024, followed by the online demo, dataset, and model weights in December 2024, and the release has drawn wide attention from academia and industry.
Function List
- Fully automated virtual fitting generation
- Intelligent fitting area mask generation
- Manual mask adjustment and editing tools
- Multi-resolution fitting effect support
- Garment detail fidelity optimization
- Online demo platform support (Gradio interface)
- Local deployment support (multiple performance configurations supported)
- Complex Virtual Dressing Dataset (CVDD)
- Complete model training and inference code
- Hugging Face model hosting integration
Usage Guide
1. Online access
FitDiT can be used online in two ways:
- Hugging Face Space online demo: visit https://huggingface.co/spaces/BoyuanJiang/FitDiT
- Official online demo platform: visit http://demo.fitdit.byjiang.com/
Steps to use:
Step 1: Generate the fitting area mask
- Upload a picture of the person whose clothes you want to change
- Upload a picture of the target garment you want to try on
- Click the "Step1: Run Mask" button to generate the initial mask.
- If you need to adjust the mask range, you can:
  - Use the sliders to adjust the mask range:
    - mask offset top: adjusts the upper border
    - mask offset bottom: adjusts the lower border
    - mask offset left: adjusts the left border
    - mask offset right: adjusts the right border
  - Manually modify the masked area using the brush tool
  - Use the eraser tool to refine the edges of the mask
Step 2: Generate fitting results
- Choose the desired fitting resolution
- Click on "Step2: Run Try-on" to start the generation.
- Wait for the model to finish processing to see the fitting results
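The four mask offset sliders from Step 1 can be pictured as moving the borders of the mask's bounding box. The following is only an illustrative sketch, not FitDiT's own code: the function name `expand_mask_box`, the pixel units, and the convention that a positive offset pushes a border outward are all assumptions made for the example.

```python
def expand_mask_box(box, offsets, image_size):
    """Move the borders of a mask bounding box (hypothetical helper).

    box        -- (left, top, right, bottom) in pixels
    offsets    -- dict with optional keys "top", "bottom", "left", "right";
                  a positive value is assumed to push that border outward
    image_size -- (width, height) used to keep the box inside the image
    """
    left, top, right, bottom = box
    width, height = image_size

    # Each border moves independently and is clamped to the image bounds.
    new_left = max(0, left - offsets.get("left", 0))
    new_top = max(0, top - offsets.get("top", 0))
    new_right = min(width, right + offsets.get("right", 0))
    new_bottom = min(height, bottom + offsets.get("bottom", 0))
    return (new_left, new_top, new_right, new_bottom)


# Raise the upper border by 20 px and push the left border past the image edge.
print(expand_mask_box((100, 200, 300, 500), {"top": 20, "left": 150}, (768, 1024)))
```

In the demo itself the brush and eraser tools edit the mask freely, so a rectangular box is only a rough model of what the sliders do.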
2. Local deployment
Environment requirements:
torch==2.3.0
torchvision==0.18.0
diffusers==0.31.0
transformers==4.39.3
gradio==5.8.0
onnxruntime-gpu==1.20.1
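Before launching the demo it can help to confirm that the installed packages match the pinned versions above. The sketch below uses only the Python standard library; the helper name `check_pins` is hypothetical, and ignoring local build tags such as `+cu121` is an assumption about how strictly the pins should be read.

```python
from importlib import metadata

# Versions copied from the requirements list above.
PINNED = {
    "torch": "2.3.0",
    "torchvision": "0.18.0",
    "diffusers": "0.31.0",
    "transformers": "4.39.3",
    "gradio": "5.8.0",
    "onnxruntime-gpu": "1.20.1",
}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (wanted, found)} for every missing or mismatched package."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Compare without a local build tag, e.g. "2.3.0+cu121" -> "2.3.0".
        if found is None or found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINNED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

An empty result means every pinned package is present at the expected version.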
Deployment Steps:
- Request access to FitDiT model weights:
- Visit https://huggingface.co/BoyuanJiang/FitDiT
- Download the model to the local directory after gaining access
- Run the local Gradio service:
Four run modes are provided; choose one according to your hardware configuration:
    # Fastest mode (requires the most GPU memory):
    python gradio_sd3.py --model_path local_model_dir
    # FP16 precision mode:
    python gradio_sd3.py --model_path local_model_dir --fp16
    # CPU offload mode (medium speed, moderate GPU memory):
    python gradio_sd3.py --model_path local_model_dir --fp16 --offload
    # Aggressive CPU offload mode (slowest, least GPU memory):
    python gradio_sd3.py --model_path local_model_dir --fp16 --aggressive_offload
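When the demo is started from a script, the four run modes can be kept in a small lookup table. This is a minimal sketch: the flags and the `gradio_sd3.py` script name come from the commands above, while the mode labels (`fastest`, `fp16`, `offload`, `aggressive_offload`) and the helper `build_command` are hypothetical names chosen for the example.

```python
import shlex

# Flags per mode, taken from the four commands listed above.
# The dictionary keys are illustrative labels, not names used by FitDiT.
MODE_FLAGS = {
    "fastest": [],
    "fp16": ["--fp16"],
    "offload": ["--fp16", "--offload"],
    "aggressive_offload": ["--fp16", "--aggressive_offload"],
}


def build_command(mode, model_path="local_model_dir"):
    """Return the argument list that launches the Gradio demo in the given mode."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode {mode!r}; choose from {sorted(MODE_FLAGS)}")
    return ["python", "gradio_sd3.py", "--model_path", model_path, *MODE_FLAGS[mode]]


# Print the shell form of one command; the list could also be passed to subprocess.run.
print(shlex.join(build_command("offload")))
```

Which mode fits a given GPU depends on its memory; the source gives no thresholds, so the choice is left to the caller.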
3. Notes for developers
- The project is released under the CC BY-NC-SA 4.0 license
- For non-commercial use only
- For commercial licensing, contact byronjiang@tencent.com
- The complete model training code and dataset are open source
- Supports the use of pre-trained models via Hugging Face
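Once access to the weights has been granted, one way to fetch them is the `huggingface_hub` library's `snapshot_download` function. The sketch below assumes that `huggingface_hub` is installed and that you are logged in (or pass a token); the repository ID is read off the model page URL, `local_model_dir` is the same placeholder directory used in the run commands, and `download_weights` is a hypothetical helper name.

```python
REPO_ID = "BoyuanJiang/FitDiT"  # taken from the model page URL
LOCAL_DIR = "local_model_dir"   # placeholder directory, as in the run commands


def download_weights(repo_id=REPO_ID, local_dir=LOCAL_DIR, token=None):
    """Download the model repository and return the local path.

    token=None lets huggingface_hub fall back to a saved login or the
    HF_TOKEN environment variable. Access to the repository must already
    have been granted on the model page.
    """
    # Imported here so the module can be loaded even if the library is absent.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir, token=token)


# Example (requires network access and granted permissions):
# path = download_weights()
# print("Weights stored in", path)
```

The returned directory can then be passed to `gradio_sd3.py` through `--model_path`.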