General Introduction
EasyControl is an open-source project built on the Diffusion Transformer (DiT) architecture that provides efficient and flexible control over image generation. One of its featured functions is the Ghibli Control LoRA, which transforms real portraits into Ghibli animation style while preserving facial features. It was trained on only 100 Asian faces paired with GPT-4o-generated Ghibli-style images. EasyControl supports a variety of conditional inputs, including edges, depth, and poses, and the Ghibli model is the highlight of its stylized generation. The project is released under the Apache 2.0 license and is intended for research purposes only. As of April 3, 2025, the latest updates include the Ghibli style model and an online demo.
Try it for free: https://huggingface.co/spaces/jamesliu1217/EasyControl_Ghibli
Function List
- Convert portraits to Ghibli style: Input a real face image to generate a Ghibli animation style image.
- Preserve facial features: Training on 100 Asian faces helps keep facial details from being distorted after conversion.
- Supports a variety of conditional controls: including edge (Canny), depth, pose, and more.
- Flexible resolution output: Supports image generation with different heights and widths.
- Efficient generation: Combines a causal attention mechanism with a KV cache to speed up inference.
- Plug-and-play modules: Ghibli LoRA integrates seamlessly with DiT models such as FLUX.1-dev.
Usage Guide
EasyControl is suitable for users with a technical background, especially researchers and creative workers. The following is a detailed guide to installing and using the Ghibli features.
Installation process
- Preparing the environment
Requires Python 3.10 and PyTorch with CUDA support. Create a Conda environment:
conda create -n easycontrol python=3.10
conda activate easycontrol
- Clone the repository
Download the EasyControl project:
git clone https://github.com/Xiaojiu-z/EasyControl.git
cd EasyControl
- Install dependencies
Install the required libraries:
pip install -r requirements.txt
GPU users need to make sure PyTorch supports CUDA.
- Download the Ghibli model
Get the Ghibli LoRA from Hugging Face:
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="Xiaojiu-Z/EasyControl", filename="models/Ghibli.safetensors", local_dir="./")
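If Hugging Face is not accessible, a mirror site is available. Note that the command below downloads the whole repository into checkpoints, so the LoRA file ends up at checkpoints/models/Ghibli.safetensors and the path used later must be adjusted accordingly: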
export HF_ENDPOINT=https://hf-mirror.com
huggingface-cli download --resume-download Xiaojiu-Z/EasyControl --local-dir checkpoints
- Verify Installation
Run the test script:
python demo.py
If an image is generated, the installation was successful.
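If demo.py fails, it can help to confirm the prerequisites from the steps above first. The following is a minimal sketch and not part of the EasyControl repository; the function name check_environment is hypothetical. It only reports whether the interpreter is Python 3.10 and whether the installed PyTorch can see a CUDA device.

```python
import sys


def check_environment():
    """Report the Python version and whether PyTorch can use CUDA."""
    report = {
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "python_is_3_10": sys.version_info[:2] == (3, 10),
        "torch_installed": False,
        "cuda_available": False,
    }
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return report
    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If cuda_available is False on a machine with an NVIDIA GPU, the installed PyTorch build is probably CPU-only and should be reinstalled with CUDA support.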
Main Functions
1. Generating Ghibli-style images
- Steps
Initialize the model and load the Ghibli LoRA:
import os
import torch
from PIL import Image
from src.pipeline import FluxPipeline
from src.lora_helper import set_single_lora

device = "cuda"
base_path = "FLUX.1-dev"  # path to the base model

# Load the base DiT model and attach the Ghibli LoRA
pipe = FluxPipeline.from_pretrained(base_path, torch_dtype=torch.bfloat16).to(device)
set_single_lora(pipe.transformer, "models/Ghibli.safetensors", lora_weights=[1], cond_size=512)

# The prompt must contain the trigger phrase
prompt = "Ghibli Studio style, Charming hand-drawn anime-style illustration"
subject_image = Image.open("test_imgs/portrait.png").convert("RGB")

image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=25,
    subject_images=[subject_image],
    cond_size=512,
    generator=torch.Generator("cpu").manual_seed(1),  # fixed seed for reproducibility
).images[0]

os.makedirs("output", exist_ok=True)  # make sure the output folder exists
image.save("output/ghibli_result.png")
- Result
The Ghibli-style image is saved to output/ghibli_result.png.
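Because the pipeline accepts different heights and widths, the output size can follow the aspect ratio of the input portrait instead of being fixed at 1024x1024. The sketch below is illustrative only: fit_size is a hypothetical helper, and snapping both sides to a multiple of 16 is a conservative assumption (latent-diffusion pipelines commonly require it) that the source does not confirm for EasyControl.

```python
def fit_size(width, height, long_side=1024, multiple=16):
    """Scale (width, height) so the longer side is about long_side,
    snapping both sides to the nearest multiple of `multiple`."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = long_side / max(width, height)

    def snap(value):
        # Round to the nearest multiple, but never below one multiple
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


print(fit_size(800, 600))  # (1024, 768)
print(fit_size(512, 512))  # (1024, 1024)
```

With a PIL image, whose size attribute is (width, height), this could be used as width, height = fit_size(*subject_image.size), with the results then passed as the width and height arguments of pipe(...).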
2. Using the online demo
- Steps
Visit the Hugging Face Space at https://huggingface.co/spaces/jamesliu1217/EasyControl_Ghibli, then:
- Upload a portrait image.
- Enter the prompt:
Ghibli Studio style, Charming hand-drawn anime-style illustration
- Set the height and width (limited by hardware; the default is 256x256, and high resolution requires running locally).
- Click "Generate Image" and wait 20-40 seconds.
- Result
The demo generates a low-resolution Ghibli-style image.
Featured Operations
High Resolution Generation
- Steps
When running locally, modify the height and width parameters:
image = pipe(prompt, height=1024, width=1024, ...)
- Note
Requires at least 12GB of GPU memory; otherwise generation may fail.
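The memory requirement can be checked before attempting a 1024x1024 run. This is a rough sketch under stated assumptions: the helper names are hypothetical, and "12GB" is interpreted as 12 x 10^9 bytes, since cards marketed as 12GB usually report slightly less than 12 GiB as usable. Only standard PyTorch calls are used to read the total memory.

```python
def has_enough_vram(total_bytes, required_gb=12):
    """Return True if total_bytes meets the requirement (decimal GB)."""
    return total_bytes >= required_gb * 10**9


def gpu_total_memory(device_index=0):
    """Return total memory of a CUDA device in bytes, or None if unavailable."""
    try:
        import torch  # lazy import so the helper loads without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device_index).total_memory


if __name__ == "__main__":
    total = gpu_total_memory()
    if total is None:
        print("No CUDA device detected")
    elif has_enough_vram(total):
        print("GPU memory looks sufficient for 1024x1024")
    else:
        print("Less than 12GB of GPU memory; high-resolution runs may fail")
```

The check looks at total rather than free memory, so other processes using the GPU can still cause an out-of-memory failure.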
Clearing the cache
- Steps
Clear the cache after each generation:
def clear_cache(transformer):
    for name, attn_processor in transformer.attn_processors.items():
        attn_processor.bank_kv.clear()
clear_cache(pipe.transformer)
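Clearing the cache matters most when several portraits are processed in one session. The following sketch shows one way to do that; stylize_folder and output_path_for are hypothetical helpers, the folder layout is an assumption, and pipe is expected to be the pipeline object created in the generation script earlier in this guide. The call arguments are copied from that script.

```python
from pathlib import Path

PROMPT = "Ghibli Studio style, Charming hand-drawn anime-style illustration"


def clear_cache(transformer):
    """Empty the KV bank of every attention processor (mirrors the step above)."""
    for attn_processor in transformer.attn_processors.values():
        attn_processor.bank_kv.clear()


def output_path_for(input_path, out_dir):
    """Map e.g. test_imgs/a.png to <out_dir>/a_ghibli.png."""
    return Path(out_dir) / (Path(input_path).stem + "_ghibli.png")


def stylize_folder(pipe, in_dir, out_dir, size=1024, seed=1):
    """Stylize every PNG in in_dir, clearing the cache after each image."""
    import torch  # heavy imports are kept local to this function
    from PIL import Image

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    saved = []
    for path in sorted(Path(in_dir).glob("*.png")):
        subject_image = Image.open(path).convert("RGB")
        try:
            image = pipe(
                PROMPT,
                height=size,
                width=size,
                guidance_scale=3.5,
                num_inference_steps=25,
                subject_images=[subject_image],
                cond_size=512,
                generator=torch.Generator("cpu").manual_seed(seed),
            ).images[0]
        finally:
            clear_cache(pipe.transformer)  # runs even if generation fails
        target = output_path_for(path, out_dir)
        image.save(target)
        saved.append(target)
    return saved
```

A call such as stylize_folder(pipe, "test_imgs", "output") would then write one stylized image per input file. Placing clear_cache in a finally block is a design choice so that a failed generation does not leave stale entries for the next image.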
Tips for use
- The prompt must contain
Ghibli Studio style, Charming hand-drawn anime-style illustration
to trigger the style.
- The input image should be a clear portrait with a resolution of 512x512 or higher.
- The online demo is limited by hardware and only supports low resolution (256x256).
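The first two tips can be enforced with a small pre-flight check before calling the pipeline. This is only a sketch: ensure_trigger and is_large_enough are hypothetical names, and whether extra text appended after the trigger phrase affects the result is not stated in the source.

```python
TRIGGER = "Ghibli Studio style, Charming hand-drawn anime-style illustration"


def ensure_trigger(prompt):
    """Return prompt with the trigger phrase prepended if it is missing."""
    prompt = prompt.strip()
    if TRIGGER in prompt:  # exact, case-sensitive match
        return prompt
    return f"{TRIGGER}, {prompt}" if prompt else TRIGGER


def is_large_enough(width, height, minimum=512):
    """True if both sides reach the recommended minimum resolution."""
    return width >= minimum and height >= minimum


print(ensure_trigger("a girl in a garden"))
print(is_large_enough(640, 480))  # False, because the height is below 512
```

The match is deliberately exact: a prompt with a differently cased or misspelled trigger gets the canonical phrase prepended rather than being trusted as-is.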
Application Scenarios
- Animation character design
Convert real portraits to Ghibli style to quickly generate animated character prototypes.
- Art creation
Artists can create hand-drawn style illustrations with the Ghibli model to improve efficiency.
- Educational research
Researchers can explore the application of conditional control in stylized generation.
QA
- Why is the resolution generated online low?
The online demo is limited by hardware to 256x256; run the project locally to generate 1024x1024 images.
- What if the generated image doesn't look like Ghibli style?
Make sure the prompt contains the trigger phrase, and check that the input image is clear.
- Does it support non-portrait input?
Yes, but the Ghibli model is optimized for faces and may not work well with other inputs.