General Introduction
AI Toolkit by Ostris is an open-source AI toolset focused on training and image generation with Stable Diffusion and FLUX.1 models. Created and maintained by the developer Ostris and hosted on GitHub, the toolkit aims to provide a flexible platform for researchers and developers to fine-tune and experiment with models. It contains a variety of AI scripts supporting functions such as LoRA extraction, batch image generation, and layer-specific training. The project is still under active development and some features may be unstable, but its high degree of customizability makes it well suited to advanced users in the deep learning field. The toolset supports Linux and Windows, and an Nvidia GPU with at least 24GB of VRAM is required for FLUX.1 model training.
Function List
- Model training: Supports fine-tuning of Stable Diffusion and FLUX.1 models, including LoRA and LoKr training.
- Image generation: Generates images in batches from configuration files or text prompts.
- LoRA extraction and optimization: Provides LoRA and LoCON extraction tools to optimize model feature extraction.
- Layer-specific training: Specific neural network layers can be selected for training and weights can be flexibly adjusted.
- User Interface Support: Provides AI Toolkit UI and Gradio UI to simplify task management and model training operations.
- Dataset processing: Automatically resizes images and groups them into resolution buckets, supporting a wide range of image formats.
- Cloud Training: Support for running training tasks on RunPod and Modal platforms.
Using Help
Installation process
Linux System Installation
- Clone the repository: Run the following commands in the terminal to download the code:
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
- Updating submodules: Ensure that all dependent libraries are complete:
git submodule update --init --recursive
- Creating a Virtual Environment: Use Python 3.10 or later:
python3 -m venv venv
source venv/bin/activate
- Installation of dependencies: Install PyTorch first, then the other dependencies:
pip3 install torch
pip3 install -r requirements.txt
Windows System Installation
- Clone the repository: Run at the command prompt:
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
- Updating submodules:
git submodule update --init --recursive
- Creating a Virtual Environment:
python -m venv venv
.\venv\Scripts\activate
- Installation of dependencies: Install the version of PyTorch that supports CUDA 12.4, then install the other dependencies:
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
UI Interface Installation
- Installing Node.js: Ensure that Node.js 18 or later is installed on your system.
- Building the UI: Enter the ui directory and install the dependencies:
cd ui
npm install
npm run build
npm run update_db
- Running the UI: Start the interface:
npm run start
- Accessing the UI: Open the following address in your browser:
http://localhost:8675
Main function operation flow
FLUX.1 model training
- Preparing the environment: Ensure the GPU has at least 24GB of memory. If the GPU also drives your display output, set low_vram: true in the configuration file so the model is quantized on the CPU.
- Configuring FLUX.1-dev:
- Log in to Hugging Face, visit black-forest-labs/FLUX.1-dev, and accept the license.
- In the project root directory, create a .env file and add HF_TOKEN=your read token.
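A minimal .env file might look like the following; the token value is a placeholder for your own Hugging Face read token, not a real credential:

```
HF_TOKEN=hf_your_read_token_here
```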
- Configuring FLUX.1-schnell:
- Edit the configuration file (e.g. train_lora_flux_schnell_24gb.yaml) and add:
model:
  name_or_path: "black-forest-labs/FLUX.1-schnell"
  assistant_lora_path: "ostris/FLUX.1-schnell-training-adapter"
  is_flux: true
  quantize: true
sample:
  guidance_scale: 1
  sample_steps: 4
- Preparing the dataset: Create a dataset folder in the root directory containing .jpg, .jpeg, or .png images together with matching .txt caption files.
- Editing the configuration file: Copy config/examples/train_lora_flux_24gb.yaml into the config directory, rename it to my_config.yml, and set folder_path to your dataset path.
- Running training: Execute:
python run.py config/my_config.yml
Training results are saved in the specified output folder; training can be paused with Ctrl+C and resumed from the most recent checkpoint.
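Setting folder_path is the key edit described above. A minimal sketch of the relevant dataset entry in my_config.yml, assuming the dataset lives at /path/to/ai-toolkit/dataset (surrounding keys and nesting follow the example configs shipped with the repository and are omitted here):

```yaml
datasets:
  - folder_path: "/path/to/ai-toolkit/dataset"
    caption_ext: "txt"
```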
Training with Gradio UI
- Logging in to Hugging Face: Run
huggingface-cli login
and enter a token with write permission.
- Launching the UI: Execute:
python flux_train_ui.py
- Operating the UI: Upload images, fill in descriptions, set parameters in the interface, and click Train; when finished you can publish the LoRA model.
Training in the Cloud (RunPod)
- Creating a RunPod instance: Use the template
runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04
and choose an A40 (48GB VRAM).
- Installing the toolkit: Connect via Jupyter Notebook and run the Linux installation commands in the terminal.
- Uploading a dataset: Create a dataset folder in the root directory and drag in the images and caption files.
- Configuring and running: Set folder_path in the configuration file, then execute:
python run.py config/my_config.yml
Data set preparation
- Format requirements: Supports .jpg, .jpeg, and .png images; captions are .txt files whose names match the images (e.g. image1.jpg corresponds to image1.txt).
- Captions: Write a description in each .txt file; the [trigger] placeholder is replaced by the trigger_word defined in the configuration file.
- Automatic adjustment: The tool automatically downscales and buckets images according to the configured resolution; upscaling is not supported.
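The image-to-caption pairing and the [trigger] replacement described above can be sketched with a small validation script. This is an illustrative sketch, not the toolkit's own loader; the function name load_captions is an assumption:

```python
import os

def load_captions(folder, trigger_word=None):
    """Pair each image with its .txt caption and expand the [trigger] placeholder."""
    image_exts = {".jpg", ".jpeg", ".png"}
    captions = {}
    for name in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in image_exts:
            continue
        caption_path = os.path.join(folder, stem + ".txt")
        if not os.path.exists(caption_path):
            # Matching file names are required, e.g. image1.jpg -> image1.txt
            raise FileNotFoundError(f"missing caption file for {name}")
        with open(caption_path, encoding="utf-8") as f:
            text = f.read().strip()
        if trigger_word is not None:
            text = text.replace("[trigger]", trigger_word)
        captions[name] = text
    return captions
```

Running this over the dataset folder before training surfaces missing captions immediately instead of mid-run.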
Layer-specific training
- Editing the configuration file: In the network section, add:
network:
  type: "lora"
  linear: 128
  linear_alpha: 128
  network_kwargs:
    only_if_contains:
      - "transformer.single_transformer_blocks.7.proj_out"
      - "transformer.single_transformer_blocks.20.proj_out"
- Running training: Start with the modified configuration file; only the specified layers are trained.
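The effect of only_if_contains can be illustrated with a small name filter. This sketch mimics the matching rule (substring containment on parameter names) rather than reproducing the toolkit's actual implementation:

```python
def select_trainable(param_names, only_if_contains):
    """Keep only parameter names containing at least one of the patterns."""
    return [name for name in param_names
            if any(pattern in name for pattern in only_if_contains)]

layers = [
    "transformer.single_transformer_blocks.7.proj_out",
    "transformer.single_transformer_blocks.8.proj_out",
    "transformer.single_transformer_blocks.20.proj_out",
]
patterns = [
    "transformer.single_transformer_blocks.7.proj_out",
    "transformer.single_transformer_blocks.20.proj_out",
]
print(select_trainable(layers, patterns))  # blocks 7 and 20 only; block 8 is excluded
```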
Notes
- Interrupting training: Avoid pressing Ctrl+C while a checkpoint is being saved, as this can corrupt the file.
- UI security: The UI has so far only been tested on Linux and has limited security hardening; exposing it to the public internet is not recommended.
- Getting help: You can join Ostris' Discord community to ask questions; avoid sending direct private messages to the developers.