General Introduction
AI Toolkit by Ostris is an open-source AI toolset focused on training and image generation with Stable Diffusion and FLUX.1 models. Created and maintained by the developer Ostris and hosted on GitHub, the toolkit aims to provide a flexible platform for researchers and developers to fine-tune and experiment with models. It contains a variety of AI scripts supporting functions such as LoRA extraction, batch image generation, and layer-specific training. The project is still under active development and some features may be unstable, but its high degree of customizability makes it well suited to advanced users in the deep learning field. The toolset supports Linux and Windows, and an Nvidia GPU with at least 24GB of VRAM is required for FLUX.1 model training.
Function List
- Model training: Supports fine-tuning of Stable Diffusion and FLUX.1 models, including LoRA and LoKr training.
- Image generation: Generates images in batches from configuration files or text prompts.
- LoRA extraction and optimization: Provides LoRA and LoCON extraction tools to optimize model feature extraction.
- Layer-specific training: Specific neural network layers can be selected for training and weights can be flexibly adjusted.
- User Interface Support: Provides AI Toolkit UI and Gradio UI to simplify task management and model training operations.
- Dataset processing: Automatically resizes images and groups them into resolution buckets, supporting a wide range of image formats.
- Cloud Training: Support for running training tasks on RunPod and Modal platforms.
Using Help
Installation process
Linux System Installation
- Clone the repository: Run the following commands in the terminal to download the code:
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
- Updating submodules: Ensure that all dependent libraries are complete:
git submodule update --init --recursive
- Creating a Virtual Environment: Use Python 3.10 or later:
python3 -m venv venv
source venv/bin/activate
- Installation of dependencies: Install PyTorch first, then the other dependencies:
pip3 install torch
pip3 install -r requirements.txt
Windows System Installation
- Clone the repository: Run at the command prompt:
git clone https://github.com/ostris/ai-toolkit.git
cd ai-toolkit
- Updating submodules:
git submodule update --init --recursive
- Creating a Virtual Environment:
python -m venv venv
.\venv\Scripts\activate
- Installation of dependencies: Install the version of PyTorch that supports CUDA 12.4, then install the other dependencies:
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
UI Interface Installation
- Installing Node.js: Ensure that Node.js 18 or later is installed on your system.
- Building the UI: Enter the ui directory and install the dependencies:
cd ui
npm install
npm run build
npm run update_db
- Running the UI: Start the interface:
npm run start
- Accessing the UI: Open the following address in your browser:
http://localhost:8675
Main function operation flow
FLUX.1 model training
- Preparing the environment: Ensure the GPU has at least 24GB of memory. If the GPU also drives your display output, set low_vram: true in the configuration file so the model is quantized on the CPU.
- Configuring FLUX.1-dev:
- Log in to Hugging Face, visit black-forest-labs/FLUX.1-dev, and accept the license.
- In the project root directory, create a .env file and add HF_TOKEN=your read token.
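A minimal .env file might look like the following; the token value is a placeholder for your own Hugging Face read token, not a real credential:

```
HF_TOKEN=hf_your_read_token_here
```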
- Configuring FLUX.1-schnell:
- Edit the configuration file (e.g. train_lora_flux_schnell_24gb.yaml) and add:
model:
  name_or_path: "black-forest-labs/FLUX.1-schnell"
  assistant_lora_path: "ostris/FLUX.1-schnell-training-adapter"
  is_flux: true
  quantize: true
sample:
  guidance_scale: 1
  sample_steps: 4
- Preparing the dataset: Create a dataset folder in the root directory containing .jpg, .jpeg, or .png images together with matching .txt caption files.
- Editing the configuration file: Copy config/examples/train_lora_flux_24gb.yaml into the config directory, rename it to my_config.yml, and set folder_path to your dataset path.
- Running training: Execute:
python run.py config/my_config.yml
Training results are saved in the specified output folder; training can be paused with Ctrl+C and resumed from the most recent checkpoint.
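Setting folder_path is the key edit described above. A minimal sketch of the relevant dataset entry in my_config.yml, assuming the dataset lives at /path/to/ai-toolkit/dataset (surrounding keys and nesting follow the example configs shipped with the repository and are omitted here):

```yaml
datasets:
  - folder_path: "/path/to/ai-toolkit/dataset"
    caption_ext: "txt"
```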
Training with Gradio UI
- Logging in to Hugging Face: Run
huggingface-cli login
and enter a token with write permission.
- Launching the UI: Execute:
python flux_train_ui.py
- Operating the UI: Upload images, fill in descriptions, set parameters in the interface, and click Train; when finished you can publish the LoRA model.
Training in the Cloud (RunPod)
- Creating a RunPod instance: Use the template
runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04
and choose an A40 (48GB VRAM).
- Installing the toolkit: Connect via Jupyter Notebook and run the Linux installation commands in the terminal.
- Uploading a dataset: Create a dataset folder in the root directory and drag in the images and caption files.
- Configuring and running: Set folder_path in the configuration file, then execute:
python run.py config/my_config.yml
Data set preparation
- Format requirements: Supports .jpg, .jpeg, and .png images; captions are .txt files whose names match the images (e.g. image1.jpg corresponds to image1.txt).
- Captions: Write a description in each .txt file; the [trigger] placeholder is replaced by the trigger_word defined in the configuration file.
- Automatic adjustment: The tool automatically downscales and buckets images according to the configured resolution; upscaling is not supported.
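The image-to-caption pairing and the [trigger] replacement described above can be sketched with a small validation script. This is an illustrative sketch, not the toolkit's own loader; the function name load_captions is an assumption:

```python
import os

def load_captions(folder, trigger_word=None):
    """Pair each image with its .txt caption and expand the [trigger] placeholder."""
    image_exts = {".jpg", ".jpeg", ".png"}
    captions = {}
    for name in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in image_exts:
            continue
        caption_path = os.path.join(folder, stem + ".txt")
        if not os.path.exists(caption_path):
            # Matching file names are required, e.g. image1.jpg -> image1.txt
            raise FileNotFoundError(f"missing caption file for {name}")
        with open(caption_path, encoding="utf-8") as f:
            text = f.read().strip()
        if trigger_word is not None:
            text = text.replace("[trigger]", trigger_word)
        captions[name] = text
    return captions
```

Running this over the dataset folder before training surfaces missing captions immediately instead of mid-run.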
Layer-specific training
- Editing the configuration file: In the network section, add:
network:
  type: "lora"
  linear: 128
  linear_alpha: 128
  network_kwargs:
    only_if_contains:
      - "transformer.single_transformer_blocks.7.proj_out"
      - "transformer.single_transformer_blocks.20.proj_out"
- Running training: Start with the modified configuration file; only the specified layers are trained.
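The effect of only_if_contains can be illustrated with a small name filter. This sketch mimics the matching rule (substring containment on parameter names) rather than reproducing the toolkit's actual implementation:

```python
def select_trainable(param_names, only_if_contains):
    """Keep only parameter names containing at least one of the patterns."""
    return [name for name in param_names
            if any(pattern in name for pattern in only_if_contains)]

layers = [
    "transformer.single_transformer_blocks.7.proj_out",
    "transformer.single_transformer_blocks.8.proj_out",
    "transformer.single_transformer_blocks.20.proj_out",
]
patterns = [
    "transformer.single_transformer_blocks.7.proj_out",
    "transformer.single_transformer_blocks.20.proj_out",
]
print(select_trainable(layers, patterns))  # blocks 7 and 20 only; block 8 is excluded
```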
Notes
- Interrupting training: Avoid pressing Ctrl+C while a checkpoint is being saved, as this can corrupt the file.
- UI security: The UI has so far only been tested on Linux and has limited security hardening; exposing it to the public internet is not recommended.
- Getting help: You can join Ostris' Discord community to ask questions; avoid sending direct private messages to the developers.