Story-Adapter: generating continuous and consistent graphic illustrations based on a long story



General Introduction

Story-Adapter is a story visualization framework that converts a text story into a coherent sequence of images. It is training-free: rather than fine-tuning a model, it refines the illustrations iteratively. The framework is designed to handle long stories, keep the images semantically consistent with one another, and render fine-grained interaction details. Story-Adapter builds on diffusion models and uses a Global Reference Cross Attention (GRCA) mechanism to keep the generated images coherent and high in quality. The project is open source under the MIT license and is aimed at researchers and developers who need a story visualization tool.
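
To make the role of the reference images more concrete, the following is a minimal, illustrative sketch of cross-attention over a pooled set of reference embeddings. It is not the project's implementation: the function names, the toy vectors, and the single-head attention without learned projections are all assumptions chosen for brevity. It only shows the general idea that a query can attend over tokens gathered from every reference image at once instead of from a single neighboring image.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention on plain Python lists."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output vector per query.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


def global_reference_attention(queries, reference_embeddings):
    """Hypothetical helper: pool tokens from ALL reference images, then attend."""
    pooled = [tok for image_tokens in reference_embeddings for tok in image_tokens]
    return cross_attention(queries, pooled, pooled)


# Toy data (assumed): two reference images with 2 and 1 tokens of dimension 2.
refs = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
]
queries = [[1.0, 0.0]]
attended = global_reference_attention(queries, refs)
```

Each output vector is a weighted average of the pooled reference tokens, so every reference image can influence every query.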

Feature List

Visualizes long stories
Uses an iterative, training-free framework
Implements the Global Reference Cross Attention (GRCA) mechanism
Maintains semantic consistency across the image sequence
Generates high-quality, detailed interactions
Accepts custom story input
Integrates pre-trained models
Supports batch image generation
Offers real-time preview of visualization results
Supports GPU-accelerated processing

Usage Guide

Environment Configuration

System Requirements:
- Python 3.10.14
- PyTorch 2.2.2
- CUDA 12.1
- cuDNN 8.9.02

Installation Steps:

# Clone the repository
git clone https://github.com/jwmao1/story-adapter.git
cd story-adapter
# Create and activate the conda environment
conda create -n StoryAdapter python=3.10
conda activate StoryAdapter
# Install the dependencies
pip install -r requirements.txt
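
After installation, it can help to confirm that the environment matches the listed requirements. The script below is a small sketch rather than part of the project; the function name `describe_environment` is an assumption. It reports the Python version and, if PyTorch is installed, the PyTorch, CUDA, and cuDNN versions that PyTorch sees.

```python
import importlib.util
import platform


def describe_environment():
    """Collect the versions relevant to the system requirements."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "cuda_available": None,
        "cuda": None,
        "cudnn": None,
    }
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in this environment

    import torch

    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    # CUDA version PyTorch was built against; None for CPU-only builds.
    info["cuda"] = torch.version.cuda
    # cuDNN version as an integer, or None if cuDNN is unavailable.
    info["cudnn"] = torch.backends.cudnn.version()
    return info


if __name__ == "__main__":
    for name, value in describe_environment().items():
        print(f"{name}: {value}")
```

Compare the printed values with the versions under System Requirements before running the demo.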

Download the required model files:
- RealVisXL_V4.0: download from Hugging Face and place in the "./RealVisXL_V4.0" directory
- CLIP image encoder: download and place in the "./IP-Adapter/sdxl_models/image_encoder" directory
- IP-Adapter SDXL checkpoint: download and save as "./IP-Adapter/sdxl_models/ip-adapter_sdxl.bin"
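
Because a misplaced model file is an easy mistake to make, a short check like the one below can confirm that the three locations listed above exist. This is a sketch, not a project utility: the helper name `find_missing_models` and the labels in `EXPECTED_PATHS` are assumptions, and the paths are taken relative to the repository root.

```python
from pathlib import Path

# Locations as listed above, relative to the repository root.
EXPECTED_PATHS = {
    "RealVisXL_V4.0": "RealVisXL_V4.0",
    "CLIP image encoder": "IP-Adapter/sdxl_models/image_encoder",
    "IP-Adapter SDXL checkpoint": "IP-Adapter/sdxl_models/ip-adapter_sdxl.bin",
}


def find_missing_models(repo_root="."):
    """Return the labels of expected model paths that do not exist."""
    root = Path(repo_root)
    return [
        label
        for label, relative in EXPECTED_PATHS.items()
        if not (root / relative).exists()
    ]


if __name__ == "__main__":
    missing = find_missing_models()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All model paths found.")
```

The check only tests that the paths exist; it does not verify that the downloads are complete or uncorrupted.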

Usage

Basic demo run:

python run.py --base_model_path your_path/RealVisXL_V4.0 --image_encoder_path your_path/IP-Adapter/sdxl_models/image_encoder --ip_ckpt your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin

Customized story generation:

python run.py --base_model_path your_path/RealVisXL_V4.0 --image_encoder_path your_path/IP-Adapter/sdxl_models/image_encoder --ip_ckpt your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin --story [your story text]
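
The long command lines above are easy to mistype, so one option is to assemble them in a small Python wrapper. The sketch below uses only the flags shown above; the helper names `build_command` and `run_story_adapter` are assumptions. Passing the story as a single argument is also an assumption, so check the argument parser in run.py for the exact format that `--story` expects.

```python
import subprocess
import sys


def build_command(base_model_path, image_encoder_path, ip_ckpt, story=None):
    """Build the run.py argument list from the flags shown above."""
    command = [
        sys.executable, "run.py",
        "--base_model_path", base_model_path,
        "--image_encoder_path", image_encoder_path,
        "--ip_ckpt", ip_ckpt,
    ]
    if story is not None:
        # Assumption: the story text is passed as one argument.
        command += ["--story", story]
    return command


def run_story_adapter(command):
    """Run the command from the repository root; raises if run.py fails."""
    return subprocess.run(command, check=True)
```

For example, `run_story_adapter(build_command("your_path/RealVisXL_V4.0", "your_path/IP-Adapter/sdxl_models/image_encoder", "your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin"))` launches the basic demo. Using an argument list rather than a shell string avoids quoting problems when the story text contains spaces or punctuation.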

Notes

Make sure all dependencies and required model files are installed.
Check that the GPU has enough memory; a high-performance GPU is recommended.
The first run downloads and loads the models, which may take a long time.
Image quality depends on the quality of the input story and on how detailed its descriptions are.
For best results, process long stories in batches.
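
As one way to follow the batching recommendation, a long story that has already been divided into scene descriptions can be grouped into fixed-size batches. The helper below is a generic sketch, not part of Story-Adapter; the name `split_into_batches` and the example batch size are assumptions, and how each batch is then passed to run.py is left open.

```python
def split_into_batches(scenes, batch_size):
    """Group a list of scene descriptions into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        scenes[start:start + batch_size]
        for start in range(0, len(scenes), batch_size)
    ]


# Hypothetical example: five scenes in batches of two.
scenes = ["scene 1", "scene 2", "scene 3", "scene 4", "scene 5"]
batches = split_into_batches(scenes, 2)
```

The last batch may be shorter than the others when the number of scenes is not a multiple of the batch size.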

Troubleshooting

If you encounter CUDA-related errors, check that the installed CUDA version matches the one listed in the requirements (CUDA 12.1).
If GPU memory runs low, reduce the batch size.
If model loading fails, check that the file paths are correct.
If the generated images are unsatisfactory, adjust the level of detail in the story descriptions.
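
When diagnosing memory problems, it can be useful to see how much GPU memory is actually free before starting a run. The snippet below is a sketch that assumes PyTorch is installed and uses `torch.cuda.mem_get_info`; the helper name `gpu_memory_gib` is an assumption, and it returns `None` when PyTorch or a CUDA device is unavailable.

```python
import importlib.util


def gpu_memory_gib():
    """Return (free, total) GPU memory in GiB, or None if unavailable."""
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed

    import torch

    if not torch.cuda.is_available():
        return None  # no usable CUDA device
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    return free_bytes / 2**30, total_bytes / 2**30


if __name__ == "__main__":
    memory = gpu_memory_gib()
    if memory is None:
        print("No CUDA device detected.")
    else:
        print(f"Free: {memory[0]:.1f} GiB of {memory[1]:.1f} GiB")
```

If little memory is free, closing other GPU processes or reducing the batch size are the first things to try.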