General Introduction
Story-Adapter is a story visualization framework that converts a textual story into a coherent sequence of images. Developed by researchers, it takes an iterative, training-free approach to generating story illustrations. It is designed to handle long stories, keep the images semantically consistent with one another, and render fine-grained interaction details.
Story-Adapter builds on diffusion models and uses a Global Reference Cross Attention (GRCA) mechanism to keep the generated images coherent and high in quality. The project is open source under the MIT license and is intended as a story visualization tool for researchers and developers.
Function List
- Visualizes long stories
- Provides an iterative framework that requires no training
- Implements the Global Reference Cross Attention (GRCA) mechanism
- Maintains semantic consistency across the image sequence
- Generates fine-grained interaction details
- Accepts custom story input
- Integrates pre-trained models
- Supports batch image generation
- Previews visualization results in real time
- Supports GPU-accelerated processing
Usage Guide
Environment Configuration
- System requirements:
  - Python 3.10.14
  - PyTorch 2.2.2
  - CUDA 12.1
  - cuDNN 8.9.02
- Installation Steps:
# Clone the repository
git clone https://github.com/jwmao1/story-adapter.git
cd story-adapter
# Create and activate the conda environment
conda create -n StoryAdapter python=3.10
conda activate StoryAdapter
# Install dependencies
pip install -r requirements.txt
- Download the necessary model files:
  - RealVisXL_V4.0: download from Hugging Face and place in the "./RealVisXL_V4.0" directory
  - CLIP Image Encoder: download and place in the "./IP-Adapter/sdxl_models/image_encoder" directory
  - IP-adapter_sdxl: download and save as "./IP-Adapter/sdxl_models/ip-adapter_sdxl.bin"
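Before the first run, it can help to confirm that the three downloads ended up where the commands expect them. The following is a minimal sketch, not part of the project: the function name `missing_model_files` and the `EXPECTED` table are hypothetical, and the paths are simply the ones listed above, resolved against the directory you pass in.

```python
from pathlib import Path

# Expected layout, copied from the download instructions above.
# Each entry maps a label to (path relative to the root, "dir" or "file").
EXPECTED = {
    "base model (RealVisXL_V4.0)": ("RealVisXL_V4.0", "dir"),
    "CLIP image encoder": ("IP-Adapter/sdxl_models/image_encoder", "dir"),
    "IP-Adapter checkpoint": ("IP-Adapter/sdxl_models/ip-adapter_sdxl.bin", "file"),
}


def missing_model_files(root="."):
    """Return descriptions of expected model files or directories that are absent."""
    root = Path(root)
    missing = []
    for label, (relative, kind) in EXPECTED.items():
        path = root / relative
        found = path.is_dir() if kind == "dir" else path.is_file()
        if not found:
            missing.append(f"{label}: {path}")
    return missing


if __name__ == "__main__":
    problems = missing_model_files(".")
    if problems:
        print("Missing model files:")
        for line in problems:
            print("  " + line)
    else:
        print("All expected model files found.")
```

Run it from the repository root. It only checks that the paths exist, not that the downloads are complete or uncorrupted.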
Usage
- Basic demo run:
python run.py --base_model_path your_path/RealVisXL_V4.0 --image_encoder_path your_path/IP-Adapter/sdxl_models/image_encoder --ip_ckpt your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin
- Custom story generation:
python run.py --base_model_path your_path/RealVisXL_V4.0 --image_encoder_path your_path/IP-Adapter/sdxl_models/image_encoder --ip_ckpt your_path/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin --story [text of your story]
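Because the three path arguments share a common prefix, the commands above are easy to mistype. The sketch below assembles the same argument list from a single model directory. The names `build_command`, `launch`, and `model_root` are hypothetical, and passing the story as one string is an assumption: check how `run.py` defines `--story` before relying on it.

```python
import subprocess
import sys


def build_command(model_root, story=None):
    """Assemble the run.py argument list shown above.

    model_root stands in for the 'your_path' placeholder (hypothetical name).
    story is passed through as a single --story value (an assumption).
    """
    cmd = [
        sys.executable, "run.py",
        "--base_model_path", f"{model_root}/RealVisXL_V4.0",
        "--image_encoder_path", f"{model_root}/IP-Adapter/sdxl_models/image_encoder",
        "--ip_ckpt", f"{model_root}/IP-Adapter/sdxl_models/ip-adapter_sdxl.bin",
    ]
    if story is not None:
        cmd += ["--story", story]
    return cmd


def launch(cmd):
    """Run the command from the current directory and raise if it fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command for inspection; call launch(cmd) from the
    # repository root once the paths look right.
    print(" ".join(build_command("your_path")))
```

Passing the arguments as a list avoids shell quoting problems when the story text contains spaces or punctuation.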
Caveats
- Make sure all dependencies and the required model files are installed before running.
- Check that the GPU has enough memory; a high-performance GPU is recommended.
- The first run has to download and load the models, which may take a long time.
- The quality of the generated images depends on the quality of the input story and on how detailed its descriptions are.
- For best results, process long stories in batches.
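One simple way to follow the batching advice is to split the story into fixed-size groups of scene descriptions and generate each group in turn. This is a minimal sketch under that assumption: `split_into_batches` is a hypothetical helper, not a project function, and a suitable batch size is not specified here.

```python
def split_into_batches(scenes, batch_size):
    """Split a list of scene descriptions into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [scenes[i:i + batch_size] for i in range(0, len(scenes), batch_size)]


if __name__ == "__main__":
    # Placeholder scene texts, for illustration only.
    story = [f"Scene {n}" for n in range(1, 8)]
    for index, batch in enumerate(split_into_batches(story, 3), start=1):
        print(index, batch)
```

Whether images stay consistent across separately generated batches is not stated here, so compare the results at batch boundaries.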
Troubleshooting
- If you encounter CUDA-related errors, check that the installed CUDA version matches the one listed under system requirements.
- If you run out of GPU memory, reduce the batch size.
- If model loading fails, check that the file paths are correct.
- If the generated images are unsatisfactory, adjust the level of detail in the story description.
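For the CUDA and memory checks above, a short diagnostic can report what the active environment actually provides. The sketch below uses standard PyTorch attributes. The function name `environment_report` is hypothetical, and the script makes no claim about how much GPU memory Story-Adapter needs.

```python
import platform


def environment_report():
    """Collect version and GPU information for comparison with the requirements."""
    report = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        report["torch"] = None  # PyTorch is not installed in this environment
        return report

    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # CUDA version PyTorch was built with
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["cudnn"] = torch.backends.cudnn.version()
        props = torch.cuda.get_device_properties(0)  # first visible GPU
        report["gpu_name"] = props.name
        report["gpu_memory_gib"] = round(props.total_memory / 1024**3, 1)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Compare the output with the versions listed under system requirements. If `cuda_available` is false, check the driver and the installed PyTorch build first.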