There was a time when creating a comic book was a tedious process requiring writers, illustrators, and countless hours of effort. Today, artificial intelligence serves as a powerful tool that empowers creative professionals. Imagine handing a short story to an AI and watching it help transform that story into a vivid, visually stunning comic book, all while retaining the creator's unique perspective. This is no longer just a fantasy; it's a reality made possible by cutting-edge generative AI models. In this blog, we'll explore how CrewAI LLM agents enhance the creative process of comic book creation, and delve into the architecture and implementation that make this magic possible.
Example: Creating a Storybook from Panchatantra
To demonstrate this process, let's use a short story from the Panchatantra, a collection of ancient Indian fables known for their wisdom and moral lessons. Consider the story of the Lion and the Hare:
Short story: "Once upon a time, there was a mighty lion named Basuraka who rode roughshod over the jungle. The animals, tired of his tyranny, decided to send him one prey animal every day. One day it was the turn of the clever rabbit, who devised a plan to get rid of the lion. He lured Basuraka to a deep well and convinced him that another lion lived there. Seeing his own reflection in the water, Basuraka roared in anger and jumped into the well, never to return."
Using the CrewAI framework, we will follow the steps below:
1. Scriptwriting Agent: The scriptwriter breaks the story down into scenes, for example:
   - Scene 1: The lion roams the jungle.
   - Scene 2: The animals decide to deliver one prey per day.
   - Scene 3: The rabbit plans to trick the lion.
   - Scene 4: The lion jumps into the well.
2. Visual Artist Agent: The visual artist generates an illustration for each scene, depicting key moments such as the lion roaring through the jungle, the rabbit guiding the lion to the well, and the final scene of the lion jumping into the water.
3. Synthesizer Agent: Finally, the synthesizer combines all of these scenes and images into a coherent storybook, ready to be viewed and shared.
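To make the scriptwriter's output concrete, here is a hypothetical sketch of what the scene breakdown for the Lion and the Hare might look like. The exact narrations are illustrative, but the field names mirror the pydantic schema used in the implementation later in this post:

```python
# Hypothetical scriptwriter output for the Lion and the Hare story,
# shaped like the StoryScenes/StoryScene schema used in the implementation.
story = {
    "story_name": "The Lion and the Hare",
    "summary": "A clever rabbit tricks the tyrant lion Basuraka into jumping into a well.",
    "scenes": [
        {"scene_number": 1, "narration": "The mighty lion Basuraka roams the jungle, terrorizing the animals."},
        {"scene_number": 2, "narration": "The animals agree to send the lion one prey animal every day."},
        {"scene_number": 3, "narration": "The clever rabbit devises a plan and leads Basuraka to a deep well."},
        {"scene_number": 4, "narration": "Mistaking his own reflection for a rival, Basuraka leaps into the well."},
    ],
}

# Each narration later becomes the context for one generated illustration.
for scene in story["scenes"]:
    print(f"Scene {scene['scene_number']}: {scene['narration']}")
```

Each entry in `scenes` corresponds to one image in the finished storybook.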
For more detailed information about the stories of the Panchatantra, you can refer to external resources such as the Panchatantra article on Wikipedia or a Panchatantra story collection.
Automated Comic Book Generation with LLM Agents
Generative AI agents can be thought of as a digital "team" working together to perform complex creative processes. By assigning specific tasks to individual AI agents, the process of creating an entire comic book becomes efficient and automated. In the following illustration, specialized agents work together:
1. Scriptwriter: Translates short stories into detailed scene breakdowns.
2. Visual Artist: Transforms each scene into a compelling piece of visual art.
3. Synthesizer: Merges all generated scenes and their corresponding images into a coherent, complete comic book, ensuring that the narrative flows smoothly and the final product is ready for release.
The synergy between these agents automates the comic book creation process, enabling an efficient and creative workflow. The key lies in the ability to utilize generative language models and image generation AI systems in a coordinated manner.
Architecture Overview
The architecture of this automation is simple but effective. The process starts with a short story and flows as follows:
1. Short story input: The process takes a short narrative as input, which serves as the basis for the comic book.
2. Scriptwriting Agent: The agent breaks the short story into discrete scenes, each capturing an important part of the storyline. In the illustration, this is shown as scenes labeled "Scene 1, Scene 2, Scene 3", and so on, until the entire story is broken down into smaller scenes.
3. Visual Artist Agent: The visual artist translates each scene description into a visual representation, effectively illustrating the comic. Images are created to represent scenes such as the lion in the jungle, the lion meeting the rabbit, and so on.
4. Synthesizer Agent: Finally, all scenes and their corresponding images are combined by the synthesizer agent into a complete picture book.
The entire process is designed to seamlessly transform narratives into engaging comic books with minimal human intervention.
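The flow above can be sketched as three stages with explicit handoffs. This is an illustrative outline only; the real implementation in the next section uses CrewAI agents, and the helper functions here are hypothetical stand-ins for the LLM and image-model calls:

```python
from typing import Dict, List

def write_script(short_story: str, number_of_scenes: int) -> List[str]:
    """Scriptwriter stage: break the story into scene descriptions.
    (Stand-in for the LLM call; here we just stub out fixed scenes.)"""
    return [f"Scene {i + 1} of: {short_story[:30]}..." for i in range(number_of_scenes)]

def illustrate(scene: str) -> str:
    """Visual artist stage: turn one scene description into an image.
    (Stand-in for the text-to-image call; returns a placeholder URL.)"""
    return f"https://example.com/images/{abs(hash(scene)) % 10_000}.png"

def synthesize(scenes: List[str], images: List[str]) -> List[Dict[str, str]]:
    """Synthesizer stage: pair each scene with its image into book pages."""
    return [{"text": s, "image": img} for s, img in zip(scenes, images)]

story = "Once upon a time, there was a mighty lion named Basuraka..."
scenes = write_script(story, number_of_scenes=4)
images = [illustrate(s) for s in scenes]
book = synthesize(scenes, images)
print(f"Comic book with {len(book)} pages")
```

Each stage consumes the previous stage's output, which is exactly the handoff pattern the CrewAI implementation automates.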
Implementation Using the CrewAI Framework
To bring this vision to life, we implemented this workflow with the CrewAI framework, with the agents working in harmony. Below are the detailed steps of the implementation, with code snippets to help you reproduce the process step by step.
Defining Agents and Tasks: We use the CrewAI framework to define two agents, the scriptwriter and the visual artist. Each agent has a specific role, with interrelated tasks forming an efficient workflow.
```yaml
# config/story/agents.yaml
scriptwriter:
  role: >
    Scripting scenes for children's short stories
  goal: >
    Write simple, clear and engaging scene scripts for children's picture books.
  backstory: >
    You are a scriptwriter who specializes in turning children's short stories
    into scripts for performance or animation.
  llm: llm_model

# config/story/tasks.yaml
scriptwriting:
  description: >
    You will be given a short children's story about learning important lessons
    about life. The story needs to be turned into a fun picture book for
    children to read and engage with. You are responsible for breaking the
    story down into {number_of_scenes} unique scenes, each focusing on a
    specific event or moment in the story. Each scene will be translated into
    an image. You must generate the following information, following the
    specified pydantic schema:
    - A suitable name for the story
    - A short summary of the story
    - A short backstory of the story, providing important context for the reader
    - A detailed narration of each scene in the story, at least one or two sentences
    - The ultimate lesson learned from the story

    <short_story>
    {story_text}
    </short_story>

    The output must strictly follow the pydantic schema. If it does not, there
    will be a penalty.
  agent: scriptwriter

# config/visual/agents.yaml
visualartist:
  role: >
    Visual illustration for storybooks
  goal: >
    Create engaging picture books.
  backstory: >
    Expert in creating picture storybooks.
  llm: llm_model

# config/visual/tasks.yaml
illustration:
  description: >
    You will be given a short children's story about learning important lessons
    about life. The story will be transformed into a fun picture book for
    children to read and engage with. The story has been broken down into
    unique scenes. A description of one scene and a short summary of the story
    are given below. Generate a prompt that can be used in a text-to-image
    model to produce an image of that scene. Send the prompt to the provided
    tool to generate images of characters and backgrounds that match the
    requirements of the scene. Characters should be cartoon style. The prompt
    should be less than 40 characters.

    <story_summary>
    {story_summary}
    </story_summary>
    <scene_description>
    {scene_description}
    </scene_description>

    The output must strictly follow the pydantic schema. There will be a
    penalty if it does not.
  agent: visualartist
```
Crew Configuration: Define structured pydantic schemas for agent responses, configure the LLM and image models (OpenAI GPT-4o and DALL·E 3), and bind the agents to their tasks.
```python
from typing import List

from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, llm, task
from crewai_tools import DallETool
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

dalle_tool = DallETool(model="dall-e-3",
                       size="1024x1024",
                       quality="standard",
                       n=1)

# Define a class for a single scene
class StoryScene(BaseModel):
    scene_number: int
    narration: str

# Define a class for the full list of story scenes
class StoryScenes(BaseModel):
    story_name: str
    summary: str
    background: str
    scenes: List[StoryScene]

# Define a class for a single scene image
class SceneImage(BaseModel):
    prompt: str = Field(description="A text-to-image model prompt that can be used to generate an image.",
                        max_length=50)
    image_url: str = Field(description="URL of the image generated by the tool.")

@CrewBase
class StoryCrew():
    """StoryCrew"""
    agents_config = 'config/story/agents.yaml'
    tasks_config = 'config/story/tasks.yaml'

    @llm
    def llm_model(self):
        return ChatOpenAI(temperature=0.0,  # set to 0 for deterministic output
                          model="gpt-4o-mini",
                          max_tokens=8000)

    @agent
    def scriptwriter(self) -> Agent:
        return Agent(
            config=self.agents_config['scriptwriter'],
            verbose=True
        )

    @task
    def scriptwriting(self) -> Task:
        return Task(
            config=self.tasks_config['scriptwriting'],
            output_pydantic=StoryScenes,
        )

    @crew
    def crew(self) -> Crew:
        """Create the story crew"""
        script_crew = Crew(
            agents=self.agents,  # created automatically by the @agent decorator
            tasks=self.tasks,    # created automatically by the @task decorator
            process=Process.sequential,
            verbose=True,
            # process=Process.hierarchical,  # to use this instead, see https://docs.crewai.com/how-to/Hierarchical/
        )
        return script_crew

@CrewBase
class ArtistCrew():
    """ArtistCrew"""
    agents_config = 'config/visual/agents.yaml'
    tasks_config = 'config/visual/tasks.yaml'

    @llm
    def llm_model(self):
        return ChatOpenAI(temperature=0.0,  # set to 0 for deterministic output
                          model="gpt-4o-2024-08-06",
                          max_tokens=8000)

    @agent
    def visualartist(self) -> Agent:
        return Agent(
            config=self.agents_config['visualartist'],
            tools=[dalle_tool],
            verbose=True
        )

    @task
    def illustration(self) -> Task:
        return Task(
            config=self.tasks_config['illustration'],
            output_pydantic=SceneImage,
            output_file='report.md'
        )

    @crew
    def crew(self) -> Crew:
        """Create the picture-book crew"""
        artist_crew = Crew(
            agents=self.agents,  # created automatically by the @agent decorator
            tasks=self.tasks,    # created automatically by the @task decorator
            process=Process.sequential,
            verbose=True,
        )
        return artist_crew
```
Main workflow: Ensure proper handoffs between the two crews. Once the scriptwriter completes the scene breakdown, each scene is automatically passed on to the visual artist, ensuring continuity of the workflow.
```python
import agentops

agentops.start_session(tags=['story', 'scripts'])

# Break the short story into scenes using StoryCrew
inputs = {
    'number_of_scenes': int(number_of_scenes),
    'story_text': story_text,
}
scenes_list = StoryCrew().crew().kickoff(inputs=inputs)
agentops.end_session("Success")

if scenes_list is not None:
    print(f"Raw result from script writing: {scenes_list.raw}")
    slist = scenes_list.pydantic
    story_summary = slist.summary
    for scene in slist.scenes:
        print(f"Scene: {scene.narration}")

    # One input per scene for the visual artist
    scene_input = [{'story_summary': story_summary,
                    'scene_description': scene.narration}
                   for scene in slist.scenes]

    agentops.start_session(tags=['scene', 'illustration'])
    # Run the visual artist once for each scene
    result_images = ArtistCrew().crew().kickoff_for_each(inputs=scene_input)
    print(f"result_images: {result_images}")
```
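The synthesizer step itself is lightweight once the scenes and images exist. A minimal sketch of it might stitch each scene's narration and generated image URL into a markdown storybook; the page structure, function name, and dict keys here are assumptions for illustration, not part of the CrewAI output:

```python
def synthesize_storybook(story_name: str, pages: list) -> str:
    """Combine scene narrations and image URLs into one markdown document.
    Each page dict is assumed to hold 'narration' and 'image_url' keys."""
    lines = [f"# {story_name}", ""]
    for i, page in enumerate(pages, start=1):
        lines.append(f"## Scene {i}")
        lines.append(f"![Scene {i}]({page['image_url']})")
        lines.append("")
        lines.append(page["narration"])
        lines.append("")
    return "\n".join(lines)

# Hypothetical pages assembled from the two crews' outputs
pages = [
    {"narration": "Basuraka roams the jungle.", "image_url": "https://example.com/1.png"},
    {"narration": "Basuraka leaps into the well.", "image_url": "https://example.com/2.png"},
]
markdown = synthesize_storybook("The Lion and the Hare", pages)
print(markdown.splitlines()[0])
```

The resulting markdown can be rendered directly or converted to PDF for sharing.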
Conclusion
The power of generative AI lies in its ability to augment and support the creative process, providing content creators with new tools to bring their ideas to life. CrewAI LLM agents help transform simple short stories into engaging comic picture books, assisting storytellers at every stage of the journey. By automating repetitive tasks such as script decomposition and visual generation, AI enables artists and writers to focus more on the core creative elements, preserving their unique artistic style. This implementation demonstrates how generative AI can enhance the creative industry, offering a vision of a future where creativity and technology work together seamlessly.