
tldraw computer: using multimodal models to orchestrate components in a flowchart whiteboard to enable content generation workflows

General Introduction

tldraw computer is an experimental project from tldraw that provides an infinite canvas for natural language computation. Users create and connect components, generate and transform data, and use multimodal language models as the runtime that executes their instructions. With simple operations, users can build complex workflows for a variety of data processing and generation tasks.

I have been thinking about one question: what form should a product take if consumer (C-end) users are to program agent workflows themselves? Coze and Dify in fact have a fairly high barrier to entry and mainly serve developers and professional content creators. tldraw offers a new direction here. The way linked canvas components handle complex context dependencies still involves some opaque logic, or outright defects, but it is good enough for consumer users.


Workflow orchestration tools with similar features (though each handles inputs and outputs differently):

Glif: code-free orchestration of AI workflows, output of templated images and HTML, free unlimited use of Flux 1.1pro

Takomo.ai: a code-free AI app building platform for multimodal workflows via canvas drag and drop

flowith: canvas-orchestrated AI chat tool | AI agent

Refly: an AI writing platform based on process orchestration on a free canvas for automated article generation

 


Feature List

  • Infinite Canvas: Provides an infinitely expandable canvas where users can freely add and connect components.
  • Component Creation: Users can create various functional components for data generation and transformation.
  • Workflow Management: Support for creating, editing and managing complex workflows, including branches and loops.
  • Multimodal Language Model: Execute natural language instructions using an advanced multimodal language model.
  • Sample Projects: Provides pre-built sample projects that users can quickly get started with and customize.

Usage Guide

Installation and Registration

  1. Visit https://computer.tldraw.com/.
  2. Click the "Get started" button to enter the registration page.
  3. Sign up for a new account with your Google account or email address, or sign in with an existing account.

Creating and using components

  1. After logging in, you arrive at the infinite canvas screen.
  2. Click the "Create component" button to select the component type and configure it.
  3. Drag and drop components onto the canvas and use connecting lines to link the components together to form a workflow.
  4. Click on the component and enter a natural language instruction to execute the instruction using a multimodal language model.

Managing workflows

  1. Create multiple components on the canvas and form a workflow with connecting lines.
  2. Use the context menu or toolbar to make edits to the workflow, including adding branches and loops.
  3. Save the workflow and the system will automatically generate a project that the user can edit and run at any time.

Sample Projects

  1. Click the "Examples" button on the home page and select a pre-built example project.
  2. Sample projects include a story generator, sequencer, combat simulator and more that users can run directly or customize.
  3. Edit the sample project and save it as your own for further modification and optimization.

 

Gemini powers tldraw's "natural language computing" experience


Unlocking Natural Language Interaction with the Gemini API

The Gemini API makes it easy for developers to integrate advanced AI capabilities into their applications, opening up new possibilities for user experience and functionality. This article highlights how tldraw used Gemini to build the "natural language computing" experience in its new project, computer, and shows how startups can combine the Gemini API with tldraw's canvas SDK to integrate powerful AI features quickly. The tldraw team is about to release computer with Gemini 1.5 Flash and is prototyping a future version with Gemini 2.0 Flash.

 

tldraw uses the Gemini API to bring the power of conversational AI to visual programming, allowing users to generate content and process information through natural language. This opens up exciting opportunities for more intuitive and efficient user experiences around AI, pushing the boundaries of visual communication.

 

The Vision Behind Computer

tldraw is dedicated to making diagram creation accessible and intuitive, with the vision of giving users a more natural way to interact with their canvas. Founder Steve Ruiz wanted to use tldraw's infinite canvas SDK to create a dynamic work environment that incorporates generative AI. This vision led to the development of computer, an experimental application that lets users build workflows from modules of text, images, and instructions. At runtime, information flows from one component to the next, with the output of each generation serving as input for the next, creating a flow that can branch, loop, and iterate to generate results.

Building with Gemini 2.0: An In-Depth Look at Computer

tldraw's computer is built on a network of interconnected "components" representing elements on the canvas (text boxes, images, audio clips, etc.). These components are connected by arrows that visualize the flow of data and transformations. Each component has an associated "process", i.e., a set of instructions to be executed based on inputs from connected components. A component can accept data from many other components and pass its output data to many other components - even itself! This component-based architecture, combined with the power and speed of Gemini 2.0 Flash, creates a fast and flexible system capable of handling a wide variety of tasks.
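The component-and-arrow model described above can be illustrated with a minimal sketch. It is not tldraw's implementation: the `Component` and `Canvas` classes, the `fake_model` stub (which stands in for a real model call), and the `max_steps` loop bound are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a model call: takes an instruction and the
# collected input texts, and returns generated text.
ModelFn = Callable[[str, List[str]], str]


def fake_model(instruction: str, inputs: List[str]) -> str:
    """Deterministic stub so the sketch runs without any API access."""
    return f"{instruction}({', '.join(inputs)})"


@dataclass
class Component:
    name: str
    instruction: str = ""  # the component's "process"; empty means plain data
    value: str = ""        # the component's current output


@dataclass
class Canvas:
    components: Dict[str, Component] = field(default_factory=dict)
    arrows: List[Tuple[str, str]] = field(default_factory=list)  # (source, target)

    def add(self, component: Component) -> None:
        self.components[component.name] = component

    def connect(self, source: str, target: str) -> None:
        self.arrows.append((source, target))

    def run(self, start: str, model: ModelFn = fake_model, max_steps: int = 10) -> None:
        """Propagate data along arrows, breadth-first, beginning at `start`.

        `max_steps` (an assumption of this sketch) caps the number of
        component executions so loops, including self-loops, terminate.
        """
        outgoing = defaultdict(list)
        incoming = defaultdict(list)
        for src, dst in self.arrows:
            outgoing[src].append(dst)
            incoming[dst].append(src)

        queue = deque(outgoing[start])
        steps = 0
        while queue and steps < max_steps:
            name = queue.popleft()
            comp = self.components[name]
            inputs = [self.components[s].value for s in incoming[name]]
            if comp.instruction:
                comp.value = model(comp.instruction, inputs)
            else:
                comp.value = " ".join(inputs)  # plain component just shows its input
            steps += 1
            queue.extend(outgoing[name])


canvas = Canvas()
canvas.add(Component("topic", value="New AI Smart Gloves for Cats"))
canvas.add(Component("jingle", instruction="Write a short jingle"))
canvas.add(Component("display"))
canvas.connect("topic", "jingle")
canvas.connect("jingle", "display")
canvas.run("topic")
print(canvas.components["display"].value)
# prints: Write a short jingle(New AI Smart Gloves for Cats)
```

Because a component may feed itself, a real system needs some stopping rule; the step cap here is only the simplest possible choice.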

 

tldraw's computer combines AI-driven visual programming, based on text generation (using Gemini 2.0), with an image generation model.

 

Here's how the Gemini 2.0 Flash prototype was designed to help performance:

  • Lightning-fast process execution: Gemini 2.0 Flash allows processes to execute quickly. For example, an "Instructions" component might contain "Write a short jingle". When triggered, the component generates a reusable step-by-step script that can turn any combination of inputs into a jingle. It then combines that script with its current input (e.g., "New AI Smart Gloves for Cats" from a "Text" component) to build the prompt for the final output, and passes the output to a linked "Text" component for display, or to other linked components such as Speech (for text-to-speech), Image (for visual generation), or further "Instructions" components for additional transformation.
  • Rich multimodal context: Getting the most out of tldraw's computer requires speed, capacity, and capability. Because multiple components can feed data into each generation, Gemini 2.0 Flash's large context window is essential for taking all inputs into account when generating output, and it supports combining images and documents with text prompts.
  • Structured data: The flow of data between components must follow a consistent pattern. The structured JSON output of Gemini 2.0 Flash ensures that every component in a workflow can recognize any type of data and produces its output in the same structure, preventing stalls, optimizing execution, and ensuring that even large workflows complete reliably.
  • Dynamic process generation: In addition to executing predefined processes, Gemini 2.0 Flash can also dynamically generate processes. A user can type "Create a marketing campaign based on this product description" and Gemini 2.0 Flash will generate the required steps (processes) and components to build a workflow on the canvas based on the user's high-level request. This dynamic generation brings great potential to innovate the user experience and streamline workflows.
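The "consistent pattern" idea from the list above can be sketched as a shared message envelope that every component emits and accepts. The field names (`type`, `content`, `source`), the `ALLOWED_TYPES` set, and the helper functions are hypothetical; the article does not describe tldraw's actual schema, and no real Gemini API call is shown.

```python
import json
from typing import Dict, List

# Assumed set of payload kinds; the real product may use others.
ALLOWED_TYPES = {"text", "image", "audio"}


def make_message(kind: str, content: str, source: str) -> Dict[str, str]:
    """Build the envelope that every component passes downstream."""
    if kind not in ALLOWED_TYPES:
        raise ValueError(f"unsupported message type: {kind!r}")
    return {"type": kind, "content": content, "source": source}


def parse_model_output(raw: str, source: str) -> Dict[str, str]:
    """Validate a model's JSON reply before it reaches other components.

    Malformed replies raise ValueError instead of silently stalling
    the components that depend on this one.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model reply is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model reply must be a JSON object")
    kind = data.get("type")
    content = data.get("content")
    if not isinstance(kind, str) or not isinstance(content, str):
        raise ValueError("reply needs string 'type' and 'content' fields")
    return make_message(kind, content, source)


def build_prompt(instruction: str, inputs: List[Dict[str, str]]) -> str:
    """Combine an instruction with the current inputs into one prompt."""
    lines = [f"Instruction: {instruction}", "Inputs:"]
    for msg in inputs:
        lines.append(f"- [{msg['type']}] from {msg['source']}: {msg['content']}")
    lines.append('Reply with JSON: {"type": "...", "content": "..."}')
    return "\n".join(lines)


topic = make_message("text", "New AI Smart Gloves for Cats", "topic")
print(build_prompt("Write a short jingle", [topic]))
reply = parse_model_output('{"type": "text", "content": "Meow mittens!"}', "jingle")
print(reply)
```

Validating at the boundary means a single bad reply fails loudly at the component that produced it rather than propagating through a large workflow.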

Quick wins in innovation

tldraw's rapid implementation of computer highlights the value of Gemini for startups: rapid prototyping, enhanced user experience through intuitive natural language interfaces, and efficient handling of structured data with models like Gemini 2.0 Flash. This combination enables small teams to quickly and cost-effectively create innovative AI capabilities.

"We wanted to show that any team can build ambitious projects using tldraw's canvas SDK. Gemini Flash is a great engine for a fast, multimodal, canvas-based workflow tool. With Gemini 2.0, and a better name, I'm sure we can launch computer as an independent startup."

-- Steve Ruiz, founder of tldraw

May not be reproduced without permission: Chief AI Sharing Circle » tldraw computer: using multimodal models to orchestrate components in a flowchart whiteboard to enable content generation workflows
