
Dify Releases Agent Node: Injecting Autonomous Decision-Making Capabilities into Workflow

Workflow automation is entering a new wave of change driven by rapidly evolving artificial intelligence. For a long time, traditional automated processes have relied on predetermined, fixed actions, which fall short when facing complex problems, like a pianist mechanically playing from a score, with no flexibility or creativity.

However, as the reasoning capabilities of large language models (LLMs) rapidly improve, it has become possible to gradually hand over decision-making for certain parts of a workflow to the LLM. Recently, the Dify platform officially launched the Agent node and the Agent Strategy plugin type, an innovative feature designed to give users a smarter, more autonomous workflow automation experience.


 

Relationship between Agent nodes and Agent Strategy: decoupled design, flexible upgrades

The core role of Agent nodes in Dify Workflow is to break the rigidity of traditional workflows, so that certain steps are no longer confined to fixed processes and tool-usage patterns. Instead, Agent nodes allow the LLM to make autonomous decisions and judgments at specific points in the process, responding to more complex and dynamic task requirements.

To give Agent nodes flexibility and scalability, Dify introduces the Agent Strategy: an extensible template that defines standardized inputs and output formats. Through dedicated Agent Strategy configuration interfaces, Dify lets users apply advanced strategies such as CoT (Chain of Thought), ToT (Tree of Thoughts), GoT (Graph of Thoughts), and BoT, and even more complex Semantic Kernel strategies.

In the Dify platform, Agent nodes host the Agent Strategy and are tightly connected to the upstream and downstream nodes of the workflow. Similar to LLM nodes, Agent nodes focus on solving specific tasks and feed the final results to downstream nodes.

To understand the relationship between Agent nodes and Agent Strategy more clearly, think of them as a car's engine and control system:

  • Agent node (execution unit): acts as the "decision center" of the workflow, scheduling resources, managing run state, and recording the complete reasoning process.
  • Agent Strategy (decision logic): a pluggable reasoning-algorithm module that defines the rules for tool use and the problem-solving paradigm.

This decoupled design allows developers to upgrade the "power system" (the Agent Strategy) independently, without making major changes to the overall workflow architecture, which greatly improves the system's flexibility and maintainability.

Currently, Dify comes with two classic Agent Strategies for users to choose from:

  • ReAct: the classic "think-act-observe" reasoning chain, which mimics human patterns of thought and action (a minimal sketch follows the list).
  • Function Calling: uses the model's function-calling capability to invoke external tools or APIs precisely.
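To make the "think-act-observe" idea concrete, here is a minimal, generic sketch of a ReAct-style loop. It is illustrative only and does not use Dify's actual plugin API; call_llm and the TOOLS registry are hypothetical stand-ins.

# Minimal ReAct-style loop: think -> act -> observe, repeated until the
# model emits a final answer or the iteration limit is hit.
from typing import Callable, Dict

TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"(stub) search results for: {query}",  # hypothetical tool
}

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: a real strategy would send the prompt to the
    # bound model and return its raw reply.
    return "Final Answer: (stub) done"

def react_loop(question: str, max_iterations: int = 5) -> str:
    context = f"Question: {question}"
    for _ in range(max_iterations):
        reply = call_llm(context + "\nThought:")                # think
        if reply.startswith("Final Answer:"):
            return reply[len("Final Answer:"):].strip()
        tool_name, _, tool_input = reply.partition(":")         # act
        tool = TOOLS.get(tool_name.strip(), lambda _: "unknown tool")
        observation = tool(tool_input.strip())                  # observe
        context += f"\n{reply}\nObservation: {observation}"
    return "Stopped after reaching the iteration limit."

A Function Calling strategy follows the same overall loop, but instead of parsing free-form text it relies on the model's structured tool-call output.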

Users can download these pre-defined strategies directly from the Dify Marketplace and quickly apply them to their own workflows. What's more, Dify has introduced an open strategy development standard that encourages developers to build a thriving Agent Strategy ecosystem together. On the Dify platform, any developer can:

  • Quickly create custom strategy plugins with the CLI tool.
  • Customize configuration forms and visualization components for their strategies.
  • Integrate cutting-edge academic algorithms such as Tree-of-Thoughts into Agent nodes.

This means that Dify is becoming an "innovation platform" for AI inference strategies, where every user is able to share in and benefit from the fruits of community co-creation.

 

Overview of Agent Node Functionality

The functional panorama summarizes the main functions of the Agent node.


In the sections that follow, we introduce the specific usage and benefits of Agent nodes for general users and for developers, respectively.

For the average user: drag-and-drop, transparent reasoning

1. Drag-and-drop for fast configuration

The Dify platform minimizes the barrier to using Agent nodes. Users can drag and drop Agent nodes directly into the workflow canvas from the Tools panel and configure them in three easy steps:

  • Select a reasoning strategy: choose a suitable Agent Strategy from the list of preset or custom strategies.
  • Bind tools/models: attach the required tools and language model to the Agent node.
  • Set a prompt template: write clear prompt templates that guide the LLM's reasoning and decision making according to the task requirements (an illustrative example follows the list).
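For instance, a prompt template for a simple research-assistant Agent might look like the following. This is purely illustrative, not an official template, and the placeholder syntax for the query variable depends on how your workflow exposes upstream variables.

You are a research assistant. Given the user's question, decide whether to
call one of the bound tools or answer directly. Think step by step, call at
most one tool per round, and finish with a concise answer that cites the
tool outputs you relied on.
User question: {{query}}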

2. Transparent reasoning process, real-time logs

A powerful feature of Dify's Agent Strategies is the built-in logging mechanism. It records the agent's thought process as a tree structure, making the execution path visible and making complex multi-step reasoning easier to debug.


The real-time logs give the user a clear view of:

  • Total time / token consumption: understand the resource usage of the Agent node.
  • Multi-round thought process: trace the LLM's successive rounds of thinking and decision-making.
  • Tool call trajectory: monitor the record of the Agent node's calls to external tools.

The transparent reasoning process and real-time log information greatly enhance the debugability and interpretability of Agent nodes, helping users better understand and optimize their workflow.
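As a rough mental model of such a tree-structured log, the sketch below uses a simple hypothetical node type; Dify's actual log format and fields may differ.

# Hypothetical tree-structured trace: each node records one reasoning step
# and may nest child steps such as tool calls.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TraceNode:
    label: str                      # e.g. "Round 1" or "Tool call: search"
    detail: str = ""                # timing, token usage, tool output, ...
    children: List["TraceNode"] = field(default_factory=list)

    def render(self, indent: int = 0) -> str:
        lines = [" " * indent + f"- {self.label}: {self.detail}"]
        lines += [child.render(indent + 2) for child in self.children]
        return "\n".join(lines)

# Illustrative values only.
run = TraceNode("Agent run", "total time and token consumption")
round_1 = TraceNode("Round 1", "LLM decided to call a tool")
round_1.children.append(TraceNode("Tool call: search", "returned 3 results"))
run.children.append(round_1)
print(run.render())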

For developers: standardized development, flexible customization

For developers, Dify provides a standardized development kit for quickly building and customizing Agent Strategies. Defining an Agent Strategy comes down to specifying the following modules, which determine how the language model works:

  • Handling user queries: receive and parse natural-language queries from users.
  • Choosing the right tool: select appropriate tools based on the query content and the task's needs.
  • Invoking tools with the right parameters: call the selected tool with the correct arguments.
  • Processing tool results: parse and handle the results returned by tool execution.
  • Judging task completion: determine when the task is finished and output the final answer.


The standardized development kit, which includes a library of strategy configuration components (e.g., Model Selector and Tool Editor), a structured logging interface, and a sandboxed testing environment, greatly simplifies strategy development.

The definition of a strategy consists mainly of the strategy's identity and metadata, the required parameters (e.g., models, tools, queries), the types and constraints of those parameters, and the location of the strategy's implementation source code.

The execution process of an Agent is divided into three main phases: initialization, iterative loop, and final answer (a simplified sketch follows the list).

  1. Initialization phase: the system completes the necessary parameter configuration, tool setup, and context preparation.
  2. Iterative loop phase: the system builds a prompt from the current context and the available tool information, then invokes the large language model (LLM). It parses the LLM's response to determine whether a tool should be called or a final answer has been produced. If a tool call is required, the system executes the corresponding tool and updates the context with the tool's output. This loop continues until the task is completed or the preset maximum number of iterations is reached.
  3. Final answer phase: the system returns the final answer or result.
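A minimal, self-contained sketch of these three phases is shown below. It is a conceptual illustration under assumed helpers (call_llm here is a stub), not Dify's actual Agent Strategy base class or plugin interface.

# Illustrative three-phase Agent execution: initialization, iterative loop,
# final answer. Not Dify's real plugin interface.
from typing import Callable, Dict, List, Optional, Tuple

class SimpleAgent:
    def __init__(self, model: str, tools: Dict[str, Callable[[str], str]],
                 max_iterations: int = 5):
        # 1. Initialization phase: parameter configuration, tool setup,
        #    context preparation.
        self.model = model
        self.tools = tools
        self.max_iterations = max_iterations
        self.context: List[str] = []

    def call_llm(self, prompt: str) -> Tuple[Optional[str], str]:
        # Hypothetical stand-in: returns (tool_name, payload); tool_name is
        # None when the model emits a final answer instead of a tool call.
        return None, "(stub) final answer"

    def run(self, query: str) -> str:
        self.context.append(f"Task: {query}")
        # 2. Iterative loop phase: prompt, parse, call tools, update context.
        for _ in range(self.max_iterations):
            tool_name, payload = self.call_llm("\n".join(self.context))
            if tool_name is None:
                # 3. Final answer phase: hand the result to downstream nodes.
                return payload
            observation = self.tools[tool_name](payload)
            self.context.append(f"{tool_name} -> {observation}")
        return "Reached the maximum number of iterations."

# Example usage (illustrative):
# SimpleAgent("some-model", {"search": lambda q: "(stub) results"}).run("hello")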

The Dify platform supports defining strategies declaratively via YAML files. For example, the following is an excerpt from a strategy configuration file named function_calling.yaml:

parameters:
  - name: model
    type: model-selector
    scope: tool-call&llm
  - name: tools
    type: array[tools]
  - name: max_iterations
    default: 5
source: function_calling.py

This declarative approach makes strategy configuration as easy and intuitive as filling out a form, while supporting:

  • Dynamic parameter validation: parameter types, scopes, and dependencies are validated dynamically (a simplified sketch follows the list).
  • Automatic rendering of multilingual labels: the configuration interface is rendered automatically in multiple language versions.
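To illustrate what dynamic parameter validation can look like, here is a rough sketch driven by declarations that mirror the YAML above. It is a simplification under assumptions, not Dify's actual validation code; the PARAMETERS list and validate function are hypothetical.

# Simplified parameter declarations, mirroring the YAML example above.
from typing import Any, Dict, List

PARAMETERS: List[Dict[str, Any]] = [
    {"name": "model", "type": "model-selector", "required": True},
    {"name": "tools", "type": "array[tools]", "required": True},
    {"name": "max_iterations", "type": "number", "default": 5},
]

def validate(config: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in defaults and reject missing or mistyped parameters."""
    resolved: Dict[str, Any] = {}
    for param in PARAMETERS:
        name = param["name"]
        if name in config:
            value = config[name]
        elif "default" in param:
            value = param["default"]
        elif param.get("required"):
            raise ValueError(f"missing required parameter: {name}")
        else:
            continue
        # Only numeric types are checked in this sketch.
        if param["type"] == "number" and not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")
        resolved[name] = value
    return resolved

# Example: omit max_iterations and let the declared default apply.
print(validate({"model": "gpt-4o", "tools": ["search"]}))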

For more detailed information on policy definitions, please refer to the official Dify documentation: https://docs.dify.ai/plugins/schema-definition/agent

 

Future Outlook: Continuous Iteration, Infinite Possibilities

The Dify platform plans to keep iterating on Agent node functionality and to add more developer-oriented component libraries, for example:

  • Knowledge base invocation capability
  • Memory component in Chatflow
  • Error handling and retry mechanisms
  • More Official Agent Strategies


Users can download different Agent Strategies from the community and load them into different Agent nodes to solve various complex tasks according to their needs.

When trying out Agent nodes for the first time, users can build a simple three-node Chatflow to get a quick feel for how they work and to reproduce an Agent's basic capabilities. For more complex tasks, try advanced techniques such as routing and handoffs, and treat the Agent node as a powerful extension of the LLM node that solves complex problems step by step.

For example, with Agent nodes, users can implement complex task-handling capabilities similar to OpenAI's ChatGPT-4o with Tasks (example image from community contributor Pascal).


More advanced usage patterns will arrive with the official Dify 1.0.0 release, and more developers are welcome to contribute their own Agent Strategies to help build a thriving Dify ecosystem!
