
Microsoft AI Agent Introductory Course: Planning and Design

Introduction

This article will cover the following:

  • Define clear overarching goals and break down complex tasks into manageable subtasks.
  • Get more reliable and machine-readable responses with structured outputs.
  • Apply an event-driven approach to dynamic tasks and unexpected inputs.

 

Learning Objectives

After completing this article, you will understand how to:

  • Identify and set overarching goals for the AI Agent and make sure it knows exactly what needs to be achieved.
  • Break down complex tasks into manageable subtasks and organize them into a logical sequence.
  • Equip the Agent with the right tools (e.g., search tools or data analytics tools), decide when and how to use them, and handle unexpected situations as they arise.
  • Evaluate subtask results, measure performance, and iterate operations to improve the final output.

 

Defining the Overall Goal and Breaking Down Tasks


Most real-world tasks are too complex to be accomplished in one step. The AI Agent needs a concise goal to guide its planning and actions. For example, consider the following goal:

"Generate a 3-day travel itinerary."

Although this goal is simple to state, it still needs refinement. The clearer the goal, the better the Agent (and any human collaborators) can focus on achieving the right outcome, such as creating a comprehensive itinerary with flight options, hotel recommendations, and activity suggestions.

Breakdown of tasks

Large or complex tasks become more manageable when they are broken down into smaller, goal-oriented subtasks. For the travel itinerary example, the goal can be broken down into:

  • Flight booking
  • Hotel booking
  • Car rental
  • Personalization

Each subtask can then be handled by a specialized Agent or process. One Agent might specialize in searching for the best flight deals, another in hotel bookings, and so on. A coordinating or "downstream" Agent can then compile these results into a cohesive itinerary for the end user.


This modular approach also allows for incremental enhancements. For example, specialized Agents can be added to provide dining recommendations or suggestions for local activities and refine the itinerary over time.
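The decomposition described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the Agent functions, the `AGENTS` registry, and `coordinate` are hypothetical names that stand in for real Agents, and they return placeholder strings rather than calling any booking service.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for specialized Agents: each takes a trip
# description and returns a short text result.
def flight_agent(trip: str) -> str:
    return f"Flights found for: {trip}"


def hotel_agent(trip: str) -> str:
    return f"Hotels found for: {trip}"


def car_rental_agent(trip: str) -> str:
    return f"Rental cars found for: {trip}"


# Registry mapping a subtask name to the Agent that handles it.
AGENTS: Dict[str, Callable[[str], str]] = {
    "flight_booking": flight_agent,
    "hotel_booking": hotel_agent,
    "car_rental": car_rental_agent,
}


def coordinate(trip: str, subtasks: List[str]) -> List[str]:
    """Run each subtask through its Agent and collect the results in order."""
    results = []
    for name in subtasks:
        agent = AGENTS.get(name)
        if agent is None:
            results.append(f"No agent registered for subtask: {name}")
        else:
            results.append(agent(trip))
    return results


# Incremental enhancement: a new specialized Agent only needs a registry entry.
def dining_agent(trip: str) -> str:
    return f"Restaurants found for: {trip}"


AGENTS["dining"] = dining_agent

itinerary = coordinate(
    "3-day trip to Melbourne", ["flight_booking", "hotel_booking", "dining"]
)
print("\n".join(itinerary))
```

Because the coordinator only looks Agents up by name, adding or replacing a specialized Agent does not require changing the coordination logic.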

Structured Output

Large Language Models (LLMs) can generate structured output (e.g., JSON), which is easier for downstream Agents or services to parse and process. This is particularly useful in multi-Agent environments, where the subtasks can be acted on as soon as the planning output is received. See this blog post for a quick overview.
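To illustrate why structured output is easier to process, here is a minimal sketch of a downstream consumer that parses and checks a JSON reply before acting on it. It uses only the standard library; `SubTask`, `parse_plan`, and the sample `reply` string are assumptions made for this illustration and are not part of the course code.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SubTask:
    assigned_agent: str
    task_details: str


def parse_plan(raw: str) -> List[SubTask]:
    """Parse a JSON reply into SubTask objects, rejecting malformed input."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    subtasks = data.get("subtasks") if isinstance(data, dict) else None
    if not isinstance(subtasks, list):
        raise ValueError("expected a JSON object with a 'subtasks' list")
    parsed = []
    for item in subtasks:
        if (
            not isinstance(item, dict)
            or "assigned_agent" not in item
            or "task_details" not in item
        ):
            raise ValueError("each subtask needs 'assigned_agent' and 'task_details'")
        parsed.append(SubTask(str(item["assigned_agent"]), str(item["task_details"])))
    return parsed


# Hypothetical model reply, written by hand for this example.
reply = (
    '{"subtasks": [{"assigned_agent": "hotel_booking", '
    '"task_details": "Find family-friendly hotels."}]}'
)
plan = parse_plan(reply)
print(plan[0].assigned_agent, "->", plan[0].task_details)
```

A free-text reply would need brittle string matching to extract the same fields; with JSON, a malformed reply fails loudly at the parsing step instead of silently producing a wrong plan.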

The Python example in the next section shows a simple Planning Agent that breaks a goal down into subtasks and generates a structured plan.

Planning Agent with Multi-Agent Orchestration

In this example, a Semantic Routing Agent receives a user request (e.g., "I need a hotel plan for my trip.").

The planner then:

  • Receives the hotel plan: The planner takes the user's message and, based on a system prompt that includes details of the available Agents, generates a structured travel plan.
  • Lists the Agents and their tools: The Agent registry holds a list of Agents (for example, for flights, hotels, car rentals, and activities) along with the functions or tools they provide.
  • Routes the plan to the appropriate Agents: Depending on the number of subtasks, the planner either sends the message directly to a dedicated Agent (for single-task scenarios) or coordinates through a Group Chat Manager for multi-Agent collaboration.
  • Summarizes the results: Finally, the planner summarizes the generated plan for clarity.

The following Python code example illustrates these steps:

```python
import json
import os
from enum import Enum
from pprint import pprint
from typing import List, Optional

from autogen_core.models import SystemMessage, UserMessage
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from pydantic import BaseModel


class AgentEnum(str, Enum):
    FlightBooking = "flight_booking"
    HotelBooking = "hotel_booking"
    CarRental = "car_rental"
    ActivitiesBooking = "activities_booking"
    DestinationInfo = "destination_info"
    DefaultAgent = "default_agent"
    GroupChatManager = "group_chat_manager"


# Travel SubTask Model
class TravelSubTask(BaseModel):
    task_details: str
    assigned_agent: AgentEnum  # the Agent this subtask is assigned to


class TravelPlan(BaseModel):
    main_task: str
    subtasks: List[TravelSubTask]
    is_greeting: bool


# Create the client from environment variables
client = AzureOpenAIChatCompletionClient(
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
    model=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
)

# Define the system prompt and the user message
messages = [
    SystemMessage(content="""You are a planner Agent.
Your job is to decide which Agents to run based on the user's request.
Below are the available Agents specialized in different tasks:
- FlightBooking: For booking flights and providing flight information
- HotelBooking: For booking hotels and providing hotel information
- CarRental: For booking cars and providing car rental information
- ActivitiesBooking: For booking activities and providing activity information
- DestinationInfo: For providing information about destinations
- DefaultAgent: For handling general requests""", source="system"),
    UserMessage(content="Create a travel plan for a family with two children from Singapore to Melbourne", source="user"),
]

# Top-level await works in a notebook; in a script, wrap this call in an
# async function and run it with asyncio.run().
response = await client.create(messages=messages, extra_create_args={"response_format": TravelPlan})

# Ensure the response content is a string before loading it as JSON
response_content: Optional[str] = response.content if isinstance(response.content, str) else None
if response_content is None:
    raise ValueError("Response content is not a valid JSON string")

# Print the response content after loading it as JSON
pprint(json.loads(response_content))
```

The preceding code produces the following output. You can use this structured output to route each subtask to its assigned_agent and to summarize the travel plan for the end user.

```json
{
    "is_greeting": false,
    "main_task": "Plan a family trip from Singapore to Melbourne.",
    "subtasks": [
        {
            "assigned_agent": "flight_booking",
            "task_details": "Book round-trip flights from Singapore to Melbourne."
        },
        {
            "assigned_agent": "hotel_booking",
            "task_details": "Find family-friendly hotels in Melbourne."
        },
        {
            "assigned_agent": "car_rental",
            "task_details": "Arrange a car rental suitable for a family of four in Melbourne."
        },
        {
            "assigned_agent": "activities_booking",
            "task_details": "List family-friendly activities in Melbourne."
        },
        {
            "assigned_agent": "destination_info",
            "task_details": "Provide information about Melbourne as a travel destination."
        }
    ]
}
```

A sample notebook containing the preceding code example is available in the course repository (https://github.com/microsoft/ai-agents-for-beginners/tree/main/07-planning-design).
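The routing step described earlier (a single subtask goes straight to its dedicated Agent, several subtasks go through the Group Chat Manager) could look roughly like the sketch below. It operates on a plain dictionary shaped like the structured plan. The function `route_plan` is a hypothetical helper, and sending greetings or empty plans to the default Agent is an assumption of this sketch rather than something the course code specifies.

```python
from typing import Any, Dict, List, Tuple

GROUP_CHAT_MANAGER = "group_chat_manager"
DEFAULT_AGENT = "default_agent"


def route_plan(plan: Dict[str, Any]) -> Tuple[str, List[Dict[str, str]]]:
    """Return (target, subtasks) for a structured plan.

    One subtask: route directly to its assigned Agent.
    Several subtasks: route to the Group Chat Manager.
    """
    subtasks = plan.get("subtasks", [])
    # Assumption: greetings and empty plans are handled by the default Agent.
    if plan.get("is_greeting") or not subtasks:
        return DEFAULT_AGENT, []
    if len(subtasks) == 1:
        return subtasks[0]["assigned_agent"], subtasks
    return GROUP_CHAT_MANAGER, subtasks


single_task_plan = {
    "main_task": "Find a hotel in Melbourne.",
    "is_greeting": False,
    "subtasks": [
        {"assigned_agent": "hotel_booking", "task_details": "Find family-friendly hotels in Melbourne."},
    ],
}

multi_task_plan = {
    "main_task": "Plan a family trip from Singapore to Melbourne.",
    "is_greeting": False,
    "subtasks": [
        {"assigned_agent": "flight_booking", "task_details": "Book round-trip flights from Singapore to Melbourne."},
        {"assigned_agent": "hotel_booking", "task_details": "Find family-friendly hotels in Melbourne."},
    ],
}

for plan in (single_task_plan, multi_task_plan):
    target, subtasks = route_plan(plan)
    print(f"{plan['main_task']} -> {target} ({len(subtasks)} subtask(s))")
```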

Iterative planning

Some tasks require iteration or replanning, where the outcome of one subtask affects the next. For example, if an Agent discovers an unexpected data format when booking a flight, it may need to adjust its strategy before continuing to book a hotel.

In addition, user feedback (e.g., a user decides they prefer an earlier flight) can trigger a partial re-planning. This dynamic, iterative approach ensures that the final solution is aligned with real-world constraints and changing user preferences.
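Partial re-planning can be pictured as updating only the affected subtask and leaving finished work alone. The sketch below assumes a hypothetical `status` field on each subtask ("done" or "pending"), which is not part of the structured plan shown earlier; `replan_subtask` and `pending_agents` are likewise illustrative names.

```python
from copy import deepcopy
from typing import Any, Dict, List


def replan_subtask(plan: Dict[str, Any], agent: str, new_details: str) -> Dict[str, Any]:
    """Return a copy of the plan in which the given Agent's subtask is replaced."""
    updated = deepcopy(plan)  # leave the original plan untouched
    for subtask in updated["subtasks"]:
        if subtask["assigned_agent"] == agent:
            subtask["task_details"] = new_details
            subtask["status"] = "pending"  # must be executed again
    return updated


def pending_agents(plan: Dict[str, Any]) -> List[str]:
    """List the Agents whose subtasks still need to run."""
    return [s["assigned_agent"] for s in plan["subtasks"] if s["status"] == "pending"]


current_plan = {
    "main_task": "Plan a family trip from Singapore to Melbourne.",
    "subtasks": [
        {"assigned_agent": "flight_booking", "task_details": "Book round-trip flights.", "status": "done"},
        {"assigned_agent": "hotel_booking", "task_details": "Find family-friendly hotels.", "status": "done"},
    ],
}

# User feedback: they would prefer an earlier flight.
new_plan = replan_subtask(current_plan, "flight_booking", "Book an earlier outbound flight.")
print(pending_agents(new_plan))
```

Only the flight subtask is marked for re-execution here. A fuller implementation might also re-check subtasks that depend on the changed one, such as a hotel check-in date.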

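With a planning Agent, re-planning can be done by passing the conversation history and the previous plan back to the model, as in the following sample code:

```python
from autogen_core.models import UserMessage, SystemMessage, AssistantMessage

# ... same setup as the previous code; pass on the user history and the current plan
messages = [
    SystemMessage(content="""You are a planner Agent responsible for optimizing the travel plan.
Your job is to decide which Agents to run based on the user's request.
Below are the available Agents specialized in different tasks:
- FlightBooking: For booking flights and providing flight information
- HotelBooking: For booking hotels and providing hotel information
- CarRental: For booking cars and providing car rental information
- ActivitiesBooking: For booking activities and providing activity information
- DestinationInfo: For providing information about destinations
- DefaultAgent: For handling general requests""", source="system"),
    UserMessage(content="Create a travel plan for a family with two children from Singapore to Melbourne", source="user"),
    # response_content holds the JSON plan returned by the previous call
    AssistantMessage(content=f"Previous travel plan - {response_content}", source="assistant"),
]
# ... re-plan and send the tasks to the respective Agents
```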
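
For more comprehensive planning, see the Magentic-One blog post (https://www.microsoft.com/research/articles/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks), which describes how complex tasks can be solved.

Summary

In this article, we looked at an example of how to create a planner that dynamically selects from the defined available Agents. The planner's output breaks the task down and assigns the subtasks to Agents so that they can be executed. It is assumed that the Agents have access to the functions and tools required to perform their tasks. In addition to Agents, you can include other patterns, such as reflection, a summarizer, and round-robin chat, for further customization.

Additional Resources

  • AutoGen Magentic-One: a generalist multi-Agent system for solving complex tasks that has achieved impressive results on several challenging agentic benchmarks. Reference: autogen-magentic-one (https://github.com/microsoft/autogen/tree/main/python/packages/autogen-magentic-one). In this implementation, the orchestrator creates a task-specific plan and delegates the tasks to the available Agents. In addition to planning, the orchestrator uses a tracking mechanism to monitor task progress and re-plans as needed.
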
May not be reproduced without permission: Chief AI Sharing Circle, "Microsoft AI Agent Introductory Course: Planning and Design"