
Multi-agent systems

An agent is _a system that uses an LLM to decide the control flow of an application_. As you develop these systems, they can grow more complex over time, making them harder to manage and scale. For example, you might run into the following problems:

  • An agent has too many tools at its disposal and makes poor decisions about which tool to call next
  • The context grows too complex for a single agent to keep track of
  • The system needs multiple areas of specialization (e.g., planner, researcher, math expert, etc.)

To tackle these problems, you might consider breaking your application into multiple smaller, independent agents and composing them into a **multi-agent system**. These independent agents can be as simple as a prompt and an LLM call, or as complex as a ReAct agent (and more!).

The primary benefits of using multi-agent systems are:

  • Modularity: separate agents make it easier to develop, test, and maintain agentic systems.
  • Specialization: you can create expert agents focused on specific domains, which helps the overall system performance.
  • Control: you can explicitly control how agents communicate (as opposed to relying on function calling).

Multi-agent architectures

There are several ways to connect agents in a multi-agent system:

  • Network: each agent can communicate with every other agent. Any agent can decide which other agent to call next.
  • Supervisor: each agent communicates with a single supervisor agent. The supervisor agent decides which agent should be called next.
  • Supervisor (tool-calling): this is a special case of the supervisor architecture. Individual agents can be represented as tools. In this case, the supervisor agent uses a tool-calling LLM to decide which of the agent tools to call, as well as the arguments to pass to those agents.
  • Hierarchical: you can define a multi-agent system with a supervisor of supervisors. This is a generalization of the supervisor architecture and allows for more complex control flows.
  • Custom multi-agent workflow: each agent communicates with only a subset of agents. Parts of the flow are deterministic, and only some agents can decide which other agents to call next.

Handoffs

In multi-agent architectures, agents can be represented as graph nodes. Each agent node executes its step(s) and decides whether to finish execution or route to another agent, including potentially routing to itself (e.g., running in a loop). A common pattern in multi-agent interactions is the **handoff**, where one agent _hands off_ control to another. A handoff lets you specify a destination (which agent to navigate to next) and a payload (the information, such as a state update, to pass to that agent).

To implement handoffs in LangGraph, agent nodes can return a Command object that lets you combine control flow and state updates:

def agent(state) -> Command[Literal["agent", "another_agent"]]:
    # the condition for routing/halting can be anything, e.g. LLM tool call / structured output, etc.
    goto = get_next_agent(...)  # 'agent' / 'another_agent'
    return Command(
        # Specify which agent to call next
        goto=goto,
        # Update the graph state
        update={"my_state_key": "my_state_value"}
    )
graph.addNode((state) => {
    // the condition for routing/halting can be anything, e.g. LLM tool call / structured output, etc.
    const goto = getNextAgent(...); // 'agent' / 'another_agent'
    return new Command({
      // Specify which agent to call next
      goto,
      // Update the graph state
      update: { myStateKey: "myStateValue" }
    });
})

In a more complex scenario, where each agent node is itself a graph (i.e., a subgraph), a node inside one of the agent subgraphs might want to navigate to a different agent. For example, if you have two agents, alice and bob (subgraph nodes in a parent graph), and alice needs to navigate to bob, you can set graph=Command.PARENT in the Command object:

def some_node_inside_alice(state):
    return Command(
        goto="bob",
        update={"my_state_key": "my_state_value"},
        # specify which graph to navigate to (defaults to the current graph)
        graph=Command.PARENT,
    )


alice.addNode((state) => {
  return new Command({
    goto: "bob",
    update: { myStateKey: "myStateValue" },
    // specify which graph to navigate to (defaults to the current graph)
    graph: Command.PARENT,
  });
});

Note

If you need to support visualization for subgraphs communicating using Command(graph=Command.PARENT), you need to wrap them in a node function with a Command annotation. Instead of this:

builder.add_node(alice)

you need to do this:

def call_alice(state) -> Command[Literal["bob"]]:
    return alice.invoke(state)

builder.add_node("alice", call_alice)

If you need to support visualization for subgraphs communicating using Command({ graph: Command.PARENT }), you need to wrap them in a node function that declares its possible destinations:

Instead of this:

builder.addNode("alice", alice);

you need to do this:

builder.addNode("alice", (state) => alice.invoke(state), { ends: ["bob"] });

Handoffs as tools

One of the most common agent types is a tool-calling agent. For those types of agents, a common pattern is to wrap a handoff in a tool call:

from langchain_core.tools import tool
from langgraph.types import Command

@tool
def transfer_to_bob():
    """Transfer to bob."""
    return Command(
        # name of the agent (node) to go to
        goto="bob",
        # data to send to the agent
        update={"my_state_key": "my_state_value"},
        # indicate to LangGraph that we need to navigate to
        # agent node in a parent graph
        graph=Command.PARENT,
    )
import { tool } from "@langchain/core/tools";
import { Command } from "@langchain/langgraph";
import { z } from "zod";

const transferToBob = tool(
  async () => {
    return new Command({
      // name of the agent (node) to go to
      goto: "bob",
      // data to send to the agent
      update: { myStateKey: "myStateValue" },
      // indicate to LangGraph that we need to navigate to
      // agent node in a parent graph
      graph: Command.PARENT,
    });
  },
  {
    name: "transfer_to_bob",
    description: "Transfer to bob.",
    schema: z.object({}),
  }
);

This is a special case of updating the graph state from tools, where, in addition to the state update, the control flow is included as well.

Important

=== "Python" 如果你想使用返回 Command 的工具,你可以使用预构建的 @[create_react_agent][] / @[ToolNode][] 组件,或者实现你自己的逻辑:

def call_tools(state):
    ...
    commands = [tools_by_name[tool_call["name"]].invoke(tool_call) for tool_call in tool_calls]
    return commands

=== "JavaScript" 如果你想使用返回 Command 的工具,你可以使用预构建的 @[createReactAgent][create_react_agent] / @[ToolNode] 组件,或者实现你自己的逻辑:

graph.addNode("call_tools", async (state) => {
  // ... tool execution logic
  const commands = toolCalls.map((toolCall) =>
    toolsByName[toolCall.name].invoke(toolCall)
  );
  return commands;
});

Now let's take a closer look at the different multi-agent architectures.

Network

In this architecture, agents are defined as graph nodes. Each agent can communicate with every other agent (many-to-many connections) and can decide which agent to call next. This architecture is a good fit for problems that do not have a clear hierarchy of agents or a specific sequence in which agents should be called.

from typing import Literal
from langchain_openai import ChatOpenAI
from langgraph.types import Command
from langgraph.graph import StateGraph, MessagesState, START, END

model = ChatOpenAI()

def agent_1(state: MessagesState) -> Command[Literal["agent_2", "agent_3", END]]:
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])
    # to determine which agent to call next. a common pattern is to call the model
    # with a structured output (e.g. force it to return an output with a "next_agent" field)
    response = model.invoke(...)
    # route to one of the agents or exit based on the LLM's decision
    # if the LLM returns "__end__", the graph will finish execution
    return Command(
        goto=response["next_agent"],
        update={"messages": [response["content"]]},
    )

def agent_2(state: MessagesState) -> Command[Literal["agent_1", "agent_3", END]]:
    response = model.invoke(...)
    return Command(
        goto=response["next_agent"],
        update={"messages": [response["content"]]},
    )

def agent_3(state: MessagesState) -> Command[Literal["agent_1", "agent_2", END]]:
    ...
    return Command(
        goto=response["next_agent"],
        update={"messages": [response["content"]]},
    )

builder = StateGraph(MessagesState)
builder.add_node(agent_1)
builder.add_node(agent_2)
builder.add_node(agent_3)

builder.add_edge(START, "agent_1")
network = builder.compile()
import { StateGraph, MessagesZodState, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { Command } from "@langchain/langgraph";
import { z } from "zod";

const model = new ChatOpenAI();

const agent1 = async (state: z.infer<typeof MessagesZodState>) => {
  // you can pass relevant parts of the state to the LLM (e.g., state.messages)
  // to determine which agent to call next. a common pattern is to call the model
  // with a structured output (e.g. force it to return an output with a "next_agent" field)
  const response = await model.invoke(...);
  // route to one of the agents or exit based on the LLM's decision
  // if the LLM returns "__end__", the graph will finish execution
  return new Command({
    goto: response.nextAgent,
    update: { messages: [response.content] },
  });
};

const agent2 = async (state: z.infer<typeof MessagesZodState>) => {
  const response = await model.invoke(...);
  return new Command({
    goto: response.nextAgent,
    update: { messages: [response.content] },
  });
};

const agent3 = async (state: z.infer<typeof MessagesZodState>) => {
  // ...
  return new Command({
    goto: response.nextAgent,
    update: { messages: [response.content] },
  });
};

const builder = new StateGraph(MessagesZodState)
  .addNode("agent1", agent1, {
    ends: ["agent2", "agent3", END]
  })
  .addNode("agent2", agent2, {
    ends: ["agent1", "agent3", END]
  })
  .addNode("agent3", agent3, {
    ends: ["agent1", "agent2", END]
  })
  .addEdge(START, "agent1");

const network = builder.compile();

Supervisor

In this architecture, we define agents as nodes and add a supervisor node (LLM) that decides which agent nodes should be called next. We use Command to route execution to the appropriate agent node based on the supervisor's decision. This architecture also lends itself well to running multiple agents in parallel or using a map-reduce pattern.

from typing import Literal
from langchain_openai import ChatOpenAI
from langgraph.types import Command
from langgraph.graph import StateGraph, MessagesState, START, END

model = ChatOpenAI()

def supervisor(state: MessagesState) -> Command[Literal["agent_1", "agent_2", END]]:
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])
    # to determine which agent to call next. a common pattern is to call the model
    # with a structured output (e.g. force it to return an output with a "next_agent" field)
    response = model.invoke(...)
    # route to one of the agents or exit based on the supervisor's decision
    # if the supervisor returns "__end__", the graph will finish execution
    return Command(goto=response["next_agent"])

def agent_1(state: MessagesState) -> Command[Literal["supervisor"]]:
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])
    # and add any additional logic (different models, custom prompts, structured output, etc.)
    response = model.invoke(...)
    return Command(
        goto="supervisor",
        update={"messages": [response]},
    )

def agent_2(state: MessagesState) -> Command[Literal["supervisor"]]:
    response = model.invoke(...)
    return Command(
        goto="supervisor",
        update={"messages": [response]},
    )

builder = StateGraph(MessagesState)
builder.add_node(supervisor)
builder.add_node(agent_1)
builder.add_node(agent_2)

builder.add_edge(START, "supervisor")

supervisor = builder.compile()
import { StateGraph, MessagesZodState, Command, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

const model = new ChatOpenAI();

const supervisor = async (state: z.infer<typeof MessagesZodState>) => {
  // you can pass relevant parts of the state to the LLM (e.g., state.messages)
  // to determine which agent to call next. a common pattern is to call the model
  // with a structured output (e.g. force it to return an output with a "next_agent" field)
  const response = await model.invoke(...);
  // route to one of the agents or exit based on the supervisor's decision
  // if the supervisor returns "__end__", the graph will finish execution
  return new Command({ goto: response.nextAgent });
};

const agent1 = async (state: z.infer<typeof MessagesZodState>) => {
  // you can pass relevant parts of the state to the LLM (e.g., state.messages)
  // and add any additional logic (different models, custom prompts, structured output, etc.)
  const response = await model.invoke(...);
  return new Command({
    goto: "supervisor",
    update: { messages: [response] },
  });
};

const agent2 = async (state: z.infer<typeof MessagesZodState>) => {
  const response = await model.invoke(...);
  return new Command({
    goto: "supervisor",
    update: { messages: [response] },
  });
};

const builder = new StateGraph(MessagesZodState)
  .addNode("supervisor", supervisor, {
    ends: ["agent1", "agent2", END]
  })
  .addNode("agent1", agent1, {
    ends: ["supervisor"]
  })
  .addNode("agent2", agent2, {
    ends: ["supervisor"]
  })
  .addEdge(START, "supervisor");

const supervisorGraph = builder.compile();

Check out this tutorial for an example of the supervisor multi-agent architecture.

Supervisor (tool-calling)

In this variant of the supervisor architecture, we define a supervisor agent that is responsible for calling sub-agents. The sub-agents are exposed to the supervisor as tools, and the supervisor agent decides which tool to call next. The supervisor agent follows the standard implementation of an LLM calling tools in a loop until it decides to stop.

from typing import Annotated
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import InjectedState, create_react_agent

model = ChatOpenAI()

# this is the agent function that will be called as tool
# notice that you can pass the state to the tool via InjectedState annotation
def agent_1(state: Annotated[dict, InjectedState]):
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])
    # and add any additional logic (different models, custom prompts, structured output, etc.)
    response = model.invoke(...)
    # return the LLM response as a string (expected tool response format)
    # this will be automatically turned to ToolMessage
    # by the prebuilt create_react_agent (supervisor)
    return response.content

def agent_2(state: Annotated[dict, InjectedState]):
    response = model.invoke(...)
    return response.content

tools = [agent_1, agent_2]
# the simplest way to build a supervisor w/ tool-calling is to use prebuilt ReAct agent graph
# that consists of a tool-calling LLM node (i.e. supervisor) and a tool-executing node
supervisor = create_react_agent(model, tools)
import { ChatOpenAI } from "@langchain/openai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const model = new ChatOpenAI();

// this is the agent function that will be called as tool
// notice that you can pass the state to the tool via config parameter
const agent1 = tool(
  async (_, config) => {
    const state = config.configurable?.state;
    // you can pass relevant parts of the state to the LLM (e.g., state.messages)
    // and add any additional logic (different models, custom prompts, structured output, etc.)
    const response = await model.invoke(...);
    // return the LLM response as a string (expected tool response format)
    // this will be automatically turned to ToolMessage
    // by the prebuilt createReactAgent (supervisor)
    return response.content;
  },
  {
    name: "agent1",
    description: "Agent 1 description",
    schema: z.object({}),
  }
);

const agent2 = tool(
  async (_, config) => {
    const state = config.configurable?.state;
    const response = await model.invoke(...);
    return response.content;
  },
  {
    name: "agent2",
    description: "Agent 2 description",
    schema: z.object({}),
  }
);

const tools = [agent1, agent2];
// the simplest way to build a supervisor w/ tool-calling is to use prebuilt ReAct agent graph
// that consists of a tool-calling LLM node (i.e. supervisor) and a tool-executing node
const supervisor = createReactAgent({ llm: model, tools });

Hierarchical

As you add more agents to your system, it might become too hard for the supervisor to manage all of them. The supervisor might start making poor decisions about which agent to call next, or the context might become too complex for a single supervisor to keep track of. In other words, you end up with the same problems that originally motivated the multi-agent architecture.

To address this, you can design your system _hierarchically_. For example, you can create separate, specialized teams of agents managed by individual supervisors, and a top-level supervisor to manage the teams.

from typing import Literal
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command
model = ChatOpenAI()

# define team 1 (same as the single supervisor example above)

def team_1_supervisor(state: MessagesState) -> Command[Literal["team_1_agent_1", "team_1_agent_2", END]]:
    response = model.invoke(...)
    return Command(goto=response["next_agent"])

def team_1_agent_1(state: MessagesState) -> Command[Literal["team_1_supervisor"]]:
    response = model.invoke(...)
    return Command(goto="team_1_supervisor", update={"messages": [response]})

def team_1_agent_2(state: MessagesState) -> Command[Literal["team_1_supervisor"]]:
    response = model.invoke(...)
    return Command(goto="team_1_supervisor", update={"messages": [response]})

team_1_builder = StateGraph(MessagesState)
team_1_builder.add_node(team_1_supervisor)
team_1_builder.add_node(team_1_agent_1)
team_1_builder.add_node(team_1_agent_2)
team_1_builder.add_edge(START, "team_1_supervisor")
team_1_graph = team_1_builder.compile()

# define team 2 (same as the single supervisor example above)
class Team2State(MessagesState):
    next: Literal["team_2_agent_1", "team_2_agent_2", "__end__"]

def team_2_supervisor(state: Team2State):
    ...

def team_2_agent_1(state: Team2State):
    ...

def team_2_agent_2(state: Team2State):
    ...

team_2_builder = StateGraph(Team2State)
...
team_2_graph = team_2_builder.compile()


# define top-level supervisor

def top_level_supervisor(state: MessagesState) -> Command[Literal["team_1_graph", "team_2_graph", END]]:
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])
    # to determine which team to call next. a common pattern is to call the model
    # with a structured output (e.g. force it to return an output with a "next_team" field)
    response = model.invoke(...)
    # route to one of the teams or exit based on the supervisor's decision
    # if the supervisor returns "__end__", the graph will finish execution
    return Command(goto=response["next_team"])

builder = StateGraph(MessagesState)
builder.add_node(top_level_supervisor)
builder.add_node("team_1_graph", team_1_graph)
builder.add_node("team_2_graph", team_2_graph)
builder.add_edge(START, "top_level_supervisor")
builder.add_edge("team_1_graph", "top_level_supervisor")
builder.add_edge("team_2_graph", "top_level_supervisor")
graph = builder.compile()
import { StateGraph, MessagesZodState, Command, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

const model = new ChatOpenAI();

// define team 1 (same as the single supervisor example above)

const team1Supervisor = async (state: z.infer<typeof MessagesZodState>) => {
  const response = await model.invoke(...);
  return new Command({ goto: response.nextAgent });
};

const team1Agent1 = async (state: z.infer<typeof MessagesZodState>) => {
  const response = await model.invoke(...);
  return new Command({
    goto: "team1Supervisor",
    update: { messages: [response] }
  });
};

const team1Agent2 = async (state: z.infer<typeof MessagesZodState>) => {
  const response = await model.invoke(...);
  return new Command({
    goto: "team1Supervisor",
    update: { messages: [response] }
  });
};

const team1Builder = new StateGraph(MessagesZodState)
  .addNode("team1Supervisor", team1Supervisor, {
    ends: ["team1Agent1", "team1Agent2", END]
  })
  .addNode("team1Agent1", team1Agent1, {
    ends: ["team1Supervisor"]
  })
  .addNode("team1Agent2", team1Agent2, {
    ends: ["team1Supervisor"]
  })
  .addEdge(START, "team1Supervisor");
const team1Graph = team1Builder.compile();

// define team 2 (same as the single supervisor example above)
const team2Supervisor = async (state: z.infer<typeof MessagesZodState>) => {
  // ...
};

const team2Agent1 = async (state: z.infer<typeof MessagesZodState>) => {
  // ...
};

const team2Agent2 = async (state: z.infer<typeof MessagesZodState>) => {
  // ...
};

const team2Builder = new StateGraph(MessagesZodState);
// ... build team2Graph
const team2Graph = team2Builder.compile();

// define top-level supervisor

const topLevelSupervisor = async (state: z.infer<typeof MessagesZodState>) => {
  // you can pass relevant parts of the state to the LLM (e.g., state.messages)
  // to determine which team to call next. a common pattern is to call the model
  // with a structured output (e.g. force it to return an output with a "next_team" field)
  const response = await model.invoke(...);
  // route to one of the teams or exit based on the supervisor's decision
  // if the supervisor returns "__end__", the graph will finish execution
  return new Command({ goto: response.nextTeam });
};

const builder = new StateGraph(MessagesZodState)
  .addNode("topLevelSupervisor", topLevelSupervisor, {
    ends: ["team1Graph", "team2Graph", END]
  })
  .addNode("team1Graph", team1Graph)
  .addNode("team2Graph", team2Graph)
  .addEdge(START, "topLevelSupervisor")
  .addEdge("team1Graph", "topLevelSupervisor")
  .addEdge("team2Graph", "topLevelSupervisor");

const graph = builder.compile();

Custom multi-agent workflow

In this architecture, we add individual agents as graph nodes and define the order in which agents are called ahead of time, in a custom workflow. In LangGraph, a workflow can be defined in two ways:

  • Explicit control flow (normal edges): LangGraph allows you to explicitly define the control flow of your application (i.e., the sequence in which agents communicate) via normal graph edges. This is the most deterministic variant of the architectures above: we always know ahead of time which agent will be called next.

  • Dynamic control flow (Command): in LangGraph, you can allow the LLM to decide parts of your application's control flow. This can be achieved by using Command. A special case of this is the supervisor tool-calling architecture, where the tool-calling LLM powering the supervisor agent decides the order in which the tools (agents) are called.

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START

model = ChatOpenAI()

def agent_1(state: MessagesState):
    response = model.invoke(...)
    return {"messages": [response]}

def agent_2(state: MessagesState):
    response = model.invoke(...)
    return {"messages": [response]}

builder = StateGraph(MessagesState)
builder.add_node(agent_1)
builder.add_node(agent_2)
# define the flow explicitly
builder.add_edge(START, "agent_1")
builder.add_edge("agent_1", "agent_2")
import { StateGraph, MessagesZodState, START } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

const model = new ChatOpenAI();

const agent1 = async (state: z.infer<typeof MessagesZodState>) => {
  const response = await model.invoke(...);
  return { messages: [response] };
};

const agent2 = async (state: z.infer<typeof MessagesZodState>) => {
  const response = await model.invoke(...);
  return { messages: [response] };
};

const builder = new StateGraph(MessagesZodState)
  .addNode("agent1", agent1)
  .addNode("agent2", agent2)
  // define the flow explicitly
  .addEdge(START, "agent1")
  .addEdge("agent1", "agent2");

Communication and state management

The most important thing when building multi-agent systems is figuring out how the agents communicate.

A common, general way for agents to communicate is via a list of messages. This raises several questions, such as what exactly the agents share and how handoffs are represented in the message history; the subsections below address these.

Additionally, if you are dealing with more complex agents, or want to keep an individual agent's state separate from the multi-agent system state, you might need to use different state schemas.

Handoffs vs tool calls

What is the "payload" that gets passed between agents? In most of the architectures discussed above, agents communicate via handoffs and pass the graph state as part of the handoff payload. Specifically, agents pass a list of messages as part of the graph state. In the case of the tool-calling supervisor, the payload is the tool call arguments.

Message passing between agents

The most common way for agents to communicate is via a shared state channel, typically a list of messages. This assumes that there is always at least one channel (key) in the state that is shared between the agents (e.g., messages). When communicating via a shared message list, there is an additional consideration: should agents share the full history of their thought process or only the final result?

Sharing the full thought process

Agents can **share the full history of their thought process** (i.e., a "scratchpad") with all other agents. This "scratchpad" typically looks like a list of messages. The benefit of sharing the full thought process is that it might help other agents make better decisions and improve the reasoning ability of the system as a whole. The downside is that as the number and complexity of agents grows, the "scratchpad" grows quickly and might require additional memory-management strategies.

Sharing only the final result

Agents can have their own private "scratchpad" and only **share the final result** with the rest of the agents. This approach might work better for systems with many agents, or agents that are more complex. In this case, you would need to define agents with different state schemas.
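As a rough illustration of these two options, here is a minimal sketch in Python; it assumes a hypothetical compiled subagent graph named research_agent whose state uses a messages key with the standard add_messages reducer:

from langgraph.graph import MessagesState

def call_research_agent_full(state: MessagesState):
    result = research_agent.invoke(state)
    # share the full scratchpad: every message the subagent produced
    # (tool calls, tool results, intermediate steps) beyond the input messages
    new_messages = result["messages"][len(state["messages"]):]
    return {"messages": new_messages}

def call_research_agent_final(state: MessagesState):
    result = research_agent.invoke(state)
    # share only the final result: the subagent's last message
    return {"messages": [result["messages"][-1]]}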

For agents called as tools, the supervisor determines the inputs based on the tool schema. Additionally, LangGraph allows passing state to individual tools at runtime, so subordinate agents can access the parent state if needed.

Indicating the agent name in messages

It can be helpful to indicate which agent a particular AI message came from, especially with long message histories. Some LLM providers (such as OpenAI) support adding a name parameter to messages; you can use it to attach the agent name to a message. If that is not supported, you can consider manually injecting the agent name into the message content, e.g., <agent>alice</agent><message>message from alice</message>.
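For instance, a minimal sketch using the name field on langchain_core messages (the message content and agent name here are made up for illustration):

from langchain_core.messages import AIMessage

# attach the originating agent's name via the `name` field
# (forwarded by providers that support it, e.g. OpenAI)
msg = AIMessage(content="Here are my findings.", name="alice")

# fallback if the provider ignores `name`: inject the agent name into the content
fallback = AIMessage(content="<agent>alice</agent><message>Here are my findings.</message>")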

Representing handoffs in the message history

Handoffs are typically done via the LLM calling a dedicated handoff tool. This is represented as an AI message with a tool call, which gets passed to the next agent (LLM). Most LLM providers do not support receiving AI messages with tool calls **without** the corresponding tool messages.

You therefore have two options:

  1. Add an extra tool message to the message list, e.g., "Successfully transferred to agent X"
  2. Remove the AI message with the tool calls

In practice, we have seen most developers opt for option (1).
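A minimal sketch of option (1) in Python, with a made-up tool call id, showing the extra ToolMessage that completes the tool-call pair before the history is handed to the next agent:

from langchain_core.messages import AIMessage, ToolMessage

# the AI message in which the LLM requested the handoff
ai_message = AIMessage(
    content="",
    tool_calls=[{"name": "transfer_to_bob", "args": {}, "id": "call_1", "type": "tool_call"}],
)
# option (1): append a matching tool message so the next agent's LLM
# sees a complete tool-call / tool-result pair
handoff_ack = ToolMessage(content="Successfully transferred to bob", tool_call_id="call_1")

messages_for_next_agent = [ai_message, handoff_ack]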

State management for subagents

A common practice is for multiple agents to communicate over a shared list of messages, but to only add their final messages to that list. This means that any intermediate messages (e.g., tool calls) are not saved in this list.

What if you **do** want to save these messages, so that on future invocations of this particular subagent you can pass them back in?

There are two high-level approaches for doing that:

  1. Store these messages in the shared message list, but filter the list before passing it to the subagent's LLM. For example, you could choose to filter out all tool calls from **other** agents (see the sketch after this list).
  2. Store a separate list of messages for each agent (e.g., alice_messages / aliceMessages) in the subagent's graph state. This serves as that agent's "view" of the message history.
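A minimal sketch of approach (1), assuming AI messages carry the originating agent's name (as described in "Indicating the agent name in messages" above); it drops other agents' tool-call messages along with their matching tool results before invoking alice's LLM:

from langchain_core.messages import AIMessage, ToolMessage

def filter_messages_for_agent(messages, agent_name: str):
    dropped_call_ids = set()
    filtered = []
    for msg in messages:
        # drop tool-call messages that came from other agents
        if isinstance(msg, AIMessage) and msg.tool_calls and msg.name != agent_name:
            dropped_call_ids.update(tc["id"] for tc in msg.tool_calls)
            continue
        # drop the tool results that correspond to those dropped tool calls
        if isinstance(msg, ToolMessage) and msg.tool_call_id in dropped_call_ids:
            continue
        filtered.append(msg)
    return filtered

# e.g., inside alice's node:
# response = model.invoke(filter_messages_for_agent(state["messages"], "alice"))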

Using different state schemas

An agent might need to have a different state schema from the rest of the agents. For example, a search agent might only need to keep track of queries and retrieved documents. There are two ways to achieve this in LangGraph:

  • Define subgraph agents with a separate state schema. If there are no shared state keys (channels) between the subgraph and the parent graph, it's important to add input/output transformations so that the parent graph knows how to communicate with the subgraph (see the sketch after this list).
  • Define agent node functions with a private input state schema that is distinct from the overall graph state schema. This allows passing information that is only needed for executing that particular agent.
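A minimal sketch of the first approach, with hypothetical names: a search subgraph that only tracks a query and retrieved documents, wrapped in a parent-graph node that translates between the parent's message-based state and the subgraph's schema:

from typing import TypedDict
from langchain_core.messages import AIMessage
from langgraph.graph import StateGraph, MessagesState, START

# subgraph state: no keys shared with the parent graph
class SearchState(TypedDict):
    query: str
    documents: list[str]

def run_search(state: SearchState):
    # placeholder retrieval step
    return {"documents": [f"result for: {state['query']}"]}

search_builder = StateGraph(SearchState)
search_builder.add_node("run_search", run_search)
search_builder.add_edge(START, "run_search")
search_agent = search_builder.compile()

def call_search_agent(state: MessagesState):
    # input transformation: parent messages -> subgraph query
    query = state["messages"][-1].content
    result = search_agent.invoke({"query": query, "documents": []})
    # output transformation: subgraph documents -> parent message
    return {"messages": [AIMessage(content="\n".join(result["documents"]))]}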