Engineering Notes - ReAct in Agent Engineering: Implementation and Reflections

Ever since LLMs became the hot new thing, fresh ideas have appeared daily. Some of them never landed in a top-tier journal yet proved genuinely important and are still talked about today; CoT and ReAct, for example, have become cornerstones of present-day agent engineering. The core of ReAct is giving an LLM hands and feet so it can actually do things; it is one approach within the tool-use space.

The underlying principle is already laid out in the paper, so I won't restate it in detail. Instead, this post builds a Hello World agent several ways: a hand-rolled implementation of ReAct, then implementations on mature industry frameworks. The goal is a more concrete grasp of what ReAct actually does.

1. The Paper

ReAct: Synergizing Reasoning and Acting in Language Models (arXiv:2210.03629)

In short:

  1. Provide Tools

  2. Design a workflow framework with Question, Thought, Action, Action Input, and Observation steps

  3. Loop within the workflow, repeatedly calling, thinking, and observing, until the final result is reached

graph TD
    Start((Start)) --> Question[Input Question/Task]
    Question --> Thought[Generate Thought]
    Thought --> Action[Generate Action Input]
    Action --> Decision{"Is the Action 'Finish'?"}
    Decision -->|Yes| Output[Output Answer]
    Output --> End((End))
    Decision -->|No| Perform[Execute Action]
    Perform --> Observation[Output Observation]
    Observation --> Thought
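
To make step 2 concrete, here is the shape of a typical ReAct prompt template. This is a sketch in the style popularized after the paper; the exact wording varies by implementation, and {tool_descriptions}, {tool_names}, and {question} are placeholders you fill in:

REACT_PROMPT = """Answer the following question as best you can. You have access to these tools:

{tool_descriptions}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the tool to use, one of [{tool_names}]
Action Input: the input to the tool
Observation: the result of the action
... (Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original question

Question: {question}
Thought: """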

2. SHOW ME THE CODE

2.1 A Minimal Agent Based Directly on the Paper

sequenceDiagram
    participant User
    participant LLM
    participant Tool
    User->>LLM: Input Question
    Note over LLM: Parse Question into ReAct Format
    loop Think & Act Process
        LLM->>LLM: Generate Thought
        alt Needs Tool
            LLM->>Tool: Action & ActionInput
            Tool->>LLM: Return Observation
            LLM->>LLM: Process Observation
        else No Tool Needed
            LLM->>LLM: Direct Thinking
        end
    end
    LLM->>LLM: Generate Final Thought
    LLM->>User: Return FinalAnswer

The full code lives on my GitHub: https://github.com/jalr4ever/Tiny-OAI-Agent
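
The repo isn't reproduced here, but the core loop is small enough to sketch. Below is a minimal version matching the sequence diagram, assuming the REACT_PROMPT template from section 1 is in scope and an OpenAI-compatible endpoint is configured; the toy tool and helper names (parse_action, run_agent) are illustrative, not the repo's actual API.

import re

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY (and optionally a base URL) is set


def calculator(expression: str) -> str:
    """Toy tool: evaluate an arithmetic expression."""
    return str(eval(expression))  # demo only; never eval untrusted input


TOOLS = {"calculator": calculator}


def parse_action(text: str):
    """Pull `Action:` / `Action Input:` out of the model's reply, if present."""
    action = re.search(r"Action:\s*(.+)", text)
    action_input = re.search(r"Action Input:\s*(.+)", text)
    if action and action_input:
        return action.group(1).strip(), action_input.group(1).strip()
    return None, None


def run_agent(question: str, max_steps: int = 5) -> str:
    prompt = REACT_PROMPT.format(
        tool_descriptions="calculator: evaluates an arithmetic expression",
        tool_names="calculator",
        question=question,
    )
    for _ in range(max_steps):
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
            stop=["Observation:"],  # stop here so WE run the tool, not the model
        ).choices[0].message.content
        prompt += reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[-1].strip()
        action, action_input = parse_action(reply)
        if action in TOOLS:
            # Execute the tool and feed the Observation back for the next Thought
            prompt += f"\nObservation: {TOOLS[action](action_input)}\n"
    return "No final answer within the step budget."


print(run_agent("What is 12 * 34 + 5?"))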

2.2 A Minimal Agent with LangChain

A mature implementation is the agent support in LangChain, which provides two modes for building agents: ReAct and Plan-and-Execute.

"""
1 Load Model
2 Define Tool Function
3 Define Tools
4 Define Prompt
5 Crate ToolCallingAgent With (model, tools, prompt)
6 Create AgentExecutor With (agent, tools)
7 Using AgentExecutor invoke query
"""
import os

from dotenv import load_dotenv
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

load_dotenv()

model = ChatOpenAI(model="gpt-4o-mini", base_url=os.getenv("OPENAI_API_BASE"))


@tool
def magic_function(i: int) -> int:
    """Applies a magic function to an input."""
    return i + 2


tools = [magic_function]

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("human", "{input}"),
        # Placeholders fill up a **list** of messages
        ("placeholder", "{agent_scratchpad}"),
    ]
)

agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)
query = "what is the value of magic_function(3)?"
d = agent_executor.invoke({"input": query})
print(d)

Output:

{'input': 'what is the value of magic_function(3)?', 'output': 'The value of `magic_function(3)` is 5.'}
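
The dict only shows the final answer. To watch the intermediate tool-call steps from section 1 scroll by, AgentExecutor takes a verbose flag:

# Log each Action / Observation step as the agent runs
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)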

2.3 A Minimal Agent with LangGraph

The LangChain team has since built a new framework and recommends implementing agents with LangGraph.

"""
LangGraph's react agent executor manages a state that is defined by a list of messages.
It will continue to process the list until there are no tool calls in the agent's output.

1 Define model
2 Define tool function and turn to tools
3 Define Prompt
4 Define langgraph_agent_executor  by(model, tools, prompt=prompt)
5 Invoke by langgraph_agent_executor, invoke({"messages": [("human", query)]})
"""
import os

from dotenv import load_dotenv
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

load_dotenv()

model = ChatOpenAI(model="gpt-4o-mini", base_url=os.getenv("OPENAI_API_BASE"))


@tool
def magic_function(i: int) -> int:
    """Applies a magic function to an input."""
    return i + 2


tools = [magic_function]
query = "what is the value of magic_function(3)?"

langgraph_agent_executor = create_react_agent(model, tools)

messages = langgraph_agent_executor.invoke({"messages": [("human", query)]})
d = {
    "input": query,
    "output": messages["messages"][-1].content,
}
print(d)

message_history = messages["messages"]

new_query = "Pardon?"

messages = langgraph_agent_executor.invoke(
    # Append the new question to the previous message history
    {"messages": message_history + [("human", new_query)]}
)
d = {
    "input": new_query,
    "output": messages["messages"][-1].content,
}
print(d)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant. Respond only in Chinese."),
        ("placeholder", "{messages}"),
        ("user", "Also say 'Pandamonium!' after the answer."),
    ]
)
langgraph_agent_executor = create_react_agent(model, tools, prompt=prompt)

messages = langgraph_agent_executor.invoke({"messages": [("human", query)]})
print(
    {
        "input": query,
        "output": messages["messages"][-1].content,
    }
)

Output:

{'input': 'what is the value of magic_function(3)?', 'output': 'The value of `magic_function(3)` is 5.'}
{'input': 'Pardon?', 'output': 'The result of calling `magic_function(3)` is 5. If you have any further questions or need additional information, feel free to ask!'}
{'input': 'what is the value of magic_function(3)?', 'output': '魔法函数的值是 5!Pandamonium!'}
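
To watch the ReAct loop unfold message by message rather than only reading the final state, the LangGraph executor can also stream intermediate states; a small sketch:

# Each streamed step yields the state so far; print the newest message
# to watch the tool call and its Observation appear in sequence.
for step in langgraph_agent_executor.stream(
    {"messages": [("human", query)]}, stream_mode="values"
):
    step["messages"][-1].pretty_print()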

2.4 A Minimal Agent with smolagents

This framework comes from the HuggingFace team and is simple to use:

import os
from typing import Optional

from dotenv import load_dotenv
from smolagents import HfApiModel, LiteLLMModel, TransformersModel, tool
from smolagents.agents import CodeAgent, ToolCallingAgent

load_dotenv()

# Choose which inference type to use!

available_inferences = ["hf_api", "transformers", "ollama", "litellm"]
chosen_inference = "litellm"

print(f"Chose model: '{chosen_inference}'")

if chosen_inference == "hf_api":
    model = HfApiModel(model_id="meta-llama/Llama-3.3-70B-Instruct")

elif chosen_inference == "transformers":
    model = TransformersModel(model_id="HuggingFaceTB/SmolLM2-1.7B-Instruct", device_map="auto", max_new_tokens=1000)

elif chosen_inference == "ollama":
    model = LiteLLMModel(
        model_id="ollama_chat/llama3.2",
        api_base="http://localhost:11434",  # replace with remote open-ai compatible server if necessary
        api_key="your-api-key",  # replace with API key if necessary
        num_ctx=8192,  # ollama default is 2048 which will often fail horribly. 8192 works for easy tasks, more is better. Check https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator to calculate how much VRAM this will need for the selected model.
    )

elif chosen_inference == "litellm":
    # For anthropic: change model_id below to 'anthropic/claude-3-5-sonnet-latest'
    model = LiteLLMModel(model_id="gpt-4o-mini", api_key=os.getenv("OPENAI_API_KEY"), api_base=os.getenv("OPENAI_API_BASE"))


@tool
def get_weather(location: str, celsius: Optional[bool] = False) -> str:
    """
    Get the weather for the coming days at a given location.
    Secretly this tool does not care about the location; it hates the weather everywhere.

    Args:
        location: the location
        celsius: whether to report the temperature in Celsius
    """
    return "The weather is UNGODLY with torrential rains and temperatures below -10°C"


agent = ToolCallingAgent(tools=[get_weather], model=model)

print("ToolCallingAgent:", agent.run("What's the weather like in Paris?"))

print("\n")

agent = CodeAgent(tools=[get_weather], model=model)

print("CodeAgent:", agent.run("What's the weather like in Paris?"))

Output:

Chose model: 'litellm'
╭────────────────────────────────── New run ───────────────────────────────────╮
│                                                                              │
│ What's the weather like in Paris?                                            │
│                                                                              │
╰─ LiteLLMModel - gpt-4o-mini ─────────────────────────────────────────────────╯
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
╭──────────────────────────────────────────────────────────────────────────────╮
│ Calling tool: 'get_weather' with arguments: {'location': 'Paris'}            │
╰──────────────────────────────────────────────────────────────────────────────╯
Observations: The weather is UNGODLY with torrential rains and temperatures 
below -10°C
[Step 0: Duration 2.39 seconds| Input tokens: 1,003 | Output tokens: 12]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
╭──────────────────────────────────────────────────────────────────────────────╮
│ Calling tool: 'get_weather' with arguments: {'location': 'Paris', 'celsius': │
│ True}                                                                        │
╰──────────────────────────────────────────────────────────────────────────────╯
Observations: The weather is UNGODLY with torrential rains and temperatures 
below -10°C
[Step 1: Duration 1.16 seconds| Input tokens: 2,104 | Output tokens: 64]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
╭──────────────────────────────────────────────────────────────────────────────╮
│ Calling tool: 'final_answer' with arguments: {'answer': 'The weather in      │
│ Paris is UNGODLY with torrential rains and temperatures below -10°C.'}       │
╰──────────────────────────────────────────────────────────────────────────────╯
Final answer: The weather in Paris is UNGODLY with torrential rains and 
temperatures below -10°C.
[Step 2: Duration 0.85 seconds| Input tokens: 3,313 | Output tokens: 95]
ToolCallingAgent: The weather in Paris is UNGODLY with torrential rains and temperatures below -10°C.


╭────────────────────────────────── New run ───────────────────────────────────╮
│                                                                              │
│ What's the weather like in Paris?                                            │
│                                                                              │
╰─ LiteLLMModel - gpt-4o-mini ─────────────────────────────────────────────────╯
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ─ Executing parsed code: ───────────────────────────────────────────────────── 
  weather_info = get_weather(location="Paris", celsius=True)                    
  print(weather_info)                                                           
 ────────────────────────────────────────────────────────────────────────────── 
Execution logs:
The weather is UNGODLY with torrential rains and temperatures below -10°C

Out: None
[Step 0: Duration 1.53 seconds| Input tokens: 2,039 | Output tokens: 72]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ─ Executing parsed code: ───────────────────────────────────────────────────── 
  final_answer("UNGODLY with torrential rains and temperatures below -10°C")    
 ────────────────────────────────────────────────────────────────────────────── 
Out - Final answer: UNGODLY with torrential rains and temperatures below -10°C
[Step 1: Duration 1.06 seconds| Input tokens: 4,248 | Output tokens: 130]
CodeAgent: UNGODLY with torrential rains and temperatures below -10°C

3. Summary

I originally dug into ReAct to answer a question in agent engineering: what is the best way to design and expose tools? Along the way, the work in this post ended up answering the following related questions.

3.1 What Is ReAct?

The smolagents documentation (https://huggingface.co/docs/smolagents/conceptual_guides/react) regards ReAct as the main approach to building multi-step agents today.

"Reason" plus "Act" makes ReAct. An agent following this design takes the task at hand and alternates reasoning with acting: the Act step performs tool use, steadily working through the sub-problems the task involves until the task is fully solved and a Final Answer comes out.

3.2 Is ReAct Related to Function Calling?

Notice that when LangGraph does tool use, your code contains no explicit Function Calling declaration step; internally it runs a workflow built on the ReAct pattern (see section 1 of this post).

In terms of purpose, ReAct and Function Calling are "competitors": both exist to let an LLM do tool use.

  • The former operates at the prompt-engineering level (the work goes into the system prompt) and does tool use through the Action step.

  • The latter operates at the model level (the work goes into training; see the instruction-following discussion in the Llama and Qwen papers) and does tool use through the Function Calling mechanism, as sketched below.
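
For contrast with the ReAct prompt in section 1, this is roughly what native function calling looks like against an OpenAI-compatible API (a sketch; the tool is declared as a standard JSON-schema object rather than described in prose):

import json

from openai import OpenAI

client = OpenAI()

# The tool is declared as structured schema, not as prompt text
tools = [{
    "type": "function",
    "function": {
        "name": "magic_function",
        "description": "Applies a magic function to an input.",
        "parameters": {
            "type": "object",
            "properties": {"i": {"type": "integer"}},
            "required": ["i"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "what is the value of magic_function(3)?"}],
    tools=tools,
)

# The model replies with a structured tool call instead of free text
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))

Executing the call, feeding the result back, and deciding when to stop are still left to you, which is exactly the gap the ReAct-style loop fills.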

In terms of task completion, ReAct's "line of business" is much broader than Function Calling's:

  • The former designs Thought, Observation, and the rest to keep driving a task to completion; tool use is just one link in that chain.

  • The latter is merely a mechanism for tool use and provides no task planning or other "business logic".

3.3 Is ReAct Obsolete?

Far from obsolete, it has become a cornerstone of the tool-use space, and it combines well with MCP and Function Calling.

