[MetaGPT] A Researcher Agent Example for Gathering Information and Writing Reports


山顶夕景 · Published 2024-11-03 00:48:47

note

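MetaGPT is a multi-agent framework. For example, it can model a software company as a team of agents, including a manager, a product manager, and engineers.
The React part contains a loop that alternates between _think and _act, so the LLM thinks first and then acts.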

I. MetaGPT: The Multi-Agent Framework

Project link: https://github.com/geekan/MetaGPT
MetaGPT: The Multi-Agent Framework

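MetaGPT is a multi-agent framework:

For example, it can model a software company as a team of agents, including a manager, a product manager, and engineers
Given a requirement as input, it outputs competitive analysis documents, user stories, data structures, and so on
It builds a Standard Operating Procedure (SOP)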

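How to build a multi-agent system:

Define the problem and the goals the system must address
Create multiple agent instances (types, number, characteristics), for example agents for planning and design, for acquiring new knowledge, or for evaluating things and phenomena
Define how the agents interact, including cooperation, competition, and information exchange, and set protocols, strategies, or game-theoretic rules
Consider scalability, security, and the heterogeneity of the agents
Deploy the system, then monitor and maintain it continuously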

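II. Researcher Agent

The researcher agent takes a research question from the user, searches for material with a search engine, summarizes it, and then generates a research report.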

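Type	Name	Description
Role	Researcher	Researcher agent that searches the web and summarizes the findings into a report. Through LLM prompt engineering, the LLM takes the role of a researcher to plan and break down tasks, use the provided tools, carry out the research, and generate a research report. When the role is defined, the tools listed below are registered with it.
Tool	CollectLinks	Decomposes the question, searches with a search engine, and returns a list of URLs. Built on LLM prompt engineering and a search engine, it: (1) splits the question into several sub-questions suited to searching (LLM prompt engineering); (2) searches for the sub-questions with the search engine; (3) keeps the URLs relevant to the research question and ranks the URL list by site reliability (LLM prompt engineering).
Tool	WebBrowseAndSummarize	Browses web pages and summarizes their content. It consists of two parts: browsing pages and summarizing web content. (1) Browsing is implemented through the wrapped WebBrowserEngine tool, which loads the pages. (2) Summarizing the results is implemented with LLM prompt engineering.
Tool	ConductResearch	Generates the research report. A tool based on LLM prompt engineering: it gathers the output of WebBrowseAndSummarize and passes it to the LLM, which writes the research report.
Memory	short-term memory	Short-term memory. The MetaGPT framework wraps a short-term memory capability that stores and retrieves context within a task's execution cycle, such as the results of tools like CollectLinks and WebBrowseAndSummarize.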
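
The division of labor in the table can be sketched as a small offline pipeline. This is only an illustration, not MetaGPT's implementation: the three functions are hypothetical stand-ins for CollectLinks, WebBrowseAndSummarize, and ConductResearch that return canned data instead of calling an LLM or a search engine, and the plain list named `memory` stands in for the framework's short-term memory.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for the three tools; no LLM or network calls are made.
def collect_links(topic: str) -> Dict[str, List[str]]:
    """Split the topic into sub-queries and return fake candidate URLs per query."""
    queries = [f"{topic} features", f"{topic} safety"]
    return {
        query: [f"https://example.com/{n}/{i}" for i in range(2)]
        for n, query in enumerate(queries)
    }

def browse_and_summarize(links: Dict[str, List[str]]) -> Dict[str, str]:
    """Return one fake summary per URL."""
    return {
        url: f"summary of {url} for '{query}'"
        for query, urls in links.items()
        for url in urls
    }

def conduct_research(topic: str, summaries: Dict[str, str]) -> str:
    """Merge all summaries into a single report string."""
    body = "\n".join(f"- {url}: {text}" for url, text in summaries.items())
    return f"# Research report: {topic}\n{body}"

def run_researcher(topic: str) -> Tuple[str, List[Tuple[str, object]]]:
    memory: List[Tuple[str, object]] = []  # short-term memory: one entry per tool result
    links = collect_links(topic)
    memory.append(("CollectLinks", links))
    summaries = browse_and_summarize(links)
    memory.append(("WebBrowseAndSummarize", summaries))
    report = conduct_research(topic, summaries)
    memory.append(("ConductResearch", report))
    return report, memory

report, memory = run_researcher("Tesla FSD vs Huawei ADS")
print(report)
```

Each tool only reads what the previous one stored, which is why the short-term memory is enough to carry the whole research run.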

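Example of the Researcher Agent's reasoning steps under the ReAct framework:

Thought 1: Current knowledge is not enough to answer the question; answering it requires knowing what "Tesla FSD" and "Huawei ADS" are
Action 1: Use the search tool to look up material on "Tesla FSD" and "Huawei ADS"
Observation 1: Summarize the content returned by Action 1

Thought 2: From Action 1 and Observation 1, this is a comparison of the solutions of two autonomous-driving providers; with the information at hand, the report can now be generated
Action 2: Use the report tool to generate the research report
Observation 2: Task complete

(1) Enter the research topic: investigate the two autonomous-driving systems Tesla FSD and Huawei ADS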

python3 -m metagpt.roles.researcher "特斯拉FSD vs 华为ADS"

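(2) The agent carries out the research
First, based on the topic just entered, the prompt yields several search queries, which are used to find web pages: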

["Tesla FSD vs Huawei ADS: A Comparative Analysis of Features and Capabilities", 
"How Do Tesla FSD and Huawei ADS Address Safety and Regulation Compliance?", 
"A Review of Tesla FSD's Semi-Autonomous Driving Experience Compared to Huawei ADS's Advertising Platform", 
"Exploring the Future Development and Potential Impact of Tesla FSD and Huawei ADS on the Autonomous Driving and Advertising Industries"]

(3) Analyze and rank the collected pages:

After reviewing the search results and considering the requirements for accuracy and relevance, I have filtered out the irrelevant results and ranked the remaining based on credibility and relevance. Here are the ranked results:

[1, 4, 2, 6, 3, 7, 0]

**Reasoning:**

- **Result 1** is from Counterpoint Research, which is a known and credible source for market analysis, directly comparing Huawei ADS 2.0 and Tesla Autopilot.
- **Result 4** is from a news article quoting a senior executive of Huawei, which provides a direct comparison and a statement on the rivalry between the two systems.
- **Result 2** is from a YouTube video, which may be less credible than research articles or news sources, but it directly addresses the comparison between Tesla FSD and Huawei ADS, making it relevant.
- **Result 6** is from TechInsights, which provides an in-depth analysis of Tesla's FSD features, performance, and limitations, which is relevant for a comparative analysis.
- **Result 3** is a news article explaining Tesla's FSD in the context of China, which is relevant to the topic but less specific about the comparison.
- **Result 7** is from InsideEVs, which discusses how Tesla's FSD has influenced the market in China, including the emergence of equivalent systems, which is relevant to the comparative analysis.
- **Result 0** is from MIT Technology Review, which is a credible source, but the snippet does not directly indicate a comparison between Tesla FSD and Huawei ADS, making it less relevant than the other sources.

I have omitted the other results that were not directly related to the comparison of Tesla FSD and Huawei ADS or were less credible/relevant.
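
The ranking arrives as free text with an index list embedded in it, so the caller has to extract that list and reorder the search results. A minimal sketch of that step follows, assuming the reply contains one bracketed list of integers; `parse_ranked_indices` and `reorder` are hypothetical helper names for illustration, not MetaGPT functions.

```python
import json
import re
from typing import List

def parse_ranked_indices(llm_reply: str, n_results: int) -> List[int]:
    """Extract the first bracketed list of integers and keep valid, unique indices."""
    match = re.search(r"\[[\d,\s]*\]", llm_reply)
    if match is None:
        return []
    try:
        indices = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []  # malformed list such as "[1,,2]"
    ranked: List[int] = []
    seen = set()
    for i in indices:
        if 0 <= i < n_results and i not in seen:
            seen.add(i)
            ranked.append(i)
    return ranked

def reorder(results: List[str], ranked: List[int]) -> List[str]:
    """Return the results in ranked order; results not listed are dropped."""
    return [results[i] for i in ranked]

reply = "Here are the ranked results:\n\n[1, 4, 2, 6, 3, 7, 0]\n\n**Reasoning:** ..."
results = [f"result {i}" for i in range(8)]
print(reorder(results, parse_ranked_indices(reply, len(results))))
```

Validating the indices matters because the model may omit results (index 5 here) or, in principle, return numbers outside the list.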

(4) Function calls:
Tools: CollectLinks decomposes the question, searches, and returns URLs; WebBrowseAndSummarize browses pages and summarizes their content; ConductResearch generates the research report.

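An example of the browse-and-summarize output:

### Tesla FSD vs Huawei ADS: A Comparative Analysis of Features and Capabilities

Tesla's Full Self-Driving (FSD) and Huawei's Autonomous Driving Solution (ADS) are both advanced technologies in today's autonomous-driving field. The following comparison is based on the information provided:

**Tesla FSD:**
- Tesla's FSD aims to deliver a fully autonomous driving experience; its features include automatic navigation, automatic lane changes, automatic parking, and autonomous driving in traffic.
- FSD perceives its surroundings with a set of sensors, including cameras, radar, and ultrasonic sensors.
- According to the information provided, Tesla's FSD has been analyzed in depth across a variety of driving scenarios, covering its sensing technology, real-world driving behavior, and other aspects.
- Tesla's FSD is continuously updated over the air (OTA), steadily improving its performance and features.

**Huawei ADS:**
- Huawei's ADS is Huawei's key technology in autonomous driving; it emphasizes an "end-to-end" autonomous-driving solution that covers the whole process from perception and decision-making to execution.
- Huawei ADS relies on powerful AI processors and a sensor system, including lidar, cameras, and millimeter-wave radar, to achieve high-precision perception and decision-making.
- Huawei ADS has demonstrated its capabilities in several real-road tests and pilot projects, particularly in complex traffic and on urban roads.

**Comparison:**
- **Technical maturity:** Tesla FSD has been deployed on a large number of vehicles and has collected a large amount of data from real-world use, which helps the technology iterate and mature. Huawei ADS is newer, but Huawei's deep experience in communications and AI gives it strong technical backing.
- **Sensor configuration:** Tesla relies on a more economical sensor configuration, while Huawei uses more advanced sensors such as lidar, which in theory can provide higher perception accuracy.
- **Scope of application:** Tesla FSD currently targets mainly the passenger-car market, while Huawei ADS aims to cover passenger cars, commercial vehicles, smart transportation, and other areas.

Note that this analysis is based on summaries of the information provided; specific technical details and performance comparisons require further professional evaluation and real test data. Autonomous driving is developing quickly and both companies' technologies keep advancing, so the comparison will change accordingly.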

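III. Source Code Analysis

The Researcher class inherits from the Role parent class. The flowchart of what the Role class does when Role.run executes:
[Figure: flowchart of Role.run]
The most important part is React, which contains a loop that alternates between _think and _act, so the LLM first thinks and then acts. _think decides which action the LLM executes next and stores it in self._rc.todo; _act then executes the action held in self._rc.todo. The action object is placed into todo through _set_state.

_think assembles role information and action information into a prompt and passes it to the LLM.
In short, _think asks the LLM for a number that identifies the action to execute: it is an index into the self._actions list.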
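
The "ask for a number, use it as an index" idea can be sketched as follows. This is an illustration under assumptions, not the framework's code: `parse_action_index` and `select_todo` are hypothetical helpers, and treating a negative, out-of-range, or unparsable reply as "nothing left to do" is a choice made for this sketch.

```python
import re
from typing import Optional, Sequence

def parse_action_index(llm_reply: str, n_actions: int) -> Optional[int]:
    """Return the first integer in the reply if it is a valid action index."""
    match = re.search(r"-?\d+", llm_reply)
    if match is None:
        return None
    index = int(match.group(0))
    return index if 0 <= index < n_actions else None

def select_todo(llm_reply: str, actions: Sequence[str]) -> Optional[str]:
    """Map the LLM reply to the action to run next, or None to stop."""
    index = parse_action_index(llm_reply, len(actions))
    return None if index is None else actions[index]

actions = ["CollectLinks", "WebBrowseAndSummarize", "ConductResearch"]
print(select_todo("1", actions))
print(select_todo("no idea", actions))
```

Checking the range before indexing keeps a malformed reply from raising an IndexError or silently selecting the wrong action.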

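1. The Role._react function

The code of Role._react is as follows:

# What does it do?
_react relies on two important functions, _think and _act, which stand for thinking and acting. They run alternately:
	_think -> _act -> _think -> _act -> ... 
1. It tracks the number of actions executed: each call to _act increments actions_taken by 1.
2. It calls _think and _act in a loop until the maximum number of loop iterations is reached.
	When thinking leaves no to-do item, it does not act and the loop ends.
3. It returns the output of the last action as the result.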

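async def _react(self) -> Message:
    '''
    Think first, then act, until the role decides it is time to stop and nothing more needs doing.
    This is the standard think-act loop from the ReAct paper, which alternates thinking and acting
    while solving a task, i.e. _think -> _act -> _think -> _act -> ... 
    The LLM dynamically selects the action in _think.
    '''

    # Tracks the number of actions executed so far
    actions_taken = 0
    rsp = Message("No actions taken yet")  # overwritten after the role's first _act

    # Keep thinking and acting until the maximum number of loops is reached
    while actions_taken < self._rc.max_react_loop:

        # Think
        await self._think()

        # No to-do item: do not act, leave the loop
        if self._rc.todo is None:
            break

        # Act
        logger.debug(f"{self._setting}: {self._rc.state=}, will do {self._rc.todo}")
        rsp = await self._act()

        # Count the action
        actions_taken += 1

    # Return the output of the last action
    return rsp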
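
To see the two exit conditions of this loop in isolation, here is a toy version that runs without MetaGPT or an LLM. `ToyRole` is a hypothetical class written for this sketch: its "thinking" just pops the next entry from a scripted plan, and its messages are plain strings.

```python
import asyncio
from typing import List, Optional

class ToyRole:
    """Hypothetical stand-in for a Role; thinking is scripted instead of asked of an LLM."""

    def __init__(self, plan: List[str], max_react_loop: int):
        self.plan = list(plan)
        self.max_react_loop = max_react_loop
        self.todo: Optional[str] = None
        self.log: List[str] = []

    async def _think(self) -> None:
        # Pick the next scripted action, or None when the plan is exhausted.
        self.todo = self.plan.pop(0) if self.plan else None

    async def _act(self) -> str:
        # Record the action and return its (fake) output.
        self.log.append(str(self.todo))
        return f"done: {self.todo}"

    async def _react(self) -> str:
        actions_taken = 0
        rsp = "No actions taken yet"
        while actions_taken < self.max_react_loop:
            await self._think()
            if self.todo is None:  # nothing left to do: stop early
                break
            rsp = await self._act()
            actions_taken += 1
        return rsp  # output of the last action

plan = ["CollectLinks", "WebBrowseAndSummarize", "ConductResearch"]
print(asyncio.run(ToyRole(plan, max_react_loop=5)._react()))  # plan runs out first
print(asyncio.run(ToyRole(plan, max_react_loop=2)._react()))  # loop limit reached first
```

With a limit of 5 the loop ends because thinking yields no to-do item; with a limit of 2 it ends on the loop counter, before the report step is reached.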

2. The Researcher._act function

async def _act(self) -> Message:
    logger.info(f"{self._setting}: to do {self.rc.todo}({self.rc.todo.name})")
    todo = self.rc.todo
    msg = self.rc.memory.get(k=1)[0]
    if isinstance(msg.instruct_content, Report):
        instruct_content = msg.instruct_content
        topic = instruct_content.topic
    else:
        topic = msg.content

    research_system_text = self.research_system_text(topic, todo)
    # 1. Collect links from the search engine
    if isinstance(todo, CollectLinks):
        links = await todo.run(topic, 4, 4)
        ret = Message(
            content="", instruct_content=Report(topic=topic, links=links), role=self.profile, cause_by=todo
        )
    # 2. Browse the pages and generate summaries
    elif isinstance(todo, WebBrowseAndSummarize):
        links = instruct_content.links
        todos = (
            todo.run(*url, query=query, system_text=research_system_text) for (query, url) in links.items() if url
        )
        if self.enable_concurrency:
            summaries = await asyncio.gather(*todos)
        else:
            summaries = [await i for i in todos]
        summaries = list((url, summary) for i in summaries for (url, summary) in i.items() if summary)
        ret = Message(
            content="", instruct_content=Report(topic=topic, summaries=summaries), role=self.profile, cause_by=todo
        )
    else:  # 3. Conduct the research and generate the report
        summaries = instruct_content.summaries
        summary_text = "\n---\n".join(f"url: {url}\nsummary: {summary}" for (url, summary) in summaries)
        content = await self.rc.todo.run(topic, summary_text, system_text=research_system_text)
        ret = Message(
            content="",
            instruct_content=Report(topic=topic, content=content),
            role=self.profile,
            cause_by=self.rc.todo,
        )
    # Short-term memory (append the tool's result)
    self.rc.memory.add(ret)
    return ret
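
The middle branch of _act does three things that are easy to miss when reading it: it starts one summarization job per query, optionally runs the jobs concurrently, and flattens the per-query dictionaries into (url, summary) pairs while dropping empty summaries. The sketch below reproduces only that data flow. `fake_summarize` is a hypothetical stand-in for WebBrowseAndSummarize.run that returns canned text; the separator string is the one used in the code above.

```python
import asyncio
from typing import Dict, List, Tuple

async def fake_summarize(*urls: str, query: str) -> Dict[str, str]:
    """Hypothetical stand-in for the real tool: one summary per URL, empty if unusable."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {
        url: ("" if "broken" in url else f"notes on '{query}' from {url}")
        for url in urls
    }

async def summarize_all(links: Dict[str, List[str]], concurrent: bool) -> List[Tuple[str, str]]:
    # One job per query that actually has URLs.
    todos = [fake_summarize(*urls, query=query) for query, urls in links.items() if urls]
    if concurrent:
        results = await asyncio.gather(*todos)  # results keep the order of todos
    else:
        results = [await todo for todo in todos]
    # Flatten the per-query dicts and drop empty summaries.
    return [(url, s) for result in results for url, s in result.items() if s]

def to_summary_text(summaries: List[Tuple[str, str]]) -> str:
    return "\n---\n".join(f"url: {url}\nsummary: {summary}" for url, summary in summaries)

links = {
    "q1": ["https://a.example", "https://broken.example"],
    "q2": [],
    "q3": ["https://b.example"],
}
print(to_summary_text(asyncio.run(summarize_all(links, concurrent=True))))
```

Because asyncio.gather returns results in the order the jobs were passed in, the concurrent and sequential paths produce the same list; only the elapsed time differs when real network calls are involved.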