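14.4 Deep Reasoning Agent (Deep Research)
Chapter goals:
- Understand the principles and advantages of Deep Research
- Learn how to manage long reasoning chains
- Implement a self-verification mechanism
- Build a complete deep reasoning Agent
What is Deep Research?
Deep Research is a reasoning mode similar to that of the OpenAI o1 model. It achieves deep reasoning through the following features:
- Long reasoning chains: 10+ rounds of step-by-step reasoning
- Self-verification: every step is strictly verified
- Active supplementation: missing information is actively identified and retrieved
- Explainability: the complete reasoning chain is output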
# Comparing different Agent modes
# ReAct: fast, but not deep enough
ReAct: Thought → Action → Observation (1-3 rounds)
# Self-Reflection: includes reflection, but with limited depth
Reflection: Execute → Reflect → Improve (2-3 rounds)
# Deep Research: deep thinking
Deep Research: Reason → Verify → Retrieve → Reason (10+ rounds)
1. Core data structures
First, define the data structures for reasoning steps and verification results.
from typing import List, Dict, Any, Optional
import json
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ReasoningStep:
    """
    A single step in the reasoning chain.

    Attributes:
        step_id: Step number
        content: Reasoning content
        evidence: Supporting evidence
        confidence: Confidence (0-1)
        questions: Open questions raised by this step
        dependencies: IDs of the earlier steps this step depends on
        timestamp: Creation time as an ISO 8601 string
    """
    step_id: int
    content: str
    evidence: List[str] = field(default_factory=list)
    confidence: float = 0.5
    questions: List[str] = field(default_factory=list)
    dependencies: List[int] = field(default_factory=list)
    timestamp: str = field(default_factory=lambda: datetime.now().isoformat())

    def to_dict(self) -> Dict:
        """Convert the step to a dictionary."""
        return {
            "step_id": self.step_id,
            "content": self.content,
            "evidence": self.evidence,
            "confidence": self.confidence,
            "questions": self.questions,
            "dependencies": self.dependencies,
            "timestamp": self.timestamp
        }


# Example: create a reasoning step
step1 = ReasoningStep(
    step_id=1,
    content="First, I need to understand the core of the problem...",
    evidence=["According to the problem description...", "From the known conditions..."],
    confidence=0.7,
    questions=["Are there any other hidden conditions?"]
)
print("Reasoning step:")
print(f"  ID: {step1.step_id}")
print(f"  Content: {step1.content}")
print(f"  Confidence: {step1.confidence:.2%}")
print(f"  Questions: {step1.questions}")
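The hand-written `to_dict` above can also be expressed with the standard library's `dataclasses.asdict`, which makes a JSON round trip short. The following is a minimal sketch under stated assumptions: the class is a trimmed copy of `ReasoningStep` (the `timestamp` field is left out so the block runs on its own), and `step_to_json` / `step_from_json` are hypothetical helper names, not part of the chapter's code.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ReasoningStep:
    # Trimmed copy of the class above (timestamp omitted) so the sketch runs on its own.
    step_id: int
    content: str
    evidence: List[str] = field(default_factory=list)
    confidence: float = 0.5
    questions: List[str] = field(default_factory=list)
    dependencies: List[int] = field(default_factory=list)


def step_to_json(step: ReasoningStep) -> str:
    """Serialize a step; asdict() walks the dataclass fields recursively."""
    return json.dumps(asdict(step), ensure_ascii=False)


def step_from_json(text: str) -> ReasoningStep:
    """Rebuild a step from the JSON produced by step_to_json()."""
    return ReasoningStep(**json.loads(text))


original = ReasoningStep(step_id=2, content="Apply Euler's formula",
                         confidence=0.8, dependencies=[1])
restored = step_from_json(step_to_json(original))
print(restored == original)  # dataclasses compare field by field
```

Persisting steps this way is one option for saving a long reasoning chain between runs; whether that is needed depends on how the Agent is deployed.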
2. Verification result data structure
from dataclasses import dataclass, field
from typing import List


@dataclass
class VerificationResult:
    """
    Result of verifying a reasoning step.

    Attributes:
        is_valid: Whether the step passed verification
        confidence: Overall confidence
        issues: Problems found
        missing_info: Missing information
        improvements: Suggested improvements
    """
    is_valid: bool
    confidence: float
    issues: List[str] = field(default_factory=list)
    missing_info: List[str] = field(default_factory=list)
    improvements: List[str] = field(default_factory=list)


# Example: create a verification result
verification = VerificationResult(
    is_valid=True,
    confidence=0.75,
    issues=["The logic jumps slightly"],
    missing_info=["More supporting data is needed"],
    improvements=["A verification step could be added"]
)
print("Verification result:")
print(f"  Passed: {verification.is_valid}")
print(f"  Confidence: {verification.confidence:.2%}")
print(f"  Issues: {verification.issues}")
print(f"  Missing info: {verification.missing_info}")
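In a real verifier the fields of `VerificationResult` would come from a model reply rather than from literals. The sketch below shows one way to do that, assuming the model has been asked to answer with a JSON object whose keys match the dataclass fields. That reply format and the helper name `parse_verification` are assumptions made for illustration, not something the chapter prescribes.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class VerificationResult:
    # Same fields as the class above, repeated so the sketch is self-contained.
    is_valid: bool
    confidence: float
    issues: List[str] = field(default_factory=list)
    missing_info: List[str] = field(default_factory=list)
    improvements: List[str] = field(default_factory=list)


def parse_verification(raw: str) -> VerificationResult:
    """Turn a raw model reply into a VerificationResult (hypothetical helper)."""
    try:
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        confidence = float(data.get("confidence", 0.0))
    except (ValueError, TypeError):
        # Unparseable reply: treat the step as unverified rather than crash.
        return VerificationResult(
            is_valid=False,
            confidence=0.0,
            issues=["verifier reply could not be parsed"],
        )
    return VerificationResult(
        is_valid=bool(data.get("is_valid", False)),
        confidence=min(1.0, max(0.0, confidence)),  # clamp to [0, 1]
        issues=[str(x) for x in data.get("issues") or []],
        missing_info=[str(x) for x in data.get("missing_info") or []],
        improvements=[str(x) for x in data.get("improvements") or []],
    )


reply = '{"is_valid": true, "confidence": 1.4, "missing_info": ["source for claim"]}'
result = parse_verification(reply)
print(result.is_valid, result.confidence, result.missing_info)
```

Failing closed (an unparseable reply counts as "not verified") keeps a malformed answer from being mistaken for a passed check.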
3. Deep Research Agent core implementation
Implement the complete deep reasoning Agent class.
class DeepResearchAgent:
    """
    Deep Research Agent

    Features:
    1. Long reasoning chain (10+ rounds)
    2. Self-verification at every step
    3. Active supplementary retrieval
    4. Reasoning chain visualization
    5. Self-questioning from multiple angles

    Args:
        llm: LLM instance
        retriever: Retriever instance
        max_rounds: Maximum number of reasoning rounds (default 10)
        confidence_threshold: Confidence threshold (default 0.85)
    """

    def __init__(self,
                 llm,
                 retriever=None,
                 max_rounds: int = 10,
                 confidence_threshold: float = 0.85):
        self.llm = llm
        self.retriever = retriever
        self.max_rounds = max_rounds
        self.confidence_threshold = confidence_threshold
        # Reasoning chain storage
        self.reasoning_chain: List[ReasoningStep] = []
        # Statistics
        self.stats = {
            "total_rounds": 0,
            "total_retrievals": 0,
            "total_verifications": 0,
            "confidence_history": []
        }

    def reason(self, task: str) -> Dict[str, Any]:
        """
        Main deep reasoning method.

        Args:
            task: The reasoning task

        Returns:
            {
                'answer': str,            # Final answer
                'reasoning_chain': List,  # Complete reasoning chain
                'confidence': float,      # Final confidence
                'stats': Dict             # Statistics
            }
        """
        print("=" * 80)
        print("Deep Research Agent - deep reasoning mode")
        print("=" * 80)
        print(f"\nTask: {task}")
        print(f"Max rounds: {self.max_rounds}")
        print(f"Confidence threshold: {self.confidence_threshold}\n")

        # Initialization: reset per-call state so repeated calls do not accumulate
        self.reasoning_chain = []
        self.stats = {
            "total_rounds": 0,
            "total_retrievals": 0,
            "total_verifications": 0,
            "confidence_history": []
        }
        context: List[str] = []
        current_confidence = 0.0

        # Reasoning loop
        for round_num in range(1, self.max_rounds + 1):
            print(f"\n{'=' * 80}")
            print(f"Reasoning round {round_num}/{self.max_rounds}")
            print(f"{'=' * 80}")

            # Step 1: generate a reasoning step
            step = self._generate_reasoning_step(task, context, round_num)
            self.reasoning_chain.append(step)
            print(f"\nReasoning step {step.step_id}:")
            print(f"  Content: {step.content[:100]}...")
            print(f"  Confidence: {step.confidence:.2%}")

            # Step 2: self-verification
            print("\n>>> Verifying reasoning step...")
            verification = self._verify_reasoning_step(step, context)
            self.stats['total_verifications'] += 1
            print(f"  Result: {'✓ passed' if verification.is_valid else '✗ failed'}")
            print(f"  Overall confidence: {verification.confidence:.2%}")
            if verification.issues:
                print(f"  Issues found: {len(verification.issues)}")

            # Step 3: check whether supplementary information is needed
            if verification.missing_info:
                print(f"\n>>> Missing information detected: {len(verification.missing_info)} item(s)")
                # Supplementary retrieval (only when a retriever is configured)
                if self.retriever:
                    print("\n>>> Running supplementary retrieval...")
                    new_context = self._retrieve_missing_info(
                        verification.missing_info
                    )
                    context.extend(new_context)
                    self.stats['total_retrievals'] += 1
                    print(f"  New context entries: {len(new_context)}")

            # Update confidence
            current_confidence = verification.confidence
            self.stats['confidence_history'].append(current_confidence)

            # Step 4: check whether reasoning can stop
            if self._should_stop(verification, round_num):
                print("\n✓ Reasoning complete!")
                print(f"  Final confidence: {current_confidence:.2%}")
                print(f"  Total rounds: {round_num}")
                break

        # Generate the final answer
        print(f"\n{'=' * 80}")
        print("Generating final answer...")
        print(f"{'=' * 80}\n")
        final_answer = self._generate_final_answer(task, self.reasoning_chain)
        # One step is appended per round, so this also works when max_rounds < 1
        self.stats['total_rounds'] = len(self.reasoning_chain)
        return {
            'answer': final_answer,
            'reasoning_chain': [step.to_dict() for step in self.reasoning_chain],
            'confidence': current_confidence,
            'stats': self.stats
        }

    # Helper methods (simplified placeholders)

    def _generate_reasoning_step(self, task: str, context: List[str], round_num: int) -> ReasoningStep:
        """Generate a reasoning step (simplified version)."""
        # A real implementation would call the LLM here
        return ReasoningStep(
            step_id=len(self.reasoning_chain) + 1,
            content=f"Round {round_num}: analyzing the key points of the task...",
            evidence=["Based on the available information..."],
            confidence=0.7,
            questions=["Is more evidence needed?"]
        )

    def _verify_reasoning_step(self, step: ReasoningStep, context: List[str]) -> VerificationResult:
        """Verify a reasoning step (simplified version)."""
        # A real implementation would call the LLM for strict verification
        return VerificationResult(
            is_valid=step.confidence > 0.5,
            confidence=step.confidence,
            issues=["The logic is somewhat loose"] if step.confidence < 0.8 else [],
            missing_info=["More data is needed"] if step.confidence < 0.7 else []
        )

    def _retrieve_missing_info(self, missing_info: List[str]) -> List[str]:
        """Retrieve supplementary context for the missing information."""
        # A real implementation would call self.retriever here
        return [f"Information about '{info}'..." for info in missing_info[:2]]

    def _should_stop(self, verification: VerificationResult, round_num: int) -> bool:
        """Decide whether reasoning should stop."""
        # Rule 1: confidence has reached the threshold
        if verification.confidence >= self.confidence_threshold:
            return True
        # Rule 2: the step is valid and nothing is missing.
        # Note: with the simplified helpers above (confidence fixed at 0.7),
        # this rule already holds in round 1, so the loop ends after one round.
        if verification.is_valid and not verification.missing_info:
            return True
        # Rule 3: at least three rounds have run and no issues remain
        if round_num >= 3 and len(verification.issues) == 0:
            return True
        return False

    def _generate_final_answer(self, task: str, reasoning_chain: List[ReasoningStep]) -> str:
        """Generate the final answer."""
        chain_summary = "\n".join([f"Step {s.step_id}: {s.content}" for s in reasoning_chain])
        return f"Based on deep reasoning, the answer to the task '{task}' is as follows:\n\n{chain_summary}"

    def visualize_reasoning_chain(self) -> str:
        """Render the reasoning chain as Markdown text."""
        if not self.reasoning_chain:
            return "No reasoning chain yet"
        lines = ["# Deep reasoning chain\n"]
        for step in self.reasoning_chain:
            lines.append(f"## Step {step.step_id}")
            lines.append(f"**Content**: {step.content}")
            lines.append(f"**Confidence**: {step.confidence:.2%}")
            if step.evidence:
                lines.append(f"**Evidence**: {', '.join(step.evidence)}")
            lines.append("")
        return "\n".join(lines)


print("DeepResearchAgent class defined!")
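The three rules in `_should_stop` determine how long the chain actually gets, so it helps to look at them in isolation. The sketch below copies those rules into a standalone function and feeds it the values the simplified verifier produces for a step with confidence 0.7. It then contrasts them with `should_stop_strict`, a hypothetical stricter variant (the `min_rounds` parameter is an assumption, not part of the class above) that refuses to stop before a minimum number of rounds.

```python
from typing import List


def should_stop(confidence: float, is_valid: bool, missing_info: List[str],
                issues: List[str], round_num: int, threshold: float = 0.85) -> bool:
    """Standalone copy of the three rules in _should_stop."""
    if confidence >= threshold:
        return True
    if is_valid and not missing_info:
        return True
    if round_num >= 3 and len(issues) == 0:
        return True
    return False


def should_stop_strict(confidence: float, is_valid: bool, missing_info: List[str],
                       issues: List[str], round_num: int, threshold: float = 0.85,
                       min_rounds: int = 3) -> bool:
    """Hypothetical variant: never stop before min_rounds; require a clean check."""
    if round_num < min_rounds:
        return False
    if confidence >= threshold:
        return True
    return is_valid and not missing_info and not issues


# Values the simplified verifier produces for a step with confidence 0.7
mock = dict(confidence=0.7, is_valid=True, missing_info=[],
            issues=["The logic is somewhat loose"])
print(should_stop(round_num=1, **mock))         # True: rule 2 fires in round 1
print(should_stop_strict(round_num=1, **mock))  # False: below min_rounds
print(should_stop_strict(round_num=3, **mock))  # False: an issue is still open
```

With the placeholder helpers, the stricter variant would run until `max_rounds`, because the mock confidence never changes. Once real LLM calls replace the placeholders, the trade-off becomes cost versus depth.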
4. Usage example
Create and use the Deep Research Agent.
# Mock LLM
class MockLLM:
    def predict(self, prompt):
        return "Mock LLM response"


# Create the Deep Research Agent
llm = MockLLM()
agent = DeepResearchAgent(
    llm=llm,
    retriever=None,  # Simplified example: no retriever
    max_rounds=5,
    confidence_threshold=0.85
)
print("Agent created successfully!")
print(f"  Max rounds: {agent.max_rounds}")
print(f"  Confidence threshold: {agent.confidence_threshold}")
# Run a deep reasoning task
task = "Explain why e^(iπ) + 1 = 0"
result = agent.reason(task)
print("\n" + "=" * 80)
print("Final answer:")
print("=" * 80)
print(result['answer'])
# Inspect the reasoning chain
print("\n" + "=" * 80)
print("Reasoning chain visualization:")
print("=" * 80)
print(agent.visualize_reasoning_chain())
# Inspect the statistics
print("\n" + "=" * 80)
print("Statistics:")
print("=" * 80)
import json
print(json.dumps(result['stats'], indent=2, ensure_ascii=False))
5. Comparing different Agent modes
Let's compare how the different Agent modes differ.
import pandas as pd

# Build the comparison table
comparison_data = {
    "Mode": ["ReAct", "Plan-Execute", "Self-Reflection", "Deep Research"],
    "Reasoning depth": ["⭐⭐", "⭐⭐⭐", "⭐⭐⭐", "⭐⭐⭐⭐⭐"],
    "Verification strength": ["⭐", "⭐⭐", "⭐⭐⭐", "⭐⭐⭐⭐⭐"],
    "Use case": ["Quick Q&A", "Task execution", "Quality improvement", "Complex reasoning"],
    "Cost": ["Low", "Medium", "Medium", "High"],
    "Typical rounds": ["1-3", "3-5", "2-3", "10+"]
}
df = pd.DataFrame(comparison_data)
print("\nAgent mode comparison:")
print("=" * 80)
print(df.to_string(index=False))
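6. Hands-on exercises
Exercise 1: Implement full LLM integration
The examples above use MockLLM. Try the following:
- Integrate a real LLM (such as OpenAI GPT-4)
- Implement full reasoning step generation
- Implement a strict verification mechanism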
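As a possible starting point for this exercise, the sketch below generates one reasoning step from a model reply. It relies only on the `predict(prompt)` method that `MockLLM` exposes, so any real client would first need to be wrapped in an object with that method. The prompt wording, the JSON reply format, the `generate_step` helper and the `FakeJsonLLM` stand-in are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReasoningStep:
    # Trimmed copy of the chapter's class so the sketch runs on its own.
    step_id: int
    content: str
    evidence: List[str] = field(default_factory=list)
    confidence: float = 0.5
    questions: List[str] = field(default_factory=list)


PROMPT_TEMPLATE = (
    "Task: {task}\n"
    "Known context:\n{context}\n"
    "Write the next reasoning step as a JSON object with the keys "
    '"content", "evidence", "confidence" (0-1) and "questions".'
)


def generate_step(llm, task: str, context: List[str], step_id: int) -> ReasoningStep:
    """Ask the model for one step; fall back to the raw text if it is not JSON."""
    prompt = PROMPT_TEMPLATE.format(task=task, context="\n".join(context) or "(none)")
    reply = llm.predict(prompt)
    try:
        data = json.loads(reply)
        return ReasoningStep(
            step_id=step_id,
            content=str(data["content"]),
            evidence=list(data.get("evidence") or []),
            confidence=float(data.get("confidence", 0.5)),
            questions=list(data.get("questions") or []),
        )
    except (ValueError, KeyError, TypeError, AttributeError):
        # Free-text reply: keep it, but mark it as low confidence.
        return ReasoningStep(step_id=step_id, content=reply, confidence=0.3)


class FakeJsonLLM:
    """Stand-in with the same predict(prompt) method as MockLLM; returns canned JSON."""

    def predict(self, prompt: str) -> str:
        return json.dumps({"content": "Expand e^(ix) as cos x + i sin x",
                           "evidence": ["Euler's formula"], "confidence": 0.9})


step = generate_step(FakeJsonLLM(), "Explain why e^(iπ) + 1 = 0", [], step_id=1)
print(step.content, step.confidence)
```

The fallback branch matters in practice because models do not always honor a requested output format. Treating such replies as low-confidence steps lets the verification stage decide what to do with them.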
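Exercise 2: Add a retriever
- Integrate a vector retriever
- Implement intelligent supplementary retrieval
- Make better use of the retrieved results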
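Before wiring in a real vector store, the retrieval flow can be prototyped with a toy retriever. The sketch below is not an embedding index: bag-of-words counts stand in for vectors and cosine similarity ranks the documents. The class name `ToyVectorRetriever`, its `search(query, top_k)` method and the `retrieve_missing_info` function are hypothetical, since the chapter does not fix a retriever interface.

```python
import math
import re
from collections import Counter
from typing import List


class ToyVectorRetriever:
    """Bag-of-words cosine-similarity retriever (stand-in for an embedding index)."""

    def __init__(self, documents: List[str]):
        self.documents = documents
        self.vectors = [self._embed(doc) for doc in documents]

    @staticmethod
    def _embed(text: str) -> Counter:
        # Word counts play the role of an embedding vector here.
        return Counter(re.findall(r"\w+", text.lower()))

    @staticmethod
    def _cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0

    def search(self, query: str, top_k: int = 2) -> List[str]:
        """Return up to top_k documents with non-zero similarity, best first."""
        q = self._embed(query)
        scored = [(self._cosine(q, v), doc) for v, doc in zip(self.vectors, self.documents)]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [doc for _, doc in scored[:top_k]]


def retrieve_missing_info(retriever, missing_info: List[str]) -> List[str]:
    """Query once per missing item and drop duplicate hits, preserving order."""
    hits: List[str] = []
    for item in missing_info:
        for doc in retriever.search(item):
            if doc not in hits:
                hits.append(doc)
    return hits


docs = [
    "Euler's formula states that e^(ix) = cos x + i sin x.",
    "The cosine of pi is -1 and the sine of pi is 0.",
    "Pandas DataFrames store tabular data.",
]
retriever = ToyVectorRetriever(docs)
found = retrieve_missing_info(retriever, ["value of cosine at pi", "Euler's formula"])
print(found)
```

Deduplicating hits keeps the context list from growing with repeated passages when several missing items point at the same document.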
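Exercise 3: Improve reasoning chain visualization
- Add a graphical view of the reasoning chain
- Show the dependencies between steps
- Highlight key reasoning steps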
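One lightweight way to approach this exercise is to emit a text-based graph description and let an existing renderer draw it. The sketch below turns step dictionaries (the shape returned by `ReasoningStep.to_dict`) into Mermaid flowchart text, drawing an edge for every entry in `dependencies` and coloring steps whose confidence reaches a cutoff. Choosing Mermaid, the function name `chain_to_mermaid` and the `highlight_above` cutoff are assumptions made for illustration.

```python
from typing import Dict, List


def chain_to_mermaid(steps: List[Dict], highlight_above: float = 0.8) -> str:
    """Render step dicts as Mermaid flowchart text (hypothetical helper)."""
    lines = ["graph TD"]
    for step in steps:
        sid = step["step_id"]
        # Double quotes inside a Mermaid label would end it early, so swap them out.
        label = step["content"][:40].replace('"', "'")
        lines.append(f'    S{sid}["{sid}. {label} ({step["confidence"]:.2f})"]')
        # Highlight steps at or above the confidence cutoff.
        if step["confidence"] >= highlight_above:
            lines.append(f"    style S{sid} fill:#ffd966")
        # One edge per declared dependency: earlier step --> this step.
        for dep in step.get("dependencies", []):
            lines.append(f"    S{dep} --> S{sid}")
    return "\n".join(lines)


steps = [
    {"step_id": 1, "content": "State Euler's formula", "confidence": 0.9, "dependencies": []},
    {"step_id": 2, "content": "Substitute x = pi", "confidence": 0.7, "dependencies": [1]},
]
diagram = chain_to_mermaid(steps)
print(diagram)
```

The resulting text can be pasted into any Mermaid-capable viewer. Note that the simplified Agent above never fills in `dependencies`, so a real step generator would have to populate that field before edges appear.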
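Summary
This section covered the core concepts of the Deep Research Agent:
- Long reasoning chains: deep thinking through 10+ rounds of step-by-step reasoning
- Self-verification: strict logical verification at every step
- Active supplementation: actively identifying and filling in missing information
- Explainability: outputting the complete reasoning chain for review
Next steps:
- Apply Deep Research to real problems
- Improve reasoning quality and efficiency
- Combine it with other Agent modes to build hybrid systems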