## Introduction
Deep research AI agents represent a new frontier in AI: systems capable of autonomously conducting comprehensive research on a given topic. Unlike simple search tools, these agents can plan research strategies, execute multi-step investigations, synthesize findings, and produce polished reports. This guide covers everything from understanding how these systems work to building your own research automation.
## Understanding Deep Research Agents

### What is a Deep Research Agent?
A deep research agent is an AI system designed to conduct thorough investigations on complex topics. Unlike traditional search engines or chatbots, deep research agents can:
- Decompose complex questions into researchable sub-questions
- Execute multi-stage research plans
- Evaluate source credibility and synthesize conflicting information
- Produce comprehensive, well-cited reports
- Self-correct when initial research paths fail
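As a toy illustration of the first capability, a decomposition step maps one broad question onto several researchable sub-questions. The templates below are hypothetical stand-ins; a real agent would generate sub-questions with an LLM:

```python
def decompose(query: str) -> list[str]:
    """Expand one broad question into researchable sub-questions."""
    # Hypothetical fixed templates; an LLM would produce these dynamically.
    templates = [
        "What is the background and context of {q}?",
        "What are the latest developments in {q}?",
        "What are the main challenges and limitations of {q}?",
    ]
    return [t.format(q=query) for t in templates]

subs = decompose("quantum error correction")
```

Each sub-question then becomes an independently searchable task in the pipeline below.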
```mermaid
graph TB
    subgraph "Deep Research Pipeline"
        Input[Research Query]
        Plan[Research Planning]
        Search[Multi-Source Search]
        Eval[Source Evaluation]
        Synth[Synthesis & Analysis]
        Report[Report Generation]
        Input --> Plan
        Plan --> Search
        Search --> Eval
        Eval --> Synth
        Synth --> Report
        Search -.->|new findings| Plan
        Eval -.->|re-evaluate| Search
    end
```
### How Deep Research Differs from Regular Search
| Aspect | Regular Search | Deep Research Agent |
|---|---|---|
| Query Understanding | Keyword matching | Intent analysis + decomposition |
| Search Results | Single round | Iterative, multi-round |
| Source Quality | User evaluates | Agent evaluates credibility |
| Synthesis | Manual | Automatic synthesis |
| Output | Link list | Comprehensive report |
| Time | Seconds | Minutes to hours |
| Depth | Surface level | Deep, multi-faceted |
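The "iterative, multi-round" row is the key structural difference, and can be sketched as a loop where each round's results refine the next query. `search` and `refine` are placeholder callables, not a real API:

```python
def iterative_search(query, search, refine, rounds=3):
    """Multi-round search: each round's results refine the next query."""
    results, current = [], query
    for _ in range(rounds):
        batch = search(current)     # one round of retrieval
        results.extend(batch)
        current = refine(current, batch)  # rewrite the query from findings
    return results
```

A regular search engine runs only the first iteration of this loop; the agent keeps going until the rounds are exhausted or the query stops changing.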
### Leading Deep Research Systems
| System | Developer | Key Features | Best For |
|---|---|---|---|
| Perplexity Deep Research | Perplexity AI | Real-time sources, cited answers | General research |
| Manus | Monica AI | Autonomous execution, file handling | Complex multi-step research |
| Claude Research | Anthropic | Strong reasoning, web browsing | Academic/research |
| Gemini Deep Research | Google | Google ecosystem, YouTube | Multimedia research |
| ChatGPT Deep Research | OpenAI | GPT-4o, structured reports | Comprehensive analysis |
| Grok Research | xAI | Real-time news, X/Twitter | Current events |
## Architecture Deep Dive

### Core Components

```python
# Deep Research Agent architecture sketch. Supporting types (ResearchConfig,
# ResearchTask, SearchResult, Finding, Source, CredibilityScore, Synthesis,
# ResearchReport) and helpers (create_llm, deduplicate_results, can_access_url,
# refine_research, format_sources, format_findings) are assumed elsewhere.
import json


class DeepResearchAgent:
    """Complete deep research agent implementation"""

    def __init__(self, config: ResearchConfig):
        self.config = config
        self.llm = create_llm(config.model)
        self.search_tools = config.search_tools
        self.web_browser = config.web_browser
        self.storage = config.storage
        self.original_query: str = ""
        self.research_plan: list[ResearchTask] = []
        self.findings: list[Finding] = []
        self.sources: list[Source] = []

    async def research(self, query: str, depth: str = "comprehensive") -> ResearchReport:
        """Execute deep research on a query"""
        print(f"Analyzing query: {query}")
        self.original_query = query
        self.research_plan = await self.create_research_plan(query, depth)
        print(f"Executing {len(self.research_plan)} research tasks...")
        for i, task in enumerate(self.research_plan):
            print(f"  Task {i + 1}/{len(self.research_plan)}: {task.description}")
            results = await self.execute_search(task)
            validated = await self.evaluate_sources(results)
            findings = await self.extract_findings(validated, task)
            self.findings.extend(findings)
            self.sources.extend(validated)
            await self.refine_research(task, findings)
        print("Synthesizing findings...")
        synthesis = await self.synthesize_findings()
        print("Generating report...")
        return await self.generate_report(synthesis)

    async def create_research_plan(self, query: str, depth: str) -> list[ResearchTask]:
        """Create research plan from query"""
        prompt = f"""
        Create a research plan for: "{query}"
        Depth level: {depth}
        Break down this research into specific tasks that cover:
        1. Background and context
        2. Current state and recent developments
        3. Key players and stakeholders
        4. Technical details (if applicable)
        5. Challenges and limitations
        6. Future outlook
        7. Practical applications
        Return as JSON array with description, search_terms, priority.
        """
        response = await self.llm.complete(prompt)
        tasks = json.loads(response)
        return [ResearchTask(**task) for task in tasks]

    async def execute_search(self, task: ResearchTask) -> list[SearchResult]:
        """Execute search for a research task"""
        results = []
        for search_tool in self.search_tools:
            search_results = await search_tool.search(
                query=task.search_terms,
                num_results=10,
                type=task.search_type,
            )
            results.extend(search_results)
        return self.deduplicate_results(results)

    async def evaluate_sources(self, results: list[SearchResult]) -> list[Source]:
        """Evaluate and validate sources"""
        validated = []
        for result in results:
            if not await self.can_access_url(result.url):
                continue
            credibility = await self.evaluate_credibility(result)
            if credibility.score >= self.config.min_credibility_score:
                validated.append(Source(
                    url=result.url,
                    title=result.title,
                    content=result.snippet,
                    credibility_score=credibility.score,
                    relevance=credibility.relevance,
                    published_date=credibility.published_date,
                ))
        return validated

    async def evaluate_credibility(self, result: SearchResult) -> CredibilityScore:
        """Evaluate source credibility"""
        prompt = f"""
        Evaluate credibility of this source:
        Title: {result.title}
        URL: {result.url}
        Snippet: {result.snippet}
        Return JSON with score (0-100), relevance (0-100), published_date.
        """
        response = await self.llm.complete(prompt)
        return CredibilityScore(**json.loads(response))

    async def extract_findings(self, sources: list[Source], task: ResearchTask) -> list[Finding]:
        """Extract key findings from sources"""
        prompt = f"""
        Extract key findings from sources for task: {task.description}
        Sources:
        {self.format_sources(sources)}
        Return JSON array with content, supporting_sources, confidence (high/medium/low).
        """
        response = await self.llm.complete(prompt)
        return [Finding(**f) for f in json.loads(response)]

    async def synthesize_findings(self) -> Synthesis:
        """Synthesize all findings"""
        prompt = f"""
        Synthesize findings for: {self.original_query}
        Findings:
        {self.format_findings(self.findings)}
        Create a synthesis that addresses the original question, reconciles conflicts, identifies gaps.
        """
        response = await self.llm.complete(prompt)
        return Synthesis(**json.loads(response))

    async def generate_report(self, synthesis: Synthesis) -> ResearchReport:
        """Generate final research report"""
        prompt = f"""
        Generate a comprehensive research report.
        Topic: {self.original_query}
        Synthesis:
        {json.dumps(synthesis.raw_data)}
        Sources: {self.format_sources(self.sources)}
        Format: Executive Summary, Introduction, Key Findings, Analysis, Conclusions, References.
        Use citations [1], [2], etc.
        """
        report_content = await self.llm.complete(prompt)
        return ResearchReport(
            topic=self.original_query,
            content=report_content,
            sources=self.sources,
            findings=self.findings,
            synthesis=synthesis,
        )
```
### Source Evaluation System

```python
# Source credibility evaluation sketch. SourceEvaluation and the
# check_freshness / calculate_final_score / get_recommendation helpers
# are assumed to be defined elsewhere.
from urllib.parse import urlparse


class SourceEvaluator:
    """Evaluate source credibility"""

    TRUSTED_DOMAINS = {
        "arxiv.org": 95,
        "pubmed.ncbi.nlm.nih.gov": 95,
        "nature.com": 95,
        "science.org": 95,
        "ieee.org": 90,
        "acm.org": 90,
        ".gov": 90,
        "reuters.com": 85,
        "bloomberg.com": 85,
        "wsj.com": 80,
        "ft.com": 85,
        "github.com": 80,
        "stackoverflow.com": 75,
    }
    SUSPICIOUS_DOMAINS = ["clickbait", "fake-news", "conspiracy"]

    async def evaluate(self, url: str, title: str, snippet: str, query: str) -> SourceEvaluation:
        """Comprehensive source evaluation"""
        domain_score = self.get_domain_credibility(url)
        freshness = await self.check_freshness(url)
        relevance = self.calculate_relevance(snippet, query)
        red_flags = self.check_red_flags(url, title, snippet)
        final_score = self.calculate_final_score(domain_score, freshness, relevance, red_flags)
        return SourceEvaluation(
            url=url,
            domain_score=domain_score,
            freshness_score=freshness,
            relevance_score=relevance,
            red_flags=red_flags,
            final_score=final_score,
            recommendation=self.get_recommendation(final_score, red_flags),
        )

    def get_domain_credibility(self, url: str) -> float:
        """Get credibility score from the hostname (suffix match is approximate)"""
        host = urlparse(url).netloc.lower()
        for domain, score in self.TRUSTED_DOMAINS.items():
            if host == domain or host.endswith(domain):
                return score
        for suspicious in self.SUSPICIOUS_DOMAINS:
            if suspicious in host:
                return 10
        return 50  # unknown domains get a neutral score

    def calculate_relevance(self, snippet: str, query: str) -> float:
        """Calculate content relevance via query-term overlap"""
        query_terms = set(query.lower().split())
        if not query_terms:
            return 50.0
        snippet_terms = set(snippet.lower().split())
        overlap = len(query_terms & snippet_terms)
        return min(100, (overlap / len(query_terms)) * 100 + 50)

    def check_red_flags(self, url: str, title: str, snippet: str) -> list[str]:
        """Check for red flags"""
        flags = []
        if title.isupper() and len(title) > 20:
            flags.append("clickbait_title")
        if any(url.endswith(tld) for tld in (".xyz", ".top", ".click")):
            flags.append("suspicious_tld")
        if len(url) > 200:
            flags.append("suspiciously_long_url")
        return flags
```
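The `calculate_final_score` helper referenced above is left undefined; one plausible version combines the three sub-scores with hypothetical weights (50% domain, 20% freshness, 30% relevance) and a flat per-flag penalty:

```python
def calculate_final_score(domain: float, freshness: float,
                          relevance: float, red_flags: list[str]) -> float:
    """Weighted blend of sub-scores minus a penalty per red flag.

    The weights and the 15-point penalty are illustrative choices,
    not values prescribed by any particular system.
    """
    score = 0.5 * domain + 0.2 * freshness + 0.3 * relevance
    score -= 15 * len(red_flags)
    return max(0.0, min(100.0, score))  # clamp to [0, 100]
```

Tuning these weights against a small hand-labeled set of good and bad sources is usually worth the effort before trusting the threshold.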
## Building Your Own Research Agent

### Basic Implementation

```python
#!/usr/bin/env python3
"""Simple Deep Research Agent"""
import asyncio
import json
from dataclasses import dataclass

from openai import AsyncOpenAI


@dataclass
class ResearchConfig:
    # JSON response_format requires a JSON-mode-capable model such as gpt-4o
    model: str = "gpt-4o"
    max_sources: int = 20
    search_iterations: int = 3


@dataclass
class Finding:
    content: str
    source: str
    confidence: str


class SimpleResearchAgent:
    """Lightweight research agent"""

    def __init__(self, config: ResearchConfig | None = None):
        self.config = config or ResearchConfig()
        self.client = AsyncOpenAI()
        self.findings: list[Finding] = []
        self.sources: list[dict] = []

    async def research(self, query: str) -> dict:
        """Execute research"""
        print(f"Researching: {query}")
        plan = await self.create_plan(query)
        for iteration in range(self.config.search_iterations):
            print(f"  Iteration {iteration + 1}/{self.config.search_iterations}")
            for task in plan:
                results = await self.search(task)
                for result in results:
                    if result not in self.sources:
                        self.sources.append(result)
                        finding = await self.extract_finding(result, query)
                        self.findings.append(finding)
        report = await self.synthesize(query)
        return {
            "query": query,
            "report": report,
            "sources": self.sources,
            "findings": self.findings,
        }

    async def create_plan(self, query: str) -> list[str]:
        """Create research tasks"""
        response = await self.client.chat.completions.create(
            model=self.config.model,
            messages=[
                {"role": "system",
                 "content": 'Create a research plan. Return a JSON object with a "tasks" array of search queries.'},
                {"role": "user", "content": query},
            ],
            response_format={"type": "json_object"},
        )
        tasks = json.loads(response.choices[0].message.content)
        return tasks.get("tasks", [query])

    async def search(self, query: str) -> list[dict]:
        """Execute search (stub; integrate Tavily, Serper, etc. here)"""
        return [{"url": "https://example.com", "title": "Result", "content": "Sample content"}]

    async def extract_finding(self, result: dict, query: str) -> Finding:
        """Extract finding from result"""
        response = await self.client.chat.completions.create(
            model=self.config.model,
            messages=[
                {"role": "system",
                 "content": "Extract the key finding. Return JSON with content, confidence."},
                {"role": "user", "content": f"Query: {query}\nResult: {result}"},
            ],
            response_format={"type": "json_object"},
        )
        data = json.loads(response.choices[0].message.content)
        return Finding(
            content=data.get("content", ""),
            source=result.get("url", ""),
            confidence=data.get("confidence", "medium"),
        )

    async def synthesize(self, query: str) -> str:
        """Synthesize findings"""
        findings_text = "\n\n".join(
            f"- {f.content} (Source: {f.source})"
            for f in self.findings[:10]
        )
        response = await self.client.chat.completions.create(
            model=self.config.model,
            messages=[
                {"role": "system", "content": "Write a research report with citations."},
                {"role": "user", "content": f"Topic: {query}\n\nFindings:\n{findings_text}"},
            ],
        )
        return response.choices[0].message.content


# Usage
async def main():
    agent = SimpleResearchAgent()
    result = await agent.research("What are the latest developments in quantum computing?")
    print(result["report"])


if __name__ == "__main__":
    asyncio.run(main())
```
### Research APIs
| API | Strengths | Pricing |
|---|---|---|
| Tavily | AI-optimized, semantic | Free + $5/mo pro |
| Serper | Google results, fast | 2,500/mo free |
| Brave Search | Privacy-focused | Free tier |
| Exa | AI-powered content | Free available |
### Tavily Integration

```python
# SearchResult is assumed to be defined as in the earlier sketches.
import asyncio

from tavily import TavilyClient


class TavilyResearchClient:
    def __init__(self, api_key: str):
        self.client = TavilyClient(api_key=api_key)

    async def search(self, query: str, num_results: int = 10) -> list[SearchResult]:
        """Search using Tavily (the sync client runs in a worker thread)"""
        response = await asyncio.to_thread(
            self.client.search,
            query=query,
            max_results=num_results,  # Tavily's parameter is max_results
            search_depth="advanced",
            include_answer=True,
            include_raw_content=True,
        )
        return [
            SearchResult(
                url=r["url"],
                title=r["title"],
                content=r.get("content", ""),
                snippet=(r.get("content") or "")[:300],
                score=r.get("score", 0),
            )
            for r in response["results"]
        ]
```
## Best Practices

### 1. Source Diversity

```python
from urllib.parse import urlparse


def ensure_diversity(sources: list[Source]) -> bool:
    """Check that sources span enough distinct domains"""
    domains = {urlparse(s.url).netloc for s in sources}
    return len(domains) >= min(5, len(sources) * 0.5)
```
### 2. Cross-Reference Findings

```python
# source_supports_claim / source_opposes_claim are assumed LLM-backed helpers.
async def verify_claim(claim: str, sources: list[Source]) -> VerificationResult:
    """Verify a claim against multiple sources"""
    supporting: list[Source] = []
    opposing: list[Source] = []
    for source in sources:
        if await source_supports_claim(source, claim):
            supporting.append(source)
        elif await source_opposes_claim(source, claim):
            opposing.append(source)
    return VerificationResult(
        claim=claim,
        supporting=supporting,
        opposing=opposing,
        # Consensus: supporting sources outnumber opposing by more than 2:1
        consensus=len(supporting) > len(opposing) * 2,
    )
```
## Conclusion
Deep research AI agents transform how we gather and synthesize information. Key points:
- Architecture: Planning, search, evaluation, synthesis, reporting
- Tools: Tavily, Serper, web browsers for content extraction
- Evaluation: Multi-factor source credibility assessment
- Synthesis: LLM-powered analysis and report generation
- Use Cases: Academic, market, technical, news research
Start with simple implementations and add sophistication as needed.