Introduction
Agentic AI represents a paradigm shift from passive AI systems to autonomous agents that can perceive, reason, plan, and act independently. Unlike traditional AI models that respond to single prompts, agentic AI systems can break down complex tasks, execute multi-step workflows, and adapt to changing conditions. This guide covers the architecture, implementation patterns, and best practices for building agentic AI systems in 2026.
Understanding Agentic AI
What Makes AI “Agentic”?
An AI agent possesses several key capabilities:
- Autonomy: Can operate independently without constant human guidance
- Planning: Can decompose complex tasks into manageable steps
- Tool Use: Can invoke external tools, APIs, and services
- Memory: Can maintain context across interactions
- Reflection: Can evaluate its own outputs and adjust behavior
- Multi-step Execution: Can pursue long-horizon goals
Agent vs. Model
| Aspect | Traditional Model | AI Agent |
|---|---|---|
| Input | Single prompt | Task with context |
| Output | One-shot response | Multi-step execution |
| Tools | None | Can use tools |
| Memory | Stateless | Maintains state |
| Adaptation | None | Learns from feedback |
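The contrast in the table can be made concrete with a minimal sketch. The `llm` callable below is a stand-in for any chat-completion client, and the `"DONE"` stop marker is an illustrative convention, not a standard:

```python
from typing import Callable, List

def one_shot(llm: Callable[[str], str], prompt: str) -> str:
    """Traditional model usage: one prompt in, one response out, no state."""
    return llm(prompt)

def agent_loop(llm: Callable[[str], str], task: str, max_steps: int = 5) -> List[str]:
    """Agent usage: accumulate state and iterate until done or out of steps."""
    transcript = [task]
    for _ in range(max_steps):
        step = llm("\n".join(transcript))  # the model sees accumulated context
        transcript.append(step)
        if "DONE" in step:  # stop condition signaled by the model itself
            break
    return transcript
```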
Agent Architecture
Core Components
```python
from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional, Callable
from enum import Enum
import json


class AgentState(Enum):
    IDLE = "idle"
    THINKING = "thinking"
    ACTING = "acting"
    OBSERVING = "observing"
    REFLECTING = "reflecting"


@dataclass
class Message:
    role: str
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Tool:
    name: str
    description: str
    parameters: Dict[str, Any]
    function: Callable

    def execute(self, **kwargs) -> Any:
        return self.function(**kwargs)


@dataclass
class AgentConfig:
    model: str = "gpt-4"
    temperature: float = 0.7
    max_tokens: int = 4000
    max_steps: int = 20
    tools: List[Tool] = field(default_factory=list)
    system_prompt: str = ""
    verbose: bool = False


class ReActAgent:
    """Reasoning + Acting agent implementation."""

    def __init__(self, config: AgentConfig):
        self.config = config
        self.messages: List[Message] = []
        self.tools = {tool.name: tool for tool in config.tools}
        self.steps_taken = 0
        self.state = AgentState.IDLE

    def add_system_message(self, content: str):
        self.messages.append(Message("system", content))

    def add_user_message(self, content: str):
        self.messages.append(Message("user", content))

    def think(self) -> str:
        """Reason about the current situation."""
        self.state = AgentState.THINKING
        prompt = self._build_prompt()
        response = self._call_model(prompt)
        if self.config.verbose:
            print(f"[THINK] {response}")
        return response

    def act(self, action: str, params: Dict[str, Any]) -> Any:
        """Execute an action or tool."""
        self.state = AgentState.ACTING
        if action == "respond":
            result = params["content"]
        elif action in self.tools:
            tool = self.tools[action]
            result = tool.execute(**params)
            self.messages.append(Message("tool", str(result), {"tool": action}))
        else:
            result = f"Unknown action: {action}"
        if self.config.verbose:
            print(f"[ACT] {action}: {result}")
        return result

    def observe(self, observation: str):
        """Process environment feedback."""
        self.state = AgentState.OBSERVING
        self.messages.append(Message("observation", observation))
        if self.config.verbose:
            print(f"[OBSERVE] {observation}")

    def reflect(self) -> bool:
        """Evaluate whether the goal has been achieved."""
        self.state = AgentState.REFLECTING
        prompt = """Based on the conversation so far, has the user's
        original request been fulfilled? Answer yes or no."""
        response = self._call_model(prompt)
        return "yes" in response.lower()

    def run(self, task: str) -> str:
        """Execute a task using the ReAct loop."""
        self.add_user_message(task)
        self.steps_taken = 0
        while self.steps_taken < self.config.max_steps:
            # Think
            thought = self.think()
            # Extract action
            action, params = self._parse_action(thought)
            # Act
            result = self.act(action, params)
            # Observe
            self.observe(f"Action result: {result}")
            # Reflect
            if self.reflect():
                return self._get_final_response()
            self.steps_taken += 1
        return "Max steps reached without completion"

    def _call_model(self, prompt: str) -> str:
        # Placeholder - integrate with your actual LLM API
        raise NotImplementedError

    def _build_prompt(self) -> str:
        # Build a prompt from tool descriptions and conversation context
        raise NotImplementedError

    def _parse_action(self, thought: str) -> tuple:
        # Parse the model response to extract (action, params)
        raise NotImplementedError

    def _get_final_response(self) -> str:
        # Return the last result as the final answer to the user
        raise NotImplementedError
```
Planning Agents
```python
class PlanningAgent:
    """Agent with explicit planning capabilities."""

    def __init__(self, llm, tools: List[Tool], config: AgentConfig):
        self.llm = llm
        self.config = config
        self.tools = {t.name: t for t in tools}
        self.plan = []
        self.history = []

    def create_plan(self, task: str) -> List[Dict[str, Any]]:
        """Break a task into executable steps."""
        prompt = f"""Break down this task into specific, executable steps.
For each step, specify:
- action: what to do
- parameters: what inputs are needed
- expected_output: what result to expect

Task: {task}

Format as a JSON array of steps."""
        response = self.llm.complete(prompt)
        steps = json.loads(response)
        self.plan = steps
        return steps

    def execute_plan(self) -> List[Any]:
        """Execute the plan with error handling."""
        results = []
        for i, step in enumerate(self.plan):
            try:
                action = step["action"]
                params = step.get("parameters", {})
                if action in self.tools:
                    result = self.tools[action].execute(**params)
                else:
                    result = self._execute_llm_action(action, params)
                results.append({"step": i, "success": True, "result": result})
                self.history.append((step, result))
            except Exception as e:
                results.append({"step": i, "success": False, "error": str(e)})
                # Try to recover by replacing the remaining steps
                recovery_plan = self._create_recovery_plan(step, e)
                if recovery_plan:
                    self.plan = recovery_plan + self.plan[i + 1:]
        return results

    def revise_plan(self, feedback: str):
        """Revise the plan based on feedback."""
        prompt = f"""Given the original plan and feedback, create a revised plan.

Original plan: {json.dumps(self.plan)}
Feedback: {feedback}

Return the revised plan as a JSON array."""
        response = self.llm.complete(prompt)
        self.plan = json.loads(response)

    def _execute_llm_action(self, action: str, params: Dict[str, Any]) -> str:
        # Fall back to the LLM for steps that have no matching tool
        raise NotImplementedError

    def _create_recovery_plan(self, step: Dict[str, Any], error: Exception):
        # Ask the LLM for replacement steps after a failure; return None to give up
        raise NotImplementedError
```
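Calling `json.loads` directly on model output is fragile: models frequently wrap JSON in markdown fences or surround it with prose. A small extraction helper (illustrative, not part of any library) makes `create_plan` and `revise_plan` more robust:

```python
import json
import re
from typing import Any, List

def parse_plan(response: str) -> List[Any]:
    """Parse a JSON array of steps out of an LLM response.

    Model output is often wrapped in code fences or surrounded by prose,
    so strip fences and locate the outermost [...] before json.loads.
    """
    # Remove markdown code-fence markers if present
    cleaned = re.sub(r"```(?:json)?", "", response)
    match = re.search(r"\[.*\]", cleaned, re.DOTALL)
    if not match:
        raise ValueError("No JSON array found in model response")
    return json.loads(match.group(0))
```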
Multi-Agent Systems
```python
class AgentTeam:
    """Coordinator for multiple specialized agents."""

    def __init__(self):
        self.agents: Dict[str, Any] = {}
        self.coordinator = None

    def register_agent(self, name: str, agent: Any, role: str):
        self.agents[name] = {"agent": agent, "role": role}

    def set_coordinator(self, coordinator_agent: Any):
        self.coordinator = coordinator_agent

    def execute_task(self, task: str) -> Dict[str, Any]:
        # Coordinator analyzes the task
        task_analysis = self.coordinator.analyze(task)
        # Assign subtasks to the appropriate agents
        subtasks = task_analysis["subtasks"]
        assignments = task_analysis["assignments"]
        results = {}
        for agent_name, subtask in zip(assignments, subtasks):
            agent_info = self.agents[agent_name]
            results[agent_name] = agent_info["agent"].execute(subtask)
        # Synthesize the results
        return self.coordinator.synthesize(results)


class ResearchAgent:
    """Specialized agent for research tasks."""

    def __init__(self, llm, search_tool, browser_tool):
        self.llm = llm
        self.search = search_tool
        self.browser = browser_tool

    def research(self, topic: str) -> Dict[str, Any]:
        # Search for information
        search_results = self.search.query(topic)
        # Browse the top results
        articles = []
        for result in search_results[:5]:
            content = self.browser.fetch(result["url"])
            articles.append({
                "title": result["title"],
                "content": content,
                "source": result["url"],
            })
        # Synthesize the findings
        synthesis = self.llm.complete(f"""Synthesize research from these articles:
{articles}

Provide a comprehensive summary.""")
        return {"articles": articles, "synthesis": synthesis}
```
Tool Integration
Tool Definition Schema
```python
def create_tool(name: str, description: str, parameters: Dict):
    """Decorator factory that wraps a function in a Tool definition."""
    def decorator(func: Callable):
        return Tool(
            name=name,
            description=description,
            parameters=parameters,
            function=func,
        )
    return decorator


@create_tool(
    name="search",
    description="Search the web for information",
    parameters={
        "query": {"type": "string", "description": "Search query"},
        "num_results": {"type": "integer", "description": "Number of results"},
    },
)
def search_web(query: str, num_results: int = 5):
    # Implementation
    pass


@create_tool(
    name="execute_code",
    description="Execute Python code in a sandbox",
    parameters={
        "code": {"type": "string", "description": "Code to execute"},
        "language": {"type": "string", "description": "Programming language"},
    },
)
def execute_code(code: str, language: str = "python"):
    # Implementation with sandboxing
    pass


@create_tool(
    name="read_file",
    description="Read contents of a file",
    parameters={
        "path": {"type": "string", "description": "File path"},
        "lines": {"type": "integer", "description": "Number of lines"},
    },
)
def read_file(path: str, lines: int = 100):
    # Implementation
    pass


@create_tool(
    name="write_file",
    description="Write content to a file",
    parameters={
        "path": {"type": "string", "description": "File path"},
        "content": {"type": "string", "description": "Content to write"},
        "mode": {"type": "string", "description": "Write mode: overwrite/append"},
    },
)
def write_file(path: str, content: str, mode: str = "overwrite"):
    # Implementation
    pass
```
Tool Selection
```python
class ToolSelector:
    """Intelligent tool selection based on the task."""

    def __init__(self, llm, tools: Dict[str, Tool]):
        self.llm = llm
        self.tools = tools

    def select_tools(self, task: str, max_tools: int = 3) -> List[Tool]:
        """Select the most relevant tools for a task."""
        tool_descriptions = "\n".join(
            f"- {name}: {tool.description}"
            for name, tool in self.tools.items()
        )
        prompt = f"""Select the most relevant tools for this task.

Task: {task}

Available tools:
{tool_descriptions}

Return the top {max_tools} tool names as a comma-separated list."""
        response = self.llm.complete(prompt)
        selected_names = [name.strip() for name in response.split(",")]
        return [self.tools[name] for name in selected_names if name in self.tools]
```
Memory Systems
Short-term and Long-term Memory
```python
from collections import deque


class MemorySystem:
    """Multi-tier memory system for agents."""

    def __init__(self, llm, max_short_term: int = 10, vector_store=None):
        self.llm = llm  # needed by summarize()
        self.short_term = deque(maxlen=max_short_term)
        self.long_term = []  # could be backed by a vector database
        self.vector_store = vector_store

    def add(self, content: str, memory_type: str = "short"):
        """Add to memory."""
        if memory_type == "short":
            self.short_term.append(content)
        else:
            self.long_term.append(content)
            if self.vector_store:
                self.vector_store.add(content)

    def get_relevant(self, query: str, k: int = 5) -> List[str]:
        """Retrieve relevant memories."""
        # Check short-term first
        recent = list(self.short_term)[-k:]
        # Query long-term if a vector store is available
        if self.vector_store:
            long_term = self.vector_store.search(query, k)
            return recent + long_term[:k - len(recent)]
        return recent

    def summarize(self) -> str:
        """Create a summary of important memories."""
        all_memories = list(self.short_term) + self.long_term
        if not all_memories:
            return "No memories."
        prompt = f"""Summarize these key memories:
{all_memories}

Create a concise summary."""
        return self.llm.complete(prompt)


class ConversationBuffer:
    """Buffer for maintaining conversation context."""

    def __init__(self, max_tokens: int = 8000):
        self.max_tokens = max_tokens
        self.buffer = []

    def add(self, role: str, content: str):
        self.buffer.append({"role": role, "content": content})
        self._trim()

    def _trim(self):
        """Trim the buffer to stay within the token limit."""
        # Simplified - in practice, count tokens with a real tokenizer
        while len(self.buffer) > 20:
            self.buffer.pop(0)

    def get_messages(self) -> List[Dict]:
        return self.buffer.copy()
```
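The `_trim` method above caps the buffer by message count rather than tokens. A closer sketch of token-based trimming is below; the whitespace split is a deliberately crude stand-in for a real tokenizer such as `tiktoken`:

```python
from typing import Dict, List

def estimate_tokens(text: str) -> int:
    """Rough estimate: ~1 token per word. This is a simplification;
    swap in a real tokenizer for production use."""
    return len(text.split())

def trim_to_budget(buffer: List[Dict[str, str]], max_tokens: int) -> List[Dict[str, str]]:
    """Drop the oldest messages until the estimated total fits the budget,
    always keeping at least the most recent message."""
    trimmed = list(buffer)
    while len(trimmed) > 1 and sum(
        estimate_tokens(m["content"]) for m in trimmed
    ) > max_tokens:
        trimmed.pop(0)
    return trimmed
```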
Evaluation and Safety
Agent Evaluation Metrics
```python
import time


class AgentEvaluator:
    """Evaluate agent performance."""

    def __init__(self):
        self.metrics = {}

    def evaluate(self, agent, test_cases: List[Dict]) -> Dict:
        """Run evaluation on test cases."""
        results = {
            "success_rate": 0,
            "avg_steps": 0,
            "avg_time": 0,
            "error_types": {},
        }
        successes = 0
        total_steps = 0
        total_time = 0
        for test in test_cases:
            start = time.time()
            try:
                output = agent.run(test["task"])
                elapsed = time.time() - start
                if self._check_success(output, test["expected"]):
                    successes += 1
                    total_steps += agent.steps_taken
            except Exception as e:
                elapsed = time.time() - start
                error_type = type(e).__name__
                results["error_types"][error_type] = (
                    results["error_types"].get(error_type, 0) + 1
                )
            total_time += elapsed
        n = len(test_cases)
        results["success_rate"] = successes / n
        results["avg_steps"] = total_steps / max(1, successes)
        results["avg_time"] = total_time / n
        return results

    def _check_success(self, output, expected) -> bool:
        # Implementation depends on the task type
        return expected in str(output)
```
Safety Guardrails
```python
class SafetyGuardrails:
    """Safety checks for agent actions."""

    def __init__(self):
        self.denied_patterns = [
            "harmful", "dangerous", "illegal", "malicious",
        ]
        # Domains an agent's URL-fetching tools may access
        self.allowed_domains = ["github.com", "stackoverflow.com"]

    def check_action(self, action: str, params: Dict) -> bool:
        """Validate an action before execution."""
        # Check for harmful content
        action_str = f"{action} {json.dumps(params)}"
        for pattern in self.denied_patterns:
            if pattern in action_str.lower():
                return False
        # Check file operations
        if action == "write_file":
            if not self._is_safe_path(params.get("path", "")):
                return False
        # Check code execution
        if action == "execute_code":
            if not self._is_safe_code(params.get("code", "")):
                return False
        return True

    def _is_safe_path(self, path: str) -> bool:
        # Prevent path traversal and writes to system directories
        return ".." not in path and not path.startswith("/etc")

    def _is_safe_code(self, code: str) -> bool:
        # A simple denylist; real sandboxing should isolate the process instead
        dangerous = ["import os", "import sys", "subprocess", "eval("]
        return not any(d in code for d in dangerous)
```
Best Practices
Agent Design Patterns
- Start Simple: Begin with single-step agents, add complexity as needed
- Tool Design: Create focused, composable tools
- Error Handling: Plan for failures at every step
- Observability: Log all decisions and actions
- Human in Loop: Allow human approval for critical actions
- Iterative Refinement: Use feedback to improve agent performance
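The "Human in Loop" practice can be sketched as an approval gate wrapped around tool execution. The callback-based design and the `CRITICAL_ACTIONS` set below are illustrative choices, not a standard API:

```python
from typing import Any, Callable, Dict

# Actions that should never run without explicit human sign-off.
# This set is illustrative; tune it to your own risk model.
CRITICAL_ACTIONS = {"write_file", "execute_code", "send_email"}

def gated_execute(
    action: str,
    params: Dict[str, Any],
    execute: Callable[[str, Dict[str, Any]], Any],
    approve: Callable[[str, Dict[str, Any]], bool],
) -> Any:
    """Run `execute` only if the action is low-risk or a human approves.

    `approve` might be input()-based in a CLI, or a ticket/notification
    flow in production; it is injected here so the gate can be tested.
    """
    if action in CRITICAL_ACTIONS and not approve(action, params):
        return {"status": "rejected", "action": action}
    return execute(action, params)
```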
Common Pitfalls
| Pitfall | Solution |
|---|---|
| Infinite loops | Set max steps, track visited states |
| Tool misuse | Validate inputs, add constraints |
| Hallucinations | Verify with external sources |
| Memory overflow | Implement summarization |
| Tool conflicts | Add tool conflict resolution |
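The "track visited states" mitigation for infinite loops can be sketched by hashing each (action, params) pair and flagging repeats. The repeat threshold and hashing scheme are illustrative:

```python
import hashlib
import json
from typing import Any, Dict

class LoopDetector:
    """Flag when an agent repeats the same action too many times."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.counts: Dict[str, int] = {}

    def record(self, action: str, params: Dict[str, Any]) -> bool:
        """Record a step; return True if the agent appears stuck."""
        # sort_keys makes the hash stable across dict key orderings
        key = hashlib.sha256(
            json.dumps({"action": action, "params": params}, sort_keys=True).encode()
        ).hexdigest()
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] >= self.max_repeats
```

An agent's run loop would call `record` after each action and abort (or replan) when it returns True.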
Conclusion
Agentic AI represents the next frontier in AI development. By combining large language models with planning, tool use, and memory systems, we can create AI agents capable of tackling complex, multi-step tasks. The key to success lies in careful architecture design, robust safety measures, and thoughtful evaluation.
As you build agentic systems, remember to:
- Start with clear use cases
- Implement proper safety guardrails
- Monitor and evaluate continuously
- Design for failure modes
Resources
- LangChain Agents Documentation
- AutoGPT Architecture
- ReAct Prompting Paper
- AgentBench: Evaluating LLMs as Agents