Introduction
The landscape of AI application development has been transformed by the introduction of agent frameworks. OpenAI’s Agents SDK represents a significant leap forward in building autonomous AI systems that can reason, use tools, and collaborate to accomplish complex tasks. This comprehensive guide covers everything you need to know to build production-ready AI agents using the OpenAI Agents SDK.
What is OpenAI Agents SDK?
Overview
The OpenAI Agents SDK is a lightweight but powerful framework designed for building AI-powered agent applications. It represents a significant evolution from OpenAI’s experimental Swarm project, providing production-ready primitives for creating sophisticated multi-agent systems.
The SDK enables developers to create agents that can:
- Use tools to interact with external systems
- Hand off tasks between specialized agents
- Maintain conversation context across interactions
- Execute complex workflows with built-in safety guardrails
Key Components
The SDK is built around three core primitives:
| Component | Description |
|---|---|
| Agents | LLMs configured with specific instructions, tools, and behaviors |
| Tools | Functions that agents can call to perform actions |
| Handoffs | Mechanisms for agents to transfer control to other agents |
Getting Started
Installation
pip install openai-agents
Basic Agent Creation
from agents import Agent, function_tool
@function_tool
def get_weather(location: str) -> str:
"""Get the weather for a specific location."""
# Implement weather API call
return f"The weather in {location} is sunny, 72ยฐF"
weather_agent = Agent(
name="Weather Agent",
instructions="You are a helpful weather assistant. Use the get_weather tool to answer questions about weather.",
tools=[get_weather]
)
Agent Architecture
Defining Agents
Agents are the core building blocks of the SDK. Each agent has:
from agents import Agent, ModelSettings
agent = Agent(
name="Research Assistant",
instructions="""You are a research assistant helping users find information.
Guidelines:
- Always cite your sources
- Provide balanced perspectives on controversial topics
- Admit when you don't know something""",
tools=[search_web, fetch_url, get_weather],
tool_use_frequency="auto", # or "always" or "never"
model="gpt-4o",
model_settings=ModelSettings(
temperature=0.7,
max_tokens=4000
)
)
Model Settings
Fine-tune agent behavior with model settings:
from agents import ModelSettings
settings = ModelSettings(
temperature=0.7, # Controls randomness (0.0 - 2.0)
max_tokens=4096, # Maximum response length
top_p=0.9, # Nucleus sampling
parallel_tool_calls=True # Allow parallel tool execution
)
Tool Integration
Creating Tools
Tools extend agent capabilities by enabling interaction with external systems:
from agents import function_tool
import requests
@function_tool
def search_web(query: str, max_results: int = 5) -> list:
"""Search the web for information.
Args:
query: The search query
max_results: Maximum number of results to return
Returns:
List of search results with title, url, and snippet
"""
# Implement search logic
response = requests.get(
"https://api.search.example.com/search",
params={"q": query, "limit": max_results}
)
return response.json()["results"]
@function_tool
def send_email(to: str, subject: str, body: str) -> dict:
"""Send an email message.
Args:
to: Recipient email address
subject: Email subject line
body: Email body content
"""
# Implement email sending
return {"status": "sent", "message_id": "msg_123"}
Tool Parameters with Pydantic
Define complex tool parameters using Pydantic models:
from pydantic import BaseModel
from typing import List
from agents import function_tool
class CalendarEvent(BaseModel):
title: str
description: str = ""
start_time: str # ISO 8601 format
end_time: str
attendees: List[str] = []
@function_tool
def create_calendar_event(event: CalendarEvent) -> dict:
"""Create a calendar event."""
# Implement calendar API call
return {
"status": "created",
"event_id": "evt_456",
"event": event.dict()
}
Async Tools
For I/O-bound operations, use async tools:
import asyncio
from agents import afunction_tool
@afunction_tool
async def fetch_multiple_urls(urls: List[str]) -> List[dict]:
"""Fetch content from multiple URLs concurrently."""
async with asyncio.ClientSession() as session:
tasks = [
session.get(url)
for url in urls
]
responses = await asyncio.gather(*tasks)
return [
{"url": url, "status": r.status}
for url, r in zip(urls, responses)
]
Multi-Agent Systems
Handoffs
The handoff mechanism allows agents to delegate tasks to specialized agents:
from agents import Agent, handoff
# Create specialized agents
triage_agent = Agent(
name="Triage Agent",
instructions="Route customer requests to the appropriate specialist.",
handoffs=[
handoff(
agent=technical_support_agent,
condition=lambda context: "technical" in context.user_input.lower()
),
handoff(
agent=billing_agent,
condition=lambda context: "bill" in context.user_input.lower() or "payment" in context.user_input.lower()
),
handoff(
agent=general_support_agent,
condition=lambda context: True # Default fallback
)
]
)
# Alternative: direct handoff
sales_agent = Agent(
name="Sales Agent",
instructions="Handle sales inquiries and product questions."
)
support_agent = Agent(
name="Support Agent",
instructions="Handle technical support and troubleshooting."
)
# Agent can explicitly hand off
agent = Agent(
name="Main Agent",
instructions="""You are the main customer service agent.
For sales inquiries, handoff to the sales agent.
For technical issues, handoff to the support agent.""",
handoffs=[sales_agent, support_agent]
)
Agent Pools
Create pools of agents for parallel processing:
from agents import Agent
import asyncio
# Create multiple worker agents
worker_agents = [
Agent(
name=f"Worker {i}",
instructions="Process tasks efficiently and accurately."
)
for i in range(5)
]
async def process_batch(tasks: list):
"""Process multiple tasks in parallel."""
async with asyncio.TaskGroup() as tg:
results = [
tg.create_task(agent.run(task))
for task, agent in zip(tasks, worker_agents)
]
return [r.result() for r in results]
Guardrails
Input Guardrails
Validate and filter user input before processing:
from agents import Agent, input_guardrail
import re
@input_guardrail
def validate_email_input(context):
"""Ensure user input doesn't contain email addresses."""
user_input = context.user_input
email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
if re.search(email_pattern, user_input):
return {
"valid": False,
"reason": "Please don't include email addresses in your query."
}
return {"valid": True}
agent = Agent(
name="Customer Service Agent",
instructions="You are a helpful customer service agent.",
input_guardrails=[validate_email_input]
)
Output Guardrails
Validate agent responses before returning to users:
from agents import output_guardrail
@output_guardrail
def sanitize_output(context):
"""Ensure output doesn't contain sensitive information."""
output = context.agent_response
# Check for potential sensitive data
sensitive_patterns = [
r'\b\d{3}-\d{2}-\d{4}\b', # SSN
r'\b\d{16}\b', # Credit card
]
for pattern in sensitive_patterns:
if re.search(pattern, output):
return {
"valid": False,
"reason": "Response contained sensitive information and was filtered."
}
return {"valid": True}
agent = Agent(
name="Data Processing Agent",
instructions="Process user data according to privacy guidelines.",
output_guardrails=[sanitize_output]
)
Context Guardrails
Implement rate limiting and abuse prevention:
from datetime import datetime, timedelta
from collections import defaultdict
class RateLimiter:
def __init__(self, max_requests_per_minute: int = 10):
self.max_requests = max_requests_per_minute
self.requests = defaultdict(list)
def check_rate_limit(self, user_id: str) -> bool:
now = datetime.utcnow()
cutoff = now - timedelta(minutes=1)
# Clean old requests
self.requests[user_id] = [
req_time for req_time in self.requests[user_id]
if req_time > cutoff
]
if len(self.requests[user_id]) >= self.max_requests:
return False
self.requests[user_id].append(now)
return True
rate_limiter = RateLimiter(max_requests_per_minute=10)
@input_guardrail
def rate_limit_check(context):
"""Apply rate limiting per user."""
user_id = context.user_id
if not rate_limiter.check_rate_limit(user_id):
return {
"valid": False,
"reason": "Rate limit exceeded. Please try again later."
}
return {"valid": True}
Production Patterns
Streaming Responses
async def stream_agent_response(agent: Agent, user_input: str):
"""Stream agent responses for better UX."""
from agents import Runner
result = Runner.run_streaming(
agent=agent,
input=user_input
)
async for event in result.stream_events():
if event.type == "agent_message":
print(event.message.content, end="", flush=True)
elif event.type == "tool_call":
print(f"\n[Using tool: {event.tool_call.name}]")
Error Handling
from agents import Agent, RunResult
from enum import Enum
class AgentError(Exception):
def __init__(self, message: str, recoverable: bool = False):
self.message = message
self.recoverable = recoverable
super().__init__(message)
async def safe_agent_run(agent: Agent, user_input: str, max_retries: int = 3):
"""Run agent with error handling and retries."""
from agents import Runner
last_error = None
for attempt in range(max_retries):
try:
result = await Runner.run(agent, user_input)
return result
except Exception as e:
last_error = e
if not is_recoverable_error(e):
raise AgentError(str(e), recoverable=False)
# Exponential backoff
await asyncio.sleep(2 ** attempt)
raise AgentError(
f"Agent failed after {max_retries} attempts: {last_error}",
recoverable=True
)
def is_recoverable_error(error: Exception) -> bool:
"""Determine if an error is recoverable."""
recoverable_messages = [
"rate limit",
"timeout",
"temporary failure"
]
error_str = str(error).lower()
return any(msg in error_str for msg in recoverable_messages)
Memory and Context
from agents import Agent
from typing import List, Dict
class ConversationMemory:
def __init__(self, max_history: int = 10):
self.max_history = max_history
self.history: List[Dict] = []
def add_message(self, role: str, content: str):
"""Add a message to conversation history."""
self.history.append({"role": role, "content": content})
# Trim if needed
if len(self.history) > self.max_history:
self.history = self.history[-self.max_history:]
def get_context(self) -> str:
"""Get formatted context for agent."""
return "\n".join(
f"{msg['role']}: {msg['content']}"
for msg in self.history
)
# Usage
memory = ConversationMemory(max_history=10)
agent = Agent(
name="Conversational Agent",
instructions="""You are a helpful assistant with memory of our conversation.
Previous conversation:
{context}"""
)
# Build context before each run
context = memory.get_context()
result = await Runner.run(agent, user_input, context=context)
# Store interaction
memory.add_message("user", user_input)
memory.add_message("assistant", result.final_output)
Monitoring and Observability
Tracing Agent Executions
from agents import Agent
from opentelemetry import trace
tracer = trace.get_tracer(__name__)
class AgentTracer:
def __init__(self, service_name: str):
self.service_name = service_name
async def trace_agent_run(self, agent: Agent, user_input: str):
with tracer.start_as_current_span(
f"agent.{agent.name}",
attributes={
"service.name": self.service_name,
"agent.name": agent.name,
"user.input.length": len(user_input)
}
) as span:
from agents import Runner
result = await Runner.run(agent, user_input)
span.set_attribute(
"agent.output.length",
len(result.final_output)
)
span.set_attribute(
"agent.tool_calls",
len(result.tool_calls) if result.tool_calls else 0
)
return result
Logging
import structlog
from agents import Agent
logger = structlog.get_logger()
@function_tool
def logged_api_call(url: str) -> dict:
"""API call with logging."""
logger.info("api_call_start", url=url, tool="logged_api_call")
try:
result = make_api_call(url)
logger.info(
"api_call_success",
url=url,
status_code=result.status_code
)
return result.json()
except Exception as e:
logger.error(
"api_call_failed",
url=url,
error=str(e)
)
raise
Best Practices
1. Keep Instructions Focused
# Bad: Vague instructions
agent = Agent(
instructions="Be helpful."
)
# Good: Specific instructions
agent = Agent(
instructions="""You are a technical support agent for a SaaS product.
Your responsibilities:
1. Understand user technical issues
2. Provide troubleshooting steps
3. Escalate to human support when needed
Never:
- Provide legal advice
- Access user accounts without permission
- Share internal system information"""
)
2. Use Descriptive Tool Names
# Bad: Generic names
def do_something(x):
pass
# Good: Descriptive names
def create_calendar_event(event: CalendarEvent) -> dict:
"""Create a new event in the user's calendar."""
pass
def searchKnowledgeBase(query: str, category: str = None) -> list:
"""Search the knowledge base for relevant articles.
Args:
query: Search query string
category: Optional category filter
"""
pass
3. Implement Proper Error Handling
@function_tool
def robust_api_call(url: str, timeout: int = 30) -> dict:
"""Make API call with proper error handling."""
import httpx
try:
with httpx.Timeout(timeout):
response = httpx.get(url)
response.raise_for_status()
return response.json()
except httpx.TimeoutException:
return {"error": "Request timed out", "retryable": True}
except httpx.HTTPStatusError as e:
return {"error": f"HTTP error: {e.response.status_code}", "retryable": False}
except Exception as e:
return {"error": f"Unexpected error: {str(e)}", "retryable": False}
4. Test Agent Behavior
import pytest
from agents import Agent, Runner
def test_agent_routes_technical_queries():
"""Test that technical queries route to correct agent."""
technical_agent = Agent(name="Technical", instructions="Handle technical issues.")
billing_agent = Agent(name="Billing", instructions="Handle billing questions.")
triage = Agent(
name="Triage",
instructions="Route queries appropriately.",
handoffs=[technical_agent, billing_agent]
)
result = Runner.run_sync(
triage,
"My application is throwing a 500 error"
)
# Verify handoff occurred
assert result.final_output # Check routing logic worked
def test_guardrail_blocks_sensitive_data():
"""Test that guardrails filter sensitive data."""
agent = Agent(
name="Test Agent",
instructions="Process user requests.",
output_guardrails=[sanitize_output]
)
result = Runner.run_sync(
agent,
"Tell me about your services"
)
# Verify sensitive data was filtered
assert "123-45-6789" not in result.final_output
Common Pitfalls
1. Overcomplicating Agent Instructions
# Bad: Too many rules
agent = Agent(
instructions="""First, greet the user. Then, ask what they need help with.
If it's about X, do Y. If it's about A, do B.
Remember to be polite. Don't forget to say goodbye.
Also, check the time. If it's morning, say good morning.
If it's afternoon... [continues for 200 more lines]"""
)
# Good: Focused, modular instructions
agent = Agent(
instructions="""You are a customer service agent.
Core responsibilities:
- Answer customer questions
- Troubleshoot issues
- Escalate when needed
Guidelines: [brief list of key rules]"""
)
2. Not Handling Tool Failures
# Bad: No error handling
@function_tool
def get_data(url: str):
return requests.get(url).json()
# Good: Graceful error handling
@function_tool
def get_data(url: str):
try:
return {"data": requests.get(url).json()}
except requests.RequestException as e:
return {"error": str(e), "fallback_available": True}
3. Ignoring Context Limits
Be mindful of context window limits when building conversations:
- Include only relevant history
- Summarize older interactions
- Use structured data instead of verbose formats
External Resources
- OpenAI Agents SDK Documentation
- OpenAI Agents SDK GitHub
- OpenAI API Reference
- Function Calling Best Practices
- Agent Design Patterns
Conclusion
The OpenAI Agents SDK provides a powerful framework for building sophisticated AI agents. By understanding its core componentsโagents, tools, and handoffsโyou can create multi-agent systems capable of handling complex workflows.
Key takeaways:
- Start with simple, focused agents before building complex systems
- Use descriptive tool names and comprehensive parameter definitions
- Implement guardrails for security and reliability
- Add comprehensive logging and monitoring
- Test agent behavior thoroughly
As AI agents become more prevalent, mastering frameworks like the OpenAI Agents SDK will be essential for building production-ready AI applications that can reliably handle real-world tasks.
Comments