Multi-Agent Systems: Building Collaborative AI Networks

Introduction

Single agents are powerful. Multi-agent systems are transformative. When multiple AI agents work together, they can tackle problems no single agent could solve: they divide complex tasks, specialize in different domains, and collaborate like a team.

This guide covers the main building blocks of multi-agent systems: architectures, communication patterns, orchestration strategies, and real-world implementations.

Why Multi-Agent Systems?

┌─────────────────────────────────────────────────────────────────────┐
│              SINGLE vs MULTI-AGENT                                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   SINGLE AGENT                    MULTI-AGENT                          │
│   ─────────────                   ──────────────                      │
│                                                                      │
│   ┌─────────┐                     ┌─────────┐                        │
│   │  Agent  │                     │   Coord │                        │
│   └────┬────┘                     └────┬────┘                        │
│        │                               │                              │
│        ▼                               ▼                              │
│   ┌─────────┐               ┌──────┬──────┬──────┐                  │
│   │ Complex │               │  Ag1 │  Ag2 │  Ag3 │                  │
│   │ Task    │               └──┬───┘──┬───┘──┬───┘                  │
│   │         │                  │      │      │                       │
│   │ Fails!  │                  ▼      ▼      ▼                       │
│   └─────────┘               ┌──────────────┐                        │
│                             │   Combined   │                        │
│                             │   Result     │                        │
│                             └──────────────┘                        │
│                                                                      │
│   Limited by:                 Benefits:                             │
│   • Context window            • Specialization                      │
│   • Single expertise          • Parallel execution                   │
│   • Single viewpoint          • Redundancy                          │
│   • No collaboration          • Scalability                         │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Architecture Patterns

1. Hierarchical Architecture

┌─────────────────────────────────────────────────────────────────────┐
│              HIERARCHICAL AGENT SYSTEM                                    │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│                         ┌─────────┐                                  │
│                         │  CEO    │  (Strategic)                    │
│                         │  Agent  │                                  │
│                         └────┬────┘                                  │
│                              │                                        │
│              ┌───────────────┼───────────────┐                       │
│              │               │               │                       │
│              ▼               ▼               ▼                       │
│         ┌────────┐     ┌────────┐     ┌────────┐                    │
│         │  Eng   │     │  Prod  │     │  Ops   │  (Tactical)       │
│         │  Lead  │     │  Lead  │     │  Lead  │                   │
│         └───┬────┘     └────┬────┘     └────┬────┘                   │
│             │               │               │                        │
│    ┌────────┼────────┐     │        ┌──────┼──────┐                │
│    │        │        │     │        │      │      │                 │
│    ▼        ▼        ▼     ▼        ▼      ▼      ▼                 │
│ ┌────┐  ┌────┐  ┌────┐ ┌────┐  ┌────┐  ┌────┐  ┌────┐           │
│ │Ag1 │  │Ag2 │  │Ag3 │ │Ag4 │  │Ag5 │  │Ag6 │  │Ag7 │  (Operational)
│ └────┘  └────┘  └────┘ └────┘  └────┘  └────┘  └────┘           │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

# Hierarchical agent implementation
# (Agent and Response are assumed to come from your agent framework)
import asyncio

class HierarchicalMultiAgent:
    def __init__(self):
        # Define hierarchy
        self.ceo = Agent(name="CEO", role="strategic")
        self.department_leads = {
            "engineering": Agent(name="EngLead", role="tactical"),
            "product": Agent(name="ProdLead", role="tactical"),
            "operations": Agent(name="OpsLead", role="tactical")
        }
        self.teams = {
            "engineering": [Agent(f"Engineer-{i}") for i in range(3)],
            "product": [Agent(f"PM-{i}") for i in range(2)],
            "operations": [Agent(f"Ops-{i}") for i in range(2)]
        }
    
    async def process_request(self, request: str) -> Response:
        # CEO determines strategy
        strategy = await self.ceo.analyze(request)
        
        # Route to appropriate department
        if strategy.department == "engineering":
            return await self.handle_engineering(strategy)
        elif strategy.department == "product":
            return await self.handle_product(strategy)
        # ...
    
    async def handle_engineering(self, strategy):
        lead = self.department_leads["engineering"]
        
        # Lead breaks into tasks
        tasks = await lead.decompose(strategy)
        
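        # Team executes in parallel; tasks wrap around the team round-robin,
        # so none are dropped when there are more tasks than engineers
        team = self.teams["engineering"]
        results = await asyncio.gather(*[
            team[i % len(team)].execute(task)
            for i, task in enumerate(tasks)
        ])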
        
        # Lead synthesizes results
        return await lead.synthesize(results)

2. Network Architecture

┌─────────────────────────────────────────────────────────────────────┐
│              NETWORK AGENT SYSTEM                                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│                         ┌─────────┐                                  │
│                      ┌──│  Hub    │──┐                              │
│                      │  │  Agent  │  │                              │
│                      │  └─────────┘  │                              │
│                      │               │                               │
│         ┌────────────┼───────────────┼────────────┐                  │
│         │            │               │            │                  │
│         ▼            ▼               ▼            ▼                  │
│    ┌────────┐  ┌────────┐     ┌────────┐  ┌────────┐              │
│    │ Search │  │  Code  │     │  Data  │  │  Web   │              │
│    │ Agent  │  │ Agent  │────▶│ Agent  │──│ Agent  │              │
│    └────────┘  └────────┘     └────────┘  └────────┘              │
│        │            │               │            │                  │
│        └────────────┴───────────────┴────────────┘                  │
│                          │                                            │
│                          ▼                                            │
│                    ┌─────────┐                                        │
│                    │ Result  │                                        │
│                    │ Aggregat│                                        │
│                    └─────────┘                                        │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
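
The diagram above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a reference implementation: NetworkAgent, HubAgent, handle and dispatch are hypothetical names, and each agent's work is stubbed with a string instead of a model or tool call. The hub fans a task out to every agent, agents may forward the task to their direct peers, and the hub aggregates the findings.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class NetworkAgent:
    """Hypothetical peer agent: does local work, then consults its peers."""
    name: str
    peers: list["NetworkAgent"] = field(default_factory=list)

    async def handle(self, task: str, visited: frozenset = frozenset()) -> list[str]:
        # Stand-in for a real model or tool call.
        findings = [f"{self.name}: handled '{task}'"]
        visited = visited | {self.name}
        # Forward only to peers not yet on this path, so a cycle in the
        # network cannot cause infinite forwarding.
        for peer in self.peers:
            if peer.name not in visited:
                findings += await peer.handle(task, visited)
        return findings


class HubAgent:
    """Hypothetical hub: fans a task out to all agents and aggregates results."""

    def __init__(self, agents: list[NetworkAgent]):
        self.agents = agents

    async def dispatch(self, task: str) -> list[str]:
        per_agent = await asyncio.gather(*(a.handle(task) for a in self.agents))
        # Aggregate: flatten and drop duplicates, keeping first-seen order.
        seen, merged = set(), []
        for findings in per_agent:
            for item in findings:
                if item not in seen:
                    seen.add(item)
                    merged.append(item)
        return merged


async def main() -> list[str]:
    search, code, data, web = (NetworkAgent(n) for n in ("Search", "Code", "Data", "Web"))
    code.peers.append(data)  # Code -> Data link from the diagram
    data.peers.append(web)   # Data -> Web link from the diagram
    return await HubAgent([search, code, data, web]).dispatch("summarize repo")
```

Running asyncio.run(main()) returns one finding per agent. The aggregator removes duplicates because an agent can be reached both from the hub and through a peer link.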

3. Committee Architecture

# Multiple agents vote on decisions
import asyncio

class CommitteeMultiAgent:
    def __init__(self, agents: list):
        self.agents = agents
    
    async def make_decision(self, question: str) -> Decision:
        # Get opinions from all agents
        opinions = await asyncio.gather(*[
            agent.think(question) for agent in self.agents
        ])
        
        # Aggregate opinions
        return self.vote(opinions)
    
    def vote(self, opinions: list) -> Decision:
        # Majority vote
        votes = {}
        for opinion in opinions:
            votes[opinion.choice] = votes.get(opinion.choice, 0) + 1
        
        winner = max(votes, key=votes.get)
        
        return Decision(
            choice=winner,
            confidence=votes[winner] / len(opinions),
            opinions=opinions
        )
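
The max(votes, key=votes.get) call above resolves ties silently in favor of whichever choice was counted first. For critical decisions a stricter rule may be preferable. The sketch below assumes choices are plain strings; majority_vote and its quorum parameter are hypothetical names introduced for illustration.

```python
from collections import Counter
from typing import Optional


def majority_vote(choices: list[str], quorum: float = 0.5) -> Optional[str]:
    """Return the winning choice, or None when no choice clears the quorum.

    The winner must hold strictly more than `quorum` of the votes, which
    also rules out two-way ties at the default of 0.5.
    """
    if not choices:
        return None
    winner, count = Counter(choices).most_common(1)[0]
    return winner if count / len(choices) > quorum else None
```

A None result can then be routed to a fallback, such as asking the committee again or escalating to a human.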

Communication Patterns

1. Message Passing

# Agent message passing
import uuid
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any

@dataclass
class AgentMessage:
    sender: str
    receiver: str
    content: Any
    type: str  # request, response, broadcast
    # Defaults to a fresh ID so a new conversation need not supply one
    thread_id: str = field(default_factory=lambda: str(uuid.uuid4()))

class MessageBus:
    def __init__(self):
        self.inbox = defaultdict(list)
        self.agents = set()  # names of registered agents
    
    def register(self, agent_name: str):
        self.agents.add(agent_name)
    
    async def send(self, message: AgentMessage):
        self.inbox[message.receiver].append(message)
    
    async def broadcast(self, sender: str, content: Any):
        thread_id = str(uuid.uuid4())  # one thread for the whole broadcast
        
        # Deliver a copy addressed to every registered agent except the sender
        for agent_name in self.agents:
            if agent_name != sender:
                await self.send(AgentMessage(
                    sender=sender,
                    receiver=agent_name,
                    content=content,
                    type="broadcast",
                    thread_id=thread_id
                ))
    
    async def receive(self, receiver: str) -> list:
        # Drain and return the receiver's pending messages
        messages = self.inbox[receiver]
        self.inbox[receiver] = []
        return messages


# Agent using message bus
class CommunicatingAgent:
    def __init__(self, name: str, bus: MessageBus):
        self.name = name
        self.bus = bus
        bus.register(name)  # make this agent reachable by broadcasts
    
    async def request_help(self, target: str, request: str):
        # No thread_id given, so this message starts a new thread
        msg = AgentMessage(
            sender=self.name,
            receiver=target,
            content=request,
            type="request"
        )
        await self.bus.send(msg)
    
    async def respond_to(self, request: AgentMessage, response: str):
        # Reuse the request's thread_id so the reply joins the same thread
        msg = AgentMessage(
            sender=self.name,
            receiver=request.sender,
            content=response,
            type="response",
            thread_id=request.thread_id
        )
        await self.bus.send(msg)

2. Shared State

# Agents share state via a shared store (in-process here, guarded by a lock)
import asyncio
from typing import Any, Callable

class SharedStateManager:
    def __init__(self):
        self.state = {}
        self.lock = asyncio.Lock()
    
    async def read(self, key: str) -> Any:
        return self.state.get(key)
    
    async def write(self, key: str, value: Any):
        async with self.lock:
            self.state[key] = value
    
    async def update(self, key: str, updater: Callable[[Any], Any]):
        # Read-modify-write under the lock so concurrent updates are not lost
        async with self.lock:
            old = self.state.get(key)
            new = updater(old)
            self.state[key] = new


# Usage
state = SharedStateManager()

class SharedAgent:
    def __init__(self, name: str, state: SharedStateManager):
        self.name = name
        self.state = state
    
    async def contribute(self, key: str, data: dict):
        await self.state.update(key, lambda current: {
            **(current or {}),
            self.name: data
        })
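
Why the lock matters: asyncio tasks only interleave at await points, so a read-modify-write becomes unsafe as soon as an await sits between the read and the write (for example, an update that waits on a model call). The sketch below contrasts the two cases. The function names are hypothetical, and asyncio.sleep(0) stands in for any awaited work.

```python
import asyncio


async def unsafe_increment(state: dict, key: str) -> None:
    current = state.get(key, 0)
    await asyncio.sleep(0)        # yields to other tasks mid-update
    state[key] = current + 1      # may overwrite a concurrent write


async def safe_increment(state: dict, key: str, lock: asyncio.Lock) -> None:
    async with lock:              # the whole read-modify-write is exclusive
        current = state.get(key, 0)
        await asyncio.sleep(0)
        state[key] = current + 1


async def run_demo(n: int = 10) -> tuple[int, int]:
    unsafe, safe, lock = {}, {}, asyncio.Lock()
    await asyncio.gather(*(unsafe_increment(unsafe, "hits") for _ in range(n)))
    await asyncio.gather(*(safe_increment(safe, "hits", lock) for _ in range(n)))
    return unsafe["hits"], safe["hits"]
```

With the lock, all n increments are counted. Without it, tasks read the same stale value and increments are lost.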

3. Blackboard Pattern

# Shared blackboard for problem solving
class Blackboard:
    def __init__(self):
        self.content = {}
        self.subscribers = []
    
    def subscribe(self, agent, interest: str):
        self.subscribers.append((agent, interest))
    
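    async def post(self, source: str, content: dict):
        self.content[source] = content
        
        # Notify interested agents, skipping the poster so an agent
        # cannot trigger itself with its own output
        for agent, interest in self.subscribers:
            if agent.name != source and interest in content:
                await agent.notify(content)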


class BlackboardAgent:
    def __init__(self, name: str, blackboard: Blackboard, interests: list):
        self.name = name
        self.blackboard = blackboard
        for interest in interests:
            blackboard.subscribe(self, interest)
    
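    async def notify(self, content: dict):
        # Process new information; process() is left to subclasses and
        # returns a dict to post, or None when there is nothing to add
        result = await self.process(content)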
        
        if result:
            await self.blackboard.post(self.name, result)
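
Because agents post in response to notifications, a blackboard needs a stopping rule or agents can keep re-triggering each other. One possible rule, sketched below under simplifying assumptions, is to ignore posts that do not change the board. Board, tokenizer and counter are hypothetical names, and the "agents" are plain coroutines keyed by topic.

```python
import asyncio


class Board:
    """Hypothetical minimal blackboard: topic -> value, with topic subscribers."""

    def __init__(self):
        self.content: dict[str, object] = {}
        self.subscribers: dict[str, list] = {}

    def subscribe(self, topic: str, handler) -> None:
        self.subscribers.setdefault(topic, []).append(handler)

    async def post(self, topic: str, value: object) -> None:
        if self.content.get(topic) == value:
            return                      # nothing new: stop the notification chain
        self.content[topic] = value
        for handler in self.subscribers.get(topic, []):
            await handler(self, value)


async def tokenizer(board: Board, text: str) -> None:
    # Reacts to raw text by posting its tokens.
    await board.post("tokens", text.split())


async def counter(board: Board, tokens: list) -> None:
    # Reacts to tokens by posting how many there are.
    await board.post("count", len(tokens))


async def solve(text: str) -> dict:
    board = Board()
    board.subscribe("text", tokenizer)
    board.subscribe("tokens", counter)
    await board.post("text", text)
    return board.content
```

Posting "text" triggers the tokenizer, whose output triggers the counter. The chain ends when a post has no subscribers or repeats an existing value.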

Specialization Strategies

1. Role-Based Agents

# Define specialized agents
specialists = {
    "researcher": Agent(
        name="Researcher",
        role="Find information",
        system_prompt="""You are a research specialist. 
        Your job is to find accurate, relevant information.""",
        tools=["web_search", "browse", "read_pdf"]
    ),
    
    "coder": Agent(
        name="Coder",
        role="Write code",
        system_prompt="""You are a coding specialist.
        Your job is to write clean, correct code.""",
        tools=["read_file", "write_file", "execute_code"]
    ),
    
    "reviewer": Agent(
        name="Reviewer",
        role="Review and critique",
        system_prompt="""You are a review specialist.
        Your job is to find issues and improve quality.""",
        tools=["analyze_code", "run_tests"]
    ),
    
    "writer": Agent(
        name="Writer",
        role="Create content",
        system_prompt="""You are a writing specialist.
        Your job is to create clear, engaging content.""",
        tools=["write_file", "format_markdown"]
    )
}

# Workflow with specialists
async def code_review_pipeline(code: str):
    # Research
    context = await specialists["researcher"].execute(
        f"Find best practices for this code pattern:\n{code}"
    )
    
    # Rewrite with context
    code = await specialists["coder"].execute(
        f"Improve this code considering: {context}\n\n{code}"
    )
    
    # Review
    issues = await specialists["reviewer"].execute(code)
    
    # Fix
    if issues:
        code = await specialists["coder"].execute(
            f"Fix these issues: {issues}\n\n{code}"
        )
    
    # Document
    docs = await specialists["writer"].execute(
        f"Document this code: {code}"
    )
    
    return {"code": code, "docs": docs}

2. Dynamic Agent Selection

class DynamicRouter:
    def __init__(self, agents: dict):
        self.agents = agents
    
    async def route(self, task: str) -> Agent:
        # Analyze task requirements; analyze_task (not shown) is assumed to
        # return a dict like {"tools": [...], "keywords": [...]}
        requirements = await self.analyze_task(task)
        
        # Score each agent
        scores = {}
        for name, agent in self.agents.items():
            score = await self.score_agent(agent, requirements)
            scores[name] = score
        
        # Select best agent
        best = max(scores, key=scores.get)
        
        return self.agents[best]
    
    async def score_agent(self, agent: Agent, requirements: dict) -> float:
        # Simple heuristic scoring
        score = 0
        
        # Check tool match
        for req_tool in requirements.get("tools", []):
            if req_tool in agent.tools:
                score += 1
        
        # Check domain match
        for keyword in requirements.get("keywords", []):
            if keyword in agent.system_prompt.lower():
                score += 0.5
        
        return score

Collaboration Patterns

1. Sequential Pipeline

# Agents work in sequence
async def sequential_pipeline(task: str, agents: list):
    result = task
    
    for agent in agents:
        result = await agent.execute(result)
    
    return result


# Example: Research -> Outline -> Write -> Edit -> Publish
workflow = await sequential_pipeline(
    "Write about AI agents",
    agents=[
        research_agent,   # Gather information
        outline_agent,    # Create structure
        writer_agent,     # Write content
        editor_agent,     # Edit and refine
        publisher_agent   # Format and publish
    ]
)

2. Parallel Execution

# Agents work simultaneously
import asyncio

async def parallel_pipeline(task: str, agents: list):
    # All agents work on same task
    results = await asyncio.gather(*[
        agent.execute(task) for agent in agents
    ])
    
    # Combine results
    return combine_results(results)


# Example: Multiple perspectives
perspectives = await parallel_pipeline(
    "Analyze this investment",
    agents=[
        risk_agent,         # Analyze risks
        opportunity_agent,  # Find opportunities
        compliance_agent,   # Check compliance
        financial_agent     # Model financials
    ]
)
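
The combine_results helper used above is not defined in this guide. One possible version, assuming each agent returns plain text, is a labelled report. The labels parameter is a hypothetical addition. asyncio.gather returns results in the same order as the awaitables it was given, so labels can follow the order of the agents list.

```python
from typing import Optional


def combine_results(results: list[str], labels: Optional[list[str]] = None) -> str:
    """Join per-agent outputs into one labelled report.

    `labels` defaults to "Agent 1", "Agent 2", ... in the order the
    results were passed in.
    """
    if labels is None:
        labels = [f"Agent {i}" for i in range(1, len(results) + 1)]
    if len(labels) != len(results):
        raise ValueError("labels and results must have the same length")
    sections = [f"[{label}]\n{text.strip()}" for label, text in zip(labels, results)]
    return "\n\n".join(sections)
```

For structured outputs, a merge step that reconciles conflicting fields would replace the simple join.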

3. Iterative Refinement

# Agents refine each other's work
async def iterative_refinement(task: str, agent_a: Agent, agent_b: Agent, iterations: int = 3):
    current = await agent_a.execute(task)
    
    for _ in range(iterations):
        # Agent B critiques
        feedback = await agent_b.review(current)
        
        # Stop once the critic accepts the draft
        if feedback.is_acceptable:
            break
        
        # Agent A improves
        current = await agent_a.improve(current, feedback)
    
    return current


# Example: Writer/Editor
final_draft = await iterative_refinement(
    article,
    agent_a=writer_agent,
    agent_b=editor_agent,
    iterations=5
)

Real-World Examples

1. Customer Support Team

support_team = {
    "triage": Agent(
        name="Triage Agent",
        role="Route inquiries",
        tools=["classify_intent", "extract_entities"]
    ),
    
    "technical": Agent(
        name="Technical Support",
        role="Solve technical issues",
        tools=["search_kb", "run_diagnostics", "reset_password"]
    ),
    
    "billing": Agent(
        name="Billing Support",
        role="Handle payments",
        tools=["check_balance", "process_refund", "update_subscription"]
    ),
    
    "escalation": Agent(
        name="Escalation Manager",
        role="Handle complex cases",
        tools=["summarize_case", "notify_human"]
    )
}

async def handle_support_ticket(ticket):
    # Triage first
    category = await support_team["triage"].classify(ticket)
    
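    # Route to specialist
    if category == "technical":
        result = await support_team["technical"].solve(ticket)
    elif category == "billing":
        result = await support_team["billing"].resolve(ticket)
    else:
        # No specialist for this category: escalate the raw ticket
        await support_team["escalation"].notify(ticket)
        return None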
    
    # Escalate if needed
    if result.needs_escalation:
        await support_team["escalation"].notify(result)
    
    return result

2. Development Team

import asyncio

dev_team = {
    "architect": Agent(
        name="System Architect",
        role="Design systems"
    ),
    
    "backend": Agent(
        name="Backend Developer",
        role="Build APIs"
    ),
    
    "frontend": Agent(
        name="Frontend Developer",
        role="Build UI"
    ),
    
    "qa": Agent(
        name="QA Engineer",
        role="Test"
    )
}

async def build_feature(feature_spec):
    # Design
    design = await dev_team["architect"].design(feature_spec)
    
    # Split work
    backend_spec, frontend_spec = design.split()
    
    # Parallel development
    backend, frontend = await asyncio.gather(
        dev_team["backend"].implement(backend_spec),
        dev_team["frontend"].implement(frontend_spec)
    )
    
    # Integration
    integrated = await dev_team["backend"].integrate(backend, frontend)
    
    # Test
    test_results = await dev_team["qa"].test(integrated)
    
    return test_results

Coordination Mechanisms

1. Task Allocation

class TaskAllocator:
    def __init__(self, agents: list):
        self.agents = agents
        self.assignments = {}
    
    async def allocate(self, tasks: list) -> dict:
        # Score each task for each agent
        scores = []
        for task in tasks:
            for agent in self.agents:
                score = await self.score(task, agent)
                scores.append((task, agent, score))
        
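        # Greedy assignment: best-scoring pairs first, one task per agent
        scores.sort(key=lambda x: x[2], reverse=True)
        
        assignments = {}
        used_agents = set()
        
        for task, agent, score in scores:
            # Skip tasks that already have an agent, and agents already taken
            if task not in assignments and agent not in used_agents:
                assignments[task] = agent
                used_agents.add(agent)
        
        return assignments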
    
    async def score(self, task: Task, agent: Agent) -> float:
        # Consider: skill match, availability, past performance
        return 1.0  # Simplified
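
The score method above returns a constant, so every pairing looks equally good. A minimal sketch of a scorer that blends the three factors named in its comment (skill match, availability, past performance) might look like the following. AgentProfile, max_load and the weights are illustrative assumptions, not tuned values.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    """Hypothetical stand-in for the fields a scorer might read from an agent."""
    skills: set[str] = field(default_factory=set)
    active_tasks: int = 0
    success_rate: float = 1.0   # fraction of past tasks completed successfully


def score(required_skills: set[str], agent: AgentProfile, max_load: int = 5) -> float:
    """Blend skill match, availability, and past performance into [0, 1]."""
    if agent.active_tasks >= max_load:
        return 0.0              # fully loaded agents are never preferred
    skill_match = (
        len(required_skills & agent.skills) / len(required_skills)
        if required_skills else 1.0
    )
    availability = 1.0 - agent.active_tasks / max_load
    # Weights are assumptions chosen for illustration only.
    return 0.5 * skill_match + 0.3 * availability + 0.2 * agent.success_rate
```

An idle agent with every required skill and a perfect record scores close to 1.0, and the score falls as load rises or skills are missing.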

2. Conflict Resolution

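class ConflictResolver:
    STRATEGIES = ("vote", "weighted_vote", "consensus", "arbitrate")
    
    def resolve(self, agent_outputs: list, strategy: str = "vote") -> Any:
        # Pick one strategy per call:
        #   vote          - simple majority
        #   weighted_vote - votes weighted per agent
        #   consensus     - require agreement
        #   arbitrate     - designated agent decides
        # Only vote() is implemented below; the rest are extension points.
        if strategy not in self.STRATEGIES:
            raise ValueError(f"Unknown strategy: {strategy}")
        return getattr(self, strategy)(agent_outputs)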
    
    def vote(self, outputs: list) -> Any:
        counts = {}
        for output in outputs:
            counts[output] = counts.get(output, 0) + 1
        return max(counts, key=counts.get)
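
Weighted voting is named above but not implemented. A minimal sketch, assuming each output arrives paired with a numeric weight (for example an agent's self-reported confidence or a historical accuracy score), could look like this. The function signature is hypothetical.

```python
from collections import defaultdict
from typing import Hashable


def weighted_vote(outputs: list[tuple[Hashable, float]]) -> Hashable:
    """Pick the output with the largest total weight.

    Each item is (output, weight). Outputs must be hashable so that
    identical answers from different agents accumulate together.
    """
    if not outputs:
        raise ValueError("no outputs to vote on")
    totals: dict[Hashable, float] = defaultdict(float)
    for output, weight in outputs:
        totals[output] += weight
    return max(totals, key=totals.get)
```

Unlike a plain majority, one high-weight agent can outvote several low-weight ones, so the weights should come from a source you trust.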

Scaling Considerations

1. Agent Pool Management

import asyncio

class AgentPool:
    def __init__(self, agent_factory, size: int):
        self.agent_factory = agent_factory
        self.pool = asyncio.Queue()
        self.size = size
        
        # Pre-populate
        for _ in range(size):
            self.pool.put_nowait(agent_factory())
    
    async def acquire(self, timeout: float = 30) -> Agent:
        try:
            return await asyncio.wait_for(
                self.pool.get(),
                timeout=timeout
            )
        except asyncio.TimeoutError:
            # Scale up: no agent was freed in time, so create an extra one
            self.size += 1
            return self.agent_factory()
    
    async def release(self, agent: Agent):
        await self.pool.put(agent)
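
Separate acquire and release calls are easy to unbalance: if a task raises between them, the agent never returns to the pool. One way to guard against that is an async context manager. The sketch below is a simplified, fixed-size variant with hypothetical names (SimplePool, borrow, available) and no scale-up logic.

```python
import asyncio
from contextlib import asynccontextmanager


class SimplePool:
    """Hypothetical fixed-size pool; agents are whatever `factory` returns."""

    def __init__(self, factory, size: int):
        self._queue: asyncio.Queue = asyncio.Queue()
        for _ in range(size):
            self._queue.put_nowait(factory())

    @asynccontextmanager
    async def borrow(self):
        agent = await self._queue.get()   # waits until an agent is free
        try:
            yield agent
        finally:
            # Runs even if the caller's block raises, so agents are not leaked.
            self._queue.put_nowait(agent)

    @property
    def available(self) -> int:
        return self._queue.qsize()


async def demo() -> int:
    pool = SimplePool(object, size=2)
    try:
        async with pool.borrow():
            raise RuntimeError("task failed")
    except RuntimeError:
        pass
    return pool.available                 # both agents are back in the pool
```

Callers write "async with pool.borrow() as agent:" and never call release directly.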

2. Load Balancing

class LoadBalancer:
    def __init__(self, agents: list):
        self.agents = agents
        self.current = 0
    
    def get_next(self) -> Agent:
        # Round-robin
        agent = self.agents[self.current]
        self.current = (self.current + 1) % len(self.agents)
        return agent
    
    def get_least_loaded(self) -> Agent:
        # Pick agent with fewest active tasks
        return min(self.agents, key=lambda a: a.active_tasks)

Conclusion

Multi-agent systems unlock capabilities beyond single agents:

Specialization - Agents excel at specific tasks
Collaboration - Agents work together effectively
Scalability - Add more agents for more throughput
Robustness - Redundancy prevents single points of failure
Flexibility - Dynamic routing and task allocation

Choose your architecture based on your use case: hierarchical for structured organizations, network for flexible collaboration, committee for critical decisions.
