⚡ Calmops

Rust-based AI Agents: Building Autonomous Systems

Designing Reliable, Self-Directed AI Systems in Rust

Artificial intelligence is moving beyond single models answering questions. The next frontier is autonomous agents: AI systems that perceive their environment, make decisions, take actions, and iterate without human intervention. Think of an AI that researches topics, writes code, manages infrastructure, or runs scientific experiments on its own.

Building these systems is complex. They require memory management, reliable concurrency, error handling at scale, and predictable performance. This is where Rust shines. In this post, we’ll explore how to build production-grade AI agents in Rust, covering architecture, frameworks, and practical examples.

What Are AI Agents?

An AI agent is a software entity that:

  1. Perceives its environment (reads data, API responses, sensor inputs)
  2. Reasons about it (using LLMs, decision trees, or other models)
  3. Acts on it (writes files, calls APIs, modifies state)
  4. Learns from outcomes (feedback loops, reward signals)

Classic Agent Loop

┌─────────────────────────────────────────┐
│  Observe (Environment/State)            │
└──────────┬──────────────────────────────┘
           │
           ▼
┌─────────────────────────────────────────┐
│  Reason (LLM, Decision Logic)           │
└──────────┬──────────────────────────────┘
           │
           ▼
┌─────────────────────────────────────────┐
│  Act (Execute Action/Call Tool)         │
└──────────┬──────────────────────────────┘
           │
           ▼
┌─────────────────────────────────────────┐
│  Learn (Update Memory, Feedback)        │
└──────────┬──────────────────────────────┘
           │
           └──────────┐
                      │ (repeat)
                      ▼
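A minimal, synchronous sketch of this loop in plain Rust (ToyAgent and its canned reasoning are purely illustrative; a real agent would call an LLM in `reason` and real tools in `act`):

```rust
// Toy observe-reason-act-learn loop. Everything here is a stand-in:
// `reason` fakes an LLM decision, `act` fakes a tool call.
#[derive(Debug)]
enum Action {
    Search(String),
    Finish(String),
}

struct ToyAgent {
    memory: Vec<String>, // observations so far
}

impl ToyAgent {
    fn observe(&self) -> String {
        format!("{} observations so far", self.memory.len())
    }

    fn reason(&self, observation: &str) -> Action {
        // Stand-in for an LLM call: finish after two observations.
        if self.memory.len() >= 2 {
            Action::Finish(format!("done after {}", observation))
        } else {
            Action::Search("rust agents".to_string())
        }
    }

    fn act(&self, action: &Action) -> String {
        match action {
            Action::Search(q) => format!("results for '{}'", q),
            Action::Finish(msg) => msg.clone(),
        }
    }

    fn learn(&mut self, result: String) {
        self.memory.push(result); // feedback updates memory
    }

    fn run(&mut self) -> String {
        loop {
            let obs = self.observe();
            match self.reason(&obs) {
                Action::Finish(msg) => return msg,
                action => {
                    let result = self.act(&action);
                    self.learn(result);
                }
            }
        }
    }
}
```

The point is the shape, not the logic: each iteration threads observations through reasoning into action, then feeds the result back into memory.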

Why Rust for AI Agents?

Safety and Reliability

  • No data races: Rust’s borrow checker prevents concurrent-access bugs, which is critical when agents coordinate actions
  • Memory safety: no crashes from null pointers or use-after-free, so agents can run unattended
  • Deterministic behavior: strong typing and explicit state make agent behavior predictable

Performance

  • Low latency: Agents making decisions in microseconds, not milliseconds
  • Efficient memory: Run multiple agents on limited hardware
  • Zero-cost abstractions: Complex agent logic with negligible overhead

Production Readiness

  • Single binary: Deploy agents anywhere without runtime dependencies
  • Fearless concurrency: Multi-agent systems with safe parallelism
  • Explicit error handling: Agents handle failures gracefully
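To make the error-handling point concrete, here is a dependency-free sketch of bounded retries around a fallible operation (`with_retries` is a hypothetical helper, not from any crate):

```rust
// Retry a fallible operation a bounded number of times (illustrative helper).
fn with_retries<T, E: std::fmt::Debug>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut last_err = None;
    for attempt in 1..=max_attempts {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) => {
                eprintln!("attempt {} failed: {:?}", attempt, e);
                last_err = Some(e);
            }
        }
    }
    Err(last_err.expect("max_attempts must be >= 1"))
}
```

Because failures are values, the retry policy is ordinary code the compiler checks, rather than an exception path discovered at runtime.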

Core Components of an AI Agent in Rust

1. State Management

Agents need to track their context, beliefs, and experience:

use std::collections::HashMap;
use chrono::{DateTime, Utc};

#[derive(Clone, Debug)]
pub struct AgentMemory {
    /// Short-term context for current task
    pub context: Vec<String>,
    /// Long-term facts and experiences
    pub knowledge: HashMap<String, String>,
    /// Timestamped observations
    pub observations: Vec<(DateTime<Utc>, String)>,
    /// Action history for reflection
    pub action_history: Vec<AgentAction>,
}

#[derive(Clone, Debug)]
pub struct AgentAction {
    pub action_type: String,
    pub description: String,
    pub result: Option<String>,
    pub timestamp: DateTime<Utc>,
}

impl AgentMemory {
    pub fn new() -> Self {
        Self {
            context: Vec::new(),
            knowledge: HashMap::new(),
            observations: Vec::new(),
            action_history: Vec::new(),
        }
    }

    pub fn add_observation(&mut self, observation: String) {
        self.observations.push((Utc::now(), observation));
    }

    pub fn recall_recent(&self, count: usize) -> Vec<String> {
        self.observations
            .iter()
            .rev()
            .take(count)
            .map(|(_, obs)| obs.clone())
            .collect()
    }
}

2. Tool Interface

Agents interact with the world through tools (APIs, functions, system calls):

use async_trait::async_trait;
use serde_json::{json, Value};

#[async_trait]
pub trait AgentTool: Send + Sync {
    /// Tool name (used by agent to identify it)
    fn name(&self) -> &str;

    /// Tool description (given to LLM for understanding)
    fn description(&self) -> &str;

    /// Tool parameters schema (for LLM to understand inputs)
    fn schema(&self) -> Value;

    /// Execute the tool
    async fn execute(&self, params: Value) -> Result<String, String>;
}

/// Example: Web search tool
pub struct WebSearchTool;

#[async_trait]
impl AgentTool for WebSearchTool {
    fn name(&self) -> &str {
        "web_search"
    }

    fn description(&self) -> &str {
        "Search the web for information about a topic"
    }

    fn schema(&self) -> Value {
        json!({
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query"
                }
            },
            "required": ["query"]
        })
    }

    async fn execute(&self, params: Value) -> Result<String, String> {
        let query = params["query"]
            .as_str()
            .ok_or("Missing query parameter")?;
        
        // Call search API
        let results = perform_search(query).await?;
        Ok(results)
    }
}

async fn perform_search(query: &str) -> Result<String, String> {
    // TODO: Implement actual search
    Ok(format!("Search results for: {}", query))
}

3. Agent Core

The agent orchestrates memory, reasoning, and tool use:

use async_trait::async_trait;
use serde_json::{json, Value};

pub struct Agent {
    pub name: String,
    pub memory: AgentMemory,
    pub tools: Vec<Box<dyn AgentTool>>,
    pub llm_client: LLMClient,
}

impl Agent {
    pub fn new(name: String, llm_client: LLMClient) -> Self {
        Self {
            name,
            memory: AgentMemory::new(),
            tools: Vec::new(),
            llm_client,
        }
    }

    pub fn add_tool(&mut self, tool: Box<dyn AgentTool>) {
        self.tools.push(tool);
    }

    /// Main agent loop: think, plan, act
    pub async fn run(&mut self, goal: String) -> Result<String, String> {
        self.memory.context.push(goal.clone());
        
        let mut iterations = 0;
        const MAX_ITERATIONS: usize = 10;

        loop {
            iterations += 1;
            if iterations > MAX_ITERATIONS {
                return Err("Max iterations reached".to_string());
            }

            // Step 1: Think (get LLM response)
            let response = self.think().await?;
            
            // Step 2: Parse response for action
            let action = self.parse_response(&response)?;
            
            // Step 3: Check if done
            if action.action_type == "finish" {
                self.memory.context.push(action.description.clone());
                return Ok(action.description);
            }

            // Step 4: Execute action
            let result = self.execute_action(&action).await?;
            
            // Step 5: Update memory
            self.memory.action_history.push(AgentAction {
                action_type: action.action_type,
                description: action.description,
                result: Some(result.clone()),
                timestamp: chrono::Utc::now(),
            });

            self.memory.add_observation(result);
        }
    }

    async fn think(&self) -> Result<String, String> {
        // Build prompt with context, tools, and recent observations
        let tool_descriptions = self.tools.iter()
            .map(|tool| format!("{}: {}", tool.name(), tool.description()))
            .collect::<Vec<_>>()
            .join("\n");

        let prompt = format!(
            "You are an autonomous AI agent. Goal: {}\n\nAvailable tools:\n{}\n\nRecent observations:\n{}\n\nWhat is your next action?",
            self.memory.context.join(" "),
            tool_descriptions,
            self.memory.recall_recent(3).join("\n")
        );

        // Call LLM
        self.llm_client.complete(&prompt).await
    }

    fn parse_response(&self, response: &str) -> Result<ParsedAction, String> {
        // Parse LLM response to extract action
        // This is a simplified example; real implementation would be more robust
        
        if response.contains("finish") {
            Ok(ParsedAction {
                action_type: "finish".to_string(),
                description: response.to_string(),
                params: Value::Null,
            })
        } else if response.contains("search") {
            Ok(ParsedAction {
                action_type: "web_search".to_string(),
                description: response.to_string(),
                params: json!({"query": extract_query(response)}),
            })
        } else {
            Err("Unknown action".to_string())
        }
    }

    async fn execute_action(&self, action: &ParsedAction) -> Result<String, String> {
        let tool = self.tools.iter()
            .find(|t| t.name() == action.action_type)
            .ok_or(format!("Tool {} not found", action.action_type))?;

        tool.execute(action.params.clone()).await
    }
}

#[derive(Clone, Debug)]
struct ParsedAction {
    action_type: String,
    description: String,
    params: Value,
}

fn extract_query(response: &str) -> String {
    // TODO: Extract query from response
    response.to_string()
}

// Mock LLM client
pub struct LLMClient;

impl LLMClient {
    pub async fn complete(&self, _prompt: &str) -> Result<String, String> {
        // TODO: Call actual LLM (Candle, Llama.rs, or API)
        Ok("search Wikipedia for quantum computing".to_string())
    }
}
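The keyword matching in parse_response above is brittle. A slightly sturdier (still illustrative) tactic is to prompt the LLM for a fixed line format and parse it with plain string operations; the `ACTION:`/`INPUT:` format here is an assumption, not a standard:

```rust
// Parse lines of the form "ACTION: <tool> | INPUT: <text>" (illustrative format
// that the system prompt would instruct the LLM to emit).
fn parse_action_line(line: &str) -> Option<(String, String)> {
    let rest = line.strip_prefix("ACTION:")?.trim();
    let (tool, input) = rest.split_once('|')?;
    let input = input.trim().strip_prefix("INPUT:")?.trim();
    Some((tool.trim().to_string(), input.to_string()))
}
```

Returning `Option` keeps malformed model output on an explicit failure path, so the agent loop can re-prompt instead of acting on garbage.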

Framework: rig-rs (Emerging Rust Agent Framework)

For production systems, consider rig-rs, an emerging Rust framework designed specifically for AI agents. The example below sketches its style of API; check the crate documentation for the current interface, which is still evolving:

[dependencies]
rig = "0.1"
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Example: Using rig-rs

use rig::completion::Prompt;
use rig::agent::Agent;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize agent
    let mut agent = Agent::new()
        .with_system_prompt("You are a helpful research assistant")
        .with_tools(vec![
            // Add tools here
        ]);

    // Run agent loop
    let result = agent.run("Research the history of Rust programming language").await?;
    
    println!("Agent result: {}", result);
    Ok(())
}

Multi-Agent Systems

Complex problems often require multiple specialized agents coordinating:

pub struct MultiAgentSystem {
    agents: HashMap<String, Agent>,
    coordinator: AgentCoordinator,
}

pub struct AgentCoordinator {
    message_queue: tokio::sync::mpsc::UnboundedSender<AgentMessage>,
}

#[derive(Clone, Debug)]
pub struct AgentMessage {
    pub from: String,
    pub to: String,
    pub content: String,
}

impl MultiAgentSystem {
    pub fn new() -> Self {
        let (tx, _rx) = tokio::sync::mpsc::unbounded_channel();
        Self {
            agents: HashMap::new(),
            coordinator: AgentCoordinator {
                message_queue: tx,
            },
        }
    }

    pub fn add_agent(&mut self, agent: Agent) {
        self.agents.insert(agent.name.clone(), agent);
    }

    pub async fn run(&mut self, goal: String) -> Result<String, String> {
        // Orchestrate multiple agents toward a shared goal.
        // tokio::spawn requires 'static ownership, so move each agent
        // into its task rather than borrowing from self.
        let agents = std::mem::take(&mut self.agents);

        let handles: Vec<_> = agents
            .into_iter()
            .map(|(_name, mut agent)| {
                let goal = goal.clone();
                tokio::spawn(async move { agent.run(goal).await })
            })
            .collect();

        // Aggregate results (map_err converts tokio's JoinError to our String error)
        let mut results = Vec::new();
        for handle in handles {
            results.push(handle.await.map_err(|e| e.to_string())??);
        }

        Ok(results.join("\n"))
    }
}

Real-World Example: Code Review Agent

Here’s a practical agent that reviews code:

pub struct CodeReviewAgent {
    agent: Agent,
}

impl CodeReviewAgent {
    pub async fn new(llm_client: LLMClient) -> Self {
        let mut agent = Agent::new(
            "CodeReviewAgent".to_string(),
            llm_client,
        );

        // Add tools
        agent.add_tool(Box::new(ReadFileTool));
        agent.add_tool(Box::new(AnalyzeCodeTool));
        agent.add_tool(Box::new(GenerateReportTool));

        Self { agent }
    }

    pub async fn review(&mut self, file_path: &str) -> Result<String, String> {
        let goal = format!("Review the code in {} for bugs, style issues, and improvements", file_path);
        self.agent.run(goal).await
    }
}

// Tool implementations
pub struct ReadFileTool;

#[async_trait]
impl AgentTool for ReadFileTool {
    fn name(&self) -> &str {
        "read_file"
    }

    fn description(&self) -> &str {
        "Read and display the contents of a code file"
    }

    fn schema(&self) -> Value {
        json!({
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "File path to read"
                }
            },
            "required": ["path"]
        })
    }

    async fn execute(&self, params: Value) -> Result<String, String> {
        let path = params["path"].as_str().ok_or("Missing path")?;
        tokio::fs::read_to_string(path)
            .await
            .map_err(|e| e.to_string())
    }
}

pub struct AnalyzeCodeTool;

#[async_trait]
impl AgentTool for AnalyzeCodeTool {
    fn name(&self) -> &str {
        "analyze_code"
    }

    fn description(&self) -> &str {
        "Analyze code for issues and improvements"
    }

    fn schema(&self) -> Value {
        json!({
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Code to analyze"
                }
            },
            "required": ["code"]
        })
    }

    async fn execute(&self, params: Value) -> Result<String, String> {
        let code = params["code"].as_str().ok_or("Missing code")?;
        // TODO: Implement actual analysis
        Ok(format!("Analysis of code:\n{}", code))
    }
}

pub struct GenerateReportTool;

#[async_trait]
impl AgentTool for GenerateReportTool {
    fn name(&self) -> &str {
        "generate_report"
    }

    fn description(&self) -> &str {
        "Generate a detailed code review report"
    }

    fn schema(&self) -> Value {
        json!({
            "type": "object",
            "properties": {
                "findings": {
                    "type": "string",
                    "description": "Code review findings"
                }
            },
            "required": ["findings"]
        })
    }

    async fn execute(&self, params: Value) -> Result<String, String> {
        let findings = params["findings"].as_str().ok_or("Missing findings")?;
        Ok(format!("# Code Review Report\n\n{}", findings))
    }
}

Key Design Patterns

1. Tool Safety

Wrap tool execution with sandboxing:

impl Agent {
    pub async fn execute_action_safely(&self, action: &ParsedAction) -> Result<String, String> {
        // Timeout protection: abandon tool calls that hang
        let timeout = tokio::time::timeout(
            std::time::Duration::from_secs(30),
            self.execute_action(action),
        )
        .await;

        match timeout {
            Ok(Ok(result)) => Ok(result),
            Ok(Err(e)) => Err(e),
            Err(_) => Err("Tool execution timed out".to_string()),
        }
    }
}
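If part of your system is not async, the same guard can be approximated with a worker thread and a channel using only the standard library (`run_with_timeout` is a hypothetical helper; note the worker thread is not cancelled, merely abandoned):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Run `op` on a worker thread; give up if it takes longer than `timeout`.
fn run_with_timeout<T: Send + 'static>(
    timeout: Duration,
    op: impl FnOnce() -> T + Send + 'static,
) -> Result<T, String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(op()); // receiver may already be gone if we timed out
    });
    rx.recv_timeout(timeout)
        .map_err(|_| "Tool execution timed out".to_string())
}
```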

2. Memory Management

Implement bounded memory to prevent runaway context growth:

impl AgentMemory {
    pub fn prune_old_observations(&mut self, max_age_secs: i64) {
        let cutoff = Utc::now() - chrono::Duration::seconds(max_age_secs);
        self.observations.retain(|(timestamp, _)| *timestamp > cutoff);
    }

    pub fn limit_history(&mut self, max_items: usize) {
        if self.action_history.len() > max_items {
            self.action_history.drain(0..self.action_history.len() - max_items);
        }
    }
}
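The same bounded-memory idea can be expressed with the standard library’s VecDeque, with no chrono dependency (`BoundedMemory` is an illustrative name, not part of the agent code above):

```rust
use std::collections::VecDeque;

// Keep at most `cap` observations, discarding the oldest first.
struct BoundedMemory {
    cap: usize,
    observations: VecDeque<String>,
}

impl BoundedMemory {
    fn new(cap: usize) -> Self {
        Self { cap, observations: VecDeque::new() }
    }

    fn push(&mut self, obs: String) {
        if self.observations.len() == self.cap {
            self.observations.pop_front(); // evict the oldest entry
        }
        self.observations.push_back(obs);
    }
}
```

Eviction on write keeps memory use O(cap) no matter how long the agent runs.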

3. Observability

Log agent decisions for debugging and monitoring:

use tracing::{info, warn, error};

impl Agent {
    pub async fn run(&mut self, goal: String) -> Result<String, String> {
        info!("Agent {} starting with goal: {}", self.name, goal);

        let result = self.run_inner(goal).await?; // run_inner: the agent loop shown earlier

        info!("Agent {} completed: {}", self.name, result);
        Ok(result)
    }
}

Challenges and Solutions

Challenge 1: Token Limits

Solution: Implement prompt optimization and summarization

fn compress_memory(&self) -> String {
    // Summarize old observations, keep only recent action history,
    // and use embeddings for retrieval-augmented recall.
    // TODO: implement; for now, fall back to the most recent observations.
    self.memory.recall_recent(5).join("\n")
}
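A concrete, dependency-free starting point is a character budget: keep the newest entries whole and drop older ones until the prompt fits. This is a crude stand-in for real summarization (`fit_to_budget` is a hypothetical helper):

```rust
// Keep the most recent entries whose total length fits within `budget` chars.
fn fit_to_budget(entries: &[String], budget: usize) -> Vec<String> {
    let mut kept = Vec::new();
    let mut used = 0;
    for entry in entries.iter().rev() {
        if used + entry.len() > budget {
            break; // anything older than this won't fit either
        }
        used += entry.len();
        kept.push(entry.clone());
    }
    kept.reverse(); // restore chronological order
    kept
}
```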

Challenge 2: Infinite Loops

Solution: Enforce iteration limits and duplicate detection

pub async fn run(&mut self, goal: String) -> Result<String, String> {
    const MAX_ITERATIONS: usize = 10;
    let mut iterations = 0;
    let mut seen_actions = std::collections::HashSet::new();

    loop {
        iterations += 1;
        if iterations > MAX_ITERATIONS {
            return Err("Max iterations exceeded".to_string());
        }

        // think() returns raw LLM text; parse it into a structured action
        let response = self.think().await?;
        let action = self.parse_response(&response)?;

        // Bail out if the agent keeps choosing the same action
        if seen_actions.contains(&action.action_type) && iterations > 3 {
            return Err("Agent is repeating actions".to_string());
        }
        seen_actions.insert(action.action_type.clone());

        // ... execute action, update memory ...
    }
}

Challenge 3: Tool Hallucination

Solution: Strict tool validation before execution

fn validate_action(&self, action: &ParsedAction) -> Result<(), String> {
    // Verify tool exists
    self.tools.iter()
        .find(|t| t.name() == action.action_type)
        .ok_or(format!("Unknown tool: {}", action.action_type))?;

    // Validate parameters against schema
    // Reject unsafe patterns (command injection, etc.)
    
    Ok(())
}

Emerging Frameworks & Libraries

  • rig-rs: Agent framework with LLM integrations
  • autogen-rs: Rust binding of Microsoft’s AutoGen
  • async-trait: Enable async trait methods (essential for agents)
  • tokio: Async runtime for concurrent agents
  • tracing: Structured logging for agent observability

Conclusion

Rust is uniquely suited for building reliable, efficient AI agents. The combination of:

  • Memory safety → reliable autonomous systems
  • Concurrency → multi-agent coordination
  • Performance → real-time decision-making
  • Deployment → single binaries, low overhead

…makes Rust the ideal platform for the next generation of autonomous AI systems.

Whether you’re building a personal research assistant, a data pipeline orchestrator, or a fleet of coordinated agents, Rust provides the safety guarantees and performance characteristics needed for production deployments.
