Introduction to AI Agents

Understand the fundamentals of AI agents and how they're transforming the way we interact with technology

Understanding AI Agents

AI agents represent the next evolution in artificial intelligence, moving beyond passive models that simply respond to prompts toward autonomous systems that can perceive, reason, plan, and act to achieve specific goals.

Key Insight

AI agents combine the reasoning capabilities of large language models with the ability to interact with the external world through tools, creating systems that can autonomously solve complex problems and perform tasks on behalf of users.

What Makes an AI Agent?

An AI agent is characterised by several key capabilities that distinguish it from traditional AI systems: it pursues explicit goals rather than producing single responses, selects and uses external tools, maintains memory across interactions, plans multi-step work, and adapts its behaviour based on the results of its actions. The table below shows where agents sit in the broader evolution of AI systems.

The Evolution of AI Systems

| System Type | Characteristics | Examples |
|---|---|---|
| Traditional AI | Rule-based, narrow focus, explicit programming | Expert systems, chess engines, traditional chatbots |
| Machine Learning | Data-driven, pattern recognition, statistical models | Recommendation systems, image classifiers, predictive analytics |
| Large Language Models | Generative, broad knowledge, natural language understanding | ChatGPT, Claude, Llama, text generation systems |
| AI Agents | Autonomous, goal-oriented, tool-using, adaptive | Personal assistants, autonomous researchers, workflow automators |

The Agent Cognition Loop

At the core of every AI agent is the agent cognition loop—a continuous cycle of perception, reasoning, planning, and action that allows the agent to interact with its environment and work toward its goals.

The Basic Agent Loop:

  1. Observe: Gather information from the environment or user input
  2. Orient: Interpret the information and update internal state
  3. Decide: Determine the best course of action based on goals and current state
  4. Act: Execute the chosen action using available tools
  5. Learn: Update knowledge and strategies based on outcomes

Agent Loop Implementation

class AIAgent:
    def __init__(self, llm, tools, memory=None):
        self.llm = llm
        self.tools = {tool.name: tool for tool in tools}
        self.memory = memory or []
        self.state = {}
    
    def run(self, user_input):
        """Execute the agent cognition loop."""
        # 1. Observe - gather input
        observation = self._observe(user_input)
        
        # 2. Orient - interpret and update state
        self._orient(observation)
        
        # Continue the loop until goal is reached or max iterations
        max_iterations = 10
        for _ in range(max_iterations):
            # 3. Decide - determine next action
            action = self._decide()
            
            # 4. Act - execute the action
            result = self._act(action)
            
            # 5. Learn - update knowledge based on result
            self._learn(action, result)
            
            # Check if we've reached the goal
            if self._goal_achieved():
                break
            
            # Update state with new observation
            self._orient(result)
        
        return self._generate_response()
    
    def _observe(self, user_input):
        """Gather information from user input."""
        return {"type": "user_input", "content": user_input}
    
    def _orient(self, observation):
        """Interpret the observation and update internal state."""
        # Add observation to memory
        self.memory.append(observation)
        
        # Update state based on observation
        if observation["type"] == "user_input":
            self.state["current_goal"] = self._extract_goal(observation["content"])
        elif observation["type"] == "tool_result":
            self.state["last_tool_result"] = observation["content"]
    
    def _decide(self):
        """Determine the next action based on current state."""
        # Construct prompt with memory and state
        prompt = self._construct_decision_prompt()
        
        # Get decision from LLM
        response = self.llm.predict(prompt)
        
        # Parse the response to extract tool name and arguments
        tool_name, tool_args = self._parse_tool_call(response)
        
        return {"tool": tool_name, "args": tool_args}
    
    def _act(self, action):
        """Execute the chosen action using available tools."""
        tool_name = action["tool"]
        tool_args = action["args"]
        
        if tool_name in self.tools:
            tool = self.tools[tool_name]
            try:
                result = tool(**tool_args)
                return {"type": "tool_result", "tool": tool_name, "content": result}
            except Exception as e:
                return {"type": "error", "tool": tool_name, "content": str(e)}
        else:
            return {"type": "error", "content": f"Tool {tool_name} not found"}
    
    def _learn(self, action, result):
        """Update knowledge and strategies based on action results."""
        # Add action and result to memory
        self.memory.append({"type": "action", "content": action})
        self.memory.append(result)
        
        # Update state based on result
        if result["type"] == "error":
            self.state["last_error"] = result["content"]
        
        # Could implement more sophisticated learning here
    
    def _goal_achieved(self):
        """Check if the current goal has been achieved."""
        # Construct prompt to check goal completion
        prompt = self._construct_goal_check_prompt()
        
        # Get assessment from LLM
        response = self.llm.predict(prompt)
        
        # Parse response to determine if goal is achieved
        return "GOAL_ACHIEVED" in response
    
    def _generate_response(self):
        """Generate a final response to the user."""
        # Construct prompt for response generation
        prompt = self._construct_response_prompt()
        
        # Get response from LLM
        return self.llm.predict(prompt)
    
    # Helper methods
    def _extract_goal(self, user_input):
        """Extract the user's goal from their input."""
        prompt = f"Extract the user's goal from their input: {user_input}"
        return self.llm.predict(prompt)
    
    def _construct_decision_prompt(self):
        """Construct a prompt for the decision phase."""
        # Implementation details omitted for brevity
        pass
    
    def _parse_tool_call(self, llm_response):
        """Parse the LLM's response to extract tool name and arguments."""
        # Implementation details omitted for brevity
        pass
    
    def _construct_goal_check_prompt(self):
        """Construct a prompt to check if the goal has been achieved."""
        # Implementation details omitted for brevity
        pass
    
    def _construct_response_prompt(self):
        """Construct a prompt for generating the final response."""
        # Implementation details omitted for brevity
        pass

Types of AI Agents

AI agents come in various forms, each designed for specific types of tasks and interaction patterns.

By Autonomy Level

Autonomy Spectrum:

| Agent Type | Autonomy Level | Human Involvement | Best For |
|---|---|---|---|
| Assistive Agents | Low | Frequent guidance and confirmation | High-stakes decisions, creative tasks, personalised assistance |
| Collaborative Agents | Medium | Occasional input and direction | Complex problem-solving, research, content creation |
| Autonomous Agents | High | Initial setup and periodic review | Routine tasks, monitoring, data processing, scheduled actions |
| Fully Autonomous Systems | Very High | Oversight only | Continuous operations, real-time responses, system management |
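One lightweight way to implement the assistive end of this spectrum is a confirmation gate: before the agent executes an action, it asks the user to approve it, while a high-autonomy agent skips the question entirely. The sketch below is illustrative only; the action format mirrors the AIAgent example above, and the ConfirmationGate class and its confirm callback are assumptions rather than part of any particular framework.

# Minimal human-in-the-loop sketch: gate actions behind user approval
# according to the agent's configured autonomy level. Illustrative only.
from typing import Any, Callable, Dict

class ConfirmationGate:
    def __init__(self, autonomy: str = "assistive", confirm: Callable[[str], bool] = None):
        self.autonomy = autonomy
        # Default to a console prompt; a real agent would surface this in its UI
        self.confirm = confirm or (lambda message: input(f"{message} [y/N] ").strip().lower() == "y")

    def approve(self, action: Dict[str, Any]) -> bool:
        """Return True if the action may run."""
        if self.autonomy == "autonomous":
            return True  # high-autonomy agents act without asking
        return self.confirm(f"Run tool '{action.get('tool')}' with arguments {action.get('args')}?")

# Example usage: check an action before handing it to the agent's _act step
gate = ConfirmationGate(autonomy="assistive", confirm=lambda message: True)  # auto-approve for the demo
action = {"tool": "search", "args": {"query": "climate change impacts"}}
if gate.approve(action):
    print("Action approved; executing...")
else:
    print("Action skipped by the user.")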

By Functional Role

Common Agent Roles:

  • Personal Assistants: Help individuals with daily tasks, scheduling, information retrieval
  • Research Agents: Gather, analyze, and synthesize information from multiple sources
  • Creative Agents: Generate content, designs, or creative works based on specifications
  • Workflow Agents: Automate business processes and coordinate tasks across systems
  • Customer Service Agents: Handle inquiries, troubleshoot issues, and provide support
  • Data Agents: Process, analyze, and extract insights from large datasets
  • DevOps Agents: Monitor systems, detect issues, and manage infrastructure
  • Learning Agents: Personalize educational content and provide tutoring

By Architecture

Agent Architectures:

| Architecture | Description | Advantages | Limitations |
|---|---|---|---|
| Single-LLM Agents | One LLM handles all reasoning and decision-making | Simple, coherent reasoning, easy to implement | Limited by context window, potential for hallucination |
| Multi-LLM Agents | Different LLMs handle specialized tasks | Specialized expertise, cost optimization | Coordination overhead, potential inconsistencies |
| Hierarchical Agents | Manager agents delegate to specialized sub-agents | Complex task handling, separation of concerns | Complex implementation, communication overhead |
| Multi-Agent Systems | Multiple agents collaborate to solve problems | Parallel processing, diverse perspectives | Coordination challenges, resource intensive |
| Hybrid Symbolic-Neural Agents | Combine LLMs with symbolic reasoning systems | Reliable reasoning, verifiable outputs | Implementation complexity, integration challenges |
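To make the hierarchical row concrete, here is a minimal sketch of a manager agent that asks its LLM to split a goal into subtasks and routes each one to a specialised sub-agent. The predict(prompt) interface matches the other examples in this section; the role names and the "role: subtask" plan format are assumptions chosen for illustration, not a prescribed architecture.

# Hierarchical delegation sketch. The llm object is assumed to expose
# predict(prompt) -> str, as in the other examples in this section.
from typing import Dict

class SubAgent:
    def __init__(self, role: str, llm):
        self.role = role
        self.llm = llm

    def handle(self, subtask: str) -> str:
        """Complete a single delegated subtask."""
        return self.llm.predict(f"You are a {self.role} agent. Complete this subtask: {subtask}")

class ManagerAgent:
    def __init__(self, llm, sub_agents: Dict[str, SubAgent]):
        self.llm = llm
        self.sub_agents = sub_agents  # role name -> sub-agent

    def run(self, goal: str) -> str:
        # Ask the LLM to assign subtasks to the available roles, one "role: subtask" pair per line
        plan = self.llm.predict(
            "Break this goal into subtasks and assign each to one of these roles "
            f"({', '.join(self.sub_agents)}), one 'role: subtask' pair per line.\nGoal: {goal}"
        )
        results = []
        for line in plan.splitlines():
            if ":" not in line:
                continue
            role, subtask = (part.strip() for part in line.split(":", 1))
            if role in self.sub_agents:
                results.append(self.sub_agents[role].handle(subtask))
        # Let the manager synthesise the sub-agents' outputs into a single answer
        return self.llm.predict("Combine these results into one coherent answer:\n" + "\n".join(results))

# Example wiring (llm is any object with a predict method):
# manager = ManagerAgent(llm, {"research": SubAgent("research", llm), "writing": SubAgent("writing", llm)})
# print(manager.run("Produce a short briefing on AI in healthcare"))

In practice the manager would also validate sub-agent outputs and retry or re-plan when a subtask fails, which is where the communication overhead noted in the table comes from.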

Core Components of AI Agents

Effective AI agents are built from several essential components that work together to enable autonomous operation.

1. Reasoning Engine

The reasoning engine is typically a large language model that provides the agent's cognitive capabilities: interpreting user requests, reasoning about goals and context, deciding which tools to call, and generating responses.

Reasoning Engine Selection

Choose your reasoning engine based on the agent's requirements; whichever model you select, the code examples in this section only assume a simple predict(prompt) interface (a sketch follows the list below):

  • GPT-4 or Claude 3 Opus: For complex reasoning and sophisticated agents
  • GPT-3.5 or Claude 3 Sonnet: For simpler agents with good balance of cost and performance
  • Llama 3 or Mistral: For locally-deployed agents with privacy requirements
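Whichever engine you choose, it helps to hide it behind the single predict(prompt) method the code in this section relies on, so models can be swapped without touching agent logic. The sketch below shows one way to do that; the OpenAIEngine class is an assumption based on the openai Python SDK (v1-style chat completions) and is only one possible backend.

# Reasoning engine abstraction sketch: any backend that implements predict()
# can be plugged into the agent examples below. The OpenAI call is an assumed
# dependency on the openai package and may need adjusting for your setup.
from abc import ABC, abstractmethod

class ReasoningEngine(ABC):
    @abstractmethod
    def predict(self, prompt: str) -> str:
        """Return the model's completion for the given prompt."""

class OpenAIEngine(ReasoningEngine):
    def __init__(self, model: str = "gpt-4"):
        from openai import OpenAI  # assumed dependency; reads OPENAI_API_KEY from the environment
        self.client = OpenAI()
        self.model = model

    def predict(self, prompt: str) -> str:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

# A locally hosted model can sit behind the same interface, keeping agent code unchanged.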

2. Tool Use Framework

The tool use framework enables agents to interact with external systems, APIs, and data sources:

from typing import List, Dict, Any, Callable, Optional, Type
from pydantic import BaseModel, Field

# Define a tool using Pydantic for parameter validation
class Tool:
    def __init__(self, name: str, description: str, function: Callable, schema: Optional[Type[BaseModel]] = None):
        self.name = name
        self.description = description
        self.function = function
        self.schema = schema
    
    def __call__(self, **kwargs):
        """Execute the tool with the provided arguments."""
        # Validate arguments if schema is provided
        if self.schema:
            validated_args = self.schema(**kwargs).dict()
            return self.function(**validated_args)
        return self.function(**kwargs)
    
    def get_schema(self) -> Dict[str, Any]:
        """Get the JSON schema for this tool."""
        if self.schema:
            schema_dict = self.schema.schema()
            return {
                "name": self.name,
                "description": self.description,
                "parameters": schema_dict
            }
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {"type": "object", "properties": {}}
        }

# Example tool definitions
class SearchParameters(BaseModel):
    query: str = Field(..., description="The search query")
    num_results: int = Field(5, description="Number of results to return")

def search_function(query: str, num_results: int = 5) -> List[Dict[str, str]]:
    """Simulated search function."""
    # In a real implementation, this would call a search API
    return [{"title": f"Result {i} for {query}", "url": f"https://example.com/{i}"} for i in range(num_results)]

search_tool = Tool(
    name="search",
    description="Search the web for information. Use this when you need to find facts or current information.",
    function=search_function,
    schema=SearchParameters
)

class CalculatorParameters(BaseModel):
    expression: str = Field(..., description="Mathematical expression to evaluate")

def calculator_function(expression: str) -> float:
    """Evaluate a mathematical expression."""
    # NOTE: eval on untrusted input is unsafe; a production tool should use a
    # restricted expression parser instead of eval.
    try:
        return eval(expression)
    except Exception as e:
        return f"Error: {str(e)}"

calculator_tool = Tool(
    name="calculator",
    description="Evaluate mathematical expressions. Use this for calculations.",
    function=calculator_function,
    schema=CalculatorParameters
)

# Tool use framework
class ToolUseFramework:
    def __init__(self, tools: List[Tool]):
        self.tools = {tool.name: tool for tool in tools}
    
    def get_tool_descriptions(self) -> str:
        """Get formatted descriptions of all available tools."""
        descriptions = []
        for name, tool in self.tools.items():
            descriptions.append(f"{name}: {tool.description}")
        return "\n".join(descriptions)
    
    def get_tool_schemas(self) -> List[Dict[str, Any]]:
        """Get JSON schemas for all tools."""
        return [tool.get_schema() for tool in self.tools.values()]
    
    def execute_tool(self, tool_name: str, **kwargs) -> Any:
        """Execute a tool with the provided arguments."""
        if tool_name not in self.tools:
            return f"Error: Tool '{tool_name}' not found. Available tools: {', '.join(self.tools.keys())}"
        
        tool = self.tools[tool_name]
        try:
            return tool(**kwargs)
        except Exception as e:
            return f"Error executing {tool_name}: {str(e)}"

# Example usage
tools = [search_tool, calculator_tool]
tool_framework = ToolUseFramework(tools)

# Get tool descriptions for prompts
tool_descriptions = tool_framework.get_tool_descriptions()
print(tool_descriptions)

# Execute a tool
result = tool_framework.execute_tool("calculator", expression="2 + 2 * 3")
print(result)  # Output: 8

3. Memory Systems

Memory systems allow agents to maintain context, learn from past interactions, and build knowledge over time:

from typing import List, Dict, Any, Optional
import json
from datetime import datetime

class MemorySystem:
    def __init__(self, max_short_term_items: int = 10):
        self.short_term_memory = []  # Recent interactions
        self.working_memory = {}     # Current task state
        self.long_term_memory = []   # Persistent knowledge
        self.max_short_term_items = max_short_term_items
    
    def add_to_short_term(self, item: Dict[str, Any]) -> None:
        """Add an item to short-term memory."""
        # Add timestamp if not present
        if "timestamp" not in item:
            item["timestamp"] = datetime.now().isoformat()
        
        self.short_term_memory.append(item)
        
        # Trim if exceeding max size
        if len(self.short_term_memory) > self.max_short_term_items:
            self.short_term_memory = self.short_term_memory[-self.max_short_term_items:]
    
    def update_working_memory(self, key: str, value: Any) -> None:
        """Update a value in working memory."""
        self.working_memory[key] = value
    
    def clear_working_memory(self) -> None:
        """Clear working memory for a new task."""
        self.working_memory = {}
    
    def add_to_long_term(self, item: Dict[str, Any]) -> None:
        """Add an item to long-term memory."""
        # Add timestamp if not present
        if "timestamp" not in item:
            item["timestamp"] = datetime.now().isoformat()
        
        self.long_term_memory.append(item)
    
    def search_long_term(self, query: str, limit: int = 5) -> List[Dict[str, Any]]:
        """Search long-term memory for relevant items.
        
        In a real implementation, this would use embeddings and vector search.
        This is a simplified version that does basic keyword matching.
        """
        results = []
        for item in self.long_term_memory:
            # Simple keyword matching (would use embeddings in practice)
            content = json.dumps(item).lower()
            if query.lower() in content:
                results.append(item)
                if len(results) >= limit:
                    break
        return results
    
    def get_relevant_context(self, query: str) -> Dict[str, Any]:
        """Get all relevant context for a query."""
        # Combine short-term and relevant long-term memory
        long_term_results = self.search_long_term(query)
        
        return {
            "short_term": self.short_term_memory,
            "working_memory": self.working_memory,
            "relevant_long_term": long_term_results
        }
    
    def summarize_short_term(self) -> str:
        """Summarize short-term memory for long-term storage.
        
        In a real implementation, this would use an LLM to generate a summary.
        This is a simplified placeholder.
        """
        # Placeholder for LLM-based summarization
        return f"Summary of {len(self.short_term_memory)} recent interactions"
    
    def commit_short_term_to_long_term(self) -> None:
        """Summarize short-term memory and commit to long-term memory."""
        if not self.short_term_memory:
            return
        
        summary = self.summarize_short_term()
        self.add_to_long_term({
            "type": "conversation_summary",
            "summary": summary,
            "original_items": self.short_term_memory.copy()
        })

# Example usage
memory = MemorySystem()

# Add user interaction to short-term memory
memory.add_to_short_term({
    "type": "user_message",
    "content": "I need to find information about climate change impacts."
})

# Update working memory with current task
memory.update_working_memory("current_task", "research_climate_change")
memory.update_working_memory("search_queries_used", ["climate change impacts", "sea level rise"])

# Add some knowledge to long-term memory
memory.add_to_long_term({
    "type": "learned_fact",
    "topic": "climate_change",
    "fact": "Global sea levels rose about 8-9 inches since 1880."
})

# Get relevant context for a query
context = memory.get_relevant_context("climate change sea level")
print(json.dumps(context, indent=2))

# At the end of a session, commit short-term to long-term
memory.commit_short_term_to_long_term()

4. Planning and Execution

Planning and execution systems enable agents to break down complex tasks into manageable steps and carry them out effectively:

from typing import List, Dict, Any, Optional
from enum import Enum
from datetime import datetime
import json
import uuid

class TaskStatus(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"

class Task:
    def __init__(self, description: str, dependencies: List[str] = None):
        self.id = str(uuid.uuid4())[:8]
        self.description = description
        self.dependencies = dependencies or []
        self.status = TaskStatus.PENDING
        self.result = None
        self.error = None
    
    def to_dict(self) -> Dict[str, Any]:
        return {
            "id": self.id,
            "description": self.description,
            "dependencies": self.dependencies,
            "status": self.status.value,
            "result": self.result,
            "error": self.error
        }

class Plan:
    def __init__(self, goal: str):
        self.id = str(uuid.uuid4())[:8]
        self.goal = goal
        self.tasks = {}  # id -> Task
        self.created_at = datetime.now().isoformat()
    
    def add_task(self, description: str, dependencies: List[str] = None) -> str:
        """Add a task to the plan and return its ID."""
        task = Task(description, dependencies)
        self.tasks[task.id] = task
        return task.id
    
    def get_next_tasks(self) -> List[Task]:
        """Get tasks that are ready to be executed (all dependencies satisfied)."""
        next_tasks = []
        for task in self.tasks.values():
            if task.status == TaskStatus.PENDING:
                dependencies_met = True
                for dep_id in task.dependencies:
                    if dep_id not in self.tasks or self.tasks[dep_id].status != TaskStatus.COMPLETED:
                        dependencies_met = False
                        break
                
                if dependencies_met:
                    next_tasks.append(task)
        
        return next_tasks
    
    def update_task(self, task_id: str, status: TaskStatus, result: Any = None, error: str = None) -> None:
        """Update the status and result of a task."""
        if task_id not in self.tasks:
            raise ValueError(f"Task {task_id} not found in plan")
        
        task = self.tasks[task_id]
        task.status = status
        task.result = result
        task.error = error
    
    def is_completed(self) -> bool:
        """Check if all tasks in the plan are completed."""
        return all(task.status == TaskStatus.COMPLETED for task in self.tasks.values())
    
    def has_failed_tasks(self) -> bool:
        """Check if any tasks have failed."""
        return any(task.status == TaskStatus.FAILED for task in self.tasks.values())
    
    def get_summary(self) -> Dict[str, Any]:
        """Get a summary of the plan's status."""
        total = len(self.tasks)
        completed = sum(1 for task in self.tasks.values() if task.status == TaskStatus.COMPLETED)
        in_progress = sum(1 for task in self.tasks.values() if task.status == TaskStatus.IN_PROGRESS)
        pending = sum(1 for task in self.tasks.values() if task.status == TaskStatus.PENDING)
        failed = sum(1 for task in self.tasks.values() if task.status == TaskStatus.FAILED)
        
        return {
            "id": self.id,
            "goal": self.goal,
            "total_tasks": total,
            "completed": completed,
            "in_progress": in_progress,
            "pending": pending,
            "failed": failed,
            "is_completed": self.is_completed(),
            "has_failed": self.has_failed_tasks()
        }
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert the plan to a dictionary."""
        return {
            "id": self.id,
            "goal": self.goal,
            "tasks": {task_id: task.to_dict() for task_id, task in self.tasks.items()},
            "created_at": self.created_at,
            "summary": self.get_summary()
        }

class PlanningSystem:
    def __init__(self, llm):
        self.llm = llm
        self.plans = {}  # plan_id -> Plan
    
    def create_plan(self, goal: str) -> str:
        """Create a new plan for a goal and return its ID."""
        # Use LLM to break down the goal into tasks
        planning_prompt = f"""
        Create a step-by-step plan to achieve this goal: {goal}
        
        For each step, consider:
        1. What needs to be done
        2. What information is needed
        3. What dependencies exist (which steps must be completed first)
        
        Format your response as a numbered list of steps.
        """
        
        plan_text = self.llm.predict(planning_prompt)
        
        # Create a new plan
        plan = Plan(goal)
        
        # Parse the plan text and add tasks
        # This is a simplified parser that assumes a numbered list
        lines = plan_text.strip().split('\n')
        task_map = {}  # step number -> task_id
        
        for line in lines:
            line = line.strip()
            if not line:
                continue
            
            # Try to extract step number
            parts = line.split('.', 1)
            if len(parts) == 2 and parts[0].strip().isdigit():
                step_num = int(parts[0].strip())
                description = parts[1].strip()
                
                # Assume dependencies are previous steps
                dependencies = [task_map[i] for i in range(1, step_num) if i in task_map]
                
                task_id = plan.add_task(description, dependencies)
                task_map[step_num] = task_id
        
        # Store the plan
        self.plans[plan.id] = plan
        return plan.id
    
    def execute_plan(self, plan_id: str, tool_framework) -> Dict[str, Any]:
        """Execute a plan using the provided tool framework."""
        if plan_id not in self.plans:
            return {"error": f"Plan {plan_id} not found"}
        
        plan = self.plans[plan_id]
        
        # Continue executing until plan is completed or all available tasks are attempted
        while not plan.is_completed() and not plan.has_failed_tasks():
            next_tasks = plan.get_next_tasks()
            if not next_tasks:
                break
            
            # Execute each ready task
            for task in next_tasks:
                # Update task status
                plan.update_task(task.id, TaskStatus.IN_PROGRESS)
                
                # Determine which tool to use and how to use it
                tool_selection_prompt = f"""
                Task: {task.description}
                
                Available tools:
                {tool_framework.get_tool_descriptions()}
                
                Which tool should be used for this task? If no tool is needed, respond with "NONE".
                If a tool is needed, specify the tool name and the parameters in this format:
                TOOL: [tool_name]
                PARAMS: [JSON formatted parameters]
                """
                
                tool_response = self.llm.predict(tool_selection_prompt)
                
                # Parse the tool response
                tool_name = None
                tool_params = {}
                
                if "TOOL:" in tool_response:
                    tool_lines = tool_response.split('\n')
                    for line in tool_lines:
                        if line.startswith("TOOL:"):
                            tool_name = line.replace("TOOL:", "").strip()
                        elif line.startswith("PARAMS:"):
                            params_text = line.replace("PARAMS:", "").strip()
                            try:
                                tool_params = json.loads(params_text)
                            except json.JSONDecodeError:
                                # If JSON parsing fails, use an empty dict
                                tool_params = {}
                
                # Execute the tool if specified
                if tool_name and tool_name != "NONE":
                    try:
                        result = tool_framework.execute_tool(tool_name, **tool_params)
                        plan.update_task(task.id, TaskStatus.COMPLETED, result=result)
                    except Exception as e:
                        plan.update_task(task.id, TaskStatus.FAILED, error=str(e))
                else:
                    # No tool needed, mark as completed
                    plan.update_task(task.id, TaskStatus.COMPLETED)
        
        return plan.to_dict()
    
    def get_plan(self, plan_id: str) -> Optional[Dict[str, Any]]:
        """Get a plan by ID."""
        if plan_id not in self.plans:
            return None
        return self.plans[plan_id].to_dict()

# Example usage (assuming llm and tool_framework are defined)
# planning_system = PlanningSystem(llm)
# plan_id = planning_system.create_plan("Research the impact of climate change on agriculture")
# plan_result = planning_system.execute_plan(plan_id, tool_framework)
# print(json.dumps(plan_result, indent=2))

5. User Interaction Interface

The user interaction interface enables communication between the agent and its users:

from typing import List, Dict, Any
import json

class UserInteractionInterface:
    def __init__(self, llm, memory_system):
        self.llm = llm
        self.memory_system = memory_system
        self.response_formatters = {
            "text": self._format_text_response,
            "search_results": self._format_search_results,
            "error": self._format_error_response,
            "plan": self._format_plan_response,
            "clarification": self._format_clarification_request
        }
    
    def process_user_input(self, user_input: str) -> Dict[str, Any]:
        """Process user input and store in memory."""
        # Store in memory
        self.memory_system.add_to_short_term({
            "type": "user_input",
            "content": user_input
        })
        
        # Analyze the input
        analysis_prompt = f"""
        Analyze this user input: "{user_input}"
        
        Determine:
        1. The primary intent (question, command, clarification, etc.)
        2. The main topic or subject
        3. Any specific constraints or preferences mentioned
        4. Whether additional information is needed from the user
        
        Format your response as JSON with these fields:
        {{
            "intent": "string",
            "topic": "string",
            "constraints": ["string"],
            "needs_clarification": boolean,
            "clarification_question": "string" (if needs_clarification is true)
        }}
        """
        
        analysis_response = self.llm.predict(analysis_prompt)
        
        # Parse the JSON response
        try:
            analysis = json.loads(analysis_response)
        except json.JSONDecodeError:
            # Fallback if JSON parsing fails
            analysis = {
                "intent": "unknown",
                "topic": "unknown",
                "constraints": [],
                "needs_clarification": False
            }
        
        # Store analysis in working memory (use .get so missing keys don't raise)
        self.memory_system.update_working_memory("current_intent", analysis.get("intent", "unknown"))
        self.memory_system.update_working_memory("current_topic", analysis.get("topic", "unknown"))
        
        return analysis
    
    def generate_response(self, response_type: str, content: Any) -> str:
        """Generate a formatted response based on type and content."""
        if response_type in self.response_formatters:
            formatter = self.response_formatters[response_type]
            response = formatter(content)
        else:
            # Default to text response
            response = str(content)
        
        # Store in memory
        self.memory_system.add_to_short_term({
            "type": "agent_response",
            "response_type": response_type,
            "content": response
        })
        
        return response
    
    def request_clarification(self, question: str) -> str:
        """Generate a clarification request."""
        return self.generate_response("clarification", question)
    
    def _format_text_response(self, content: str) -> str:
        """Format a simple text response."""
        return content
    
    def _format_search_results(self, results: List[Dict[str, str]]) -> str:
        """Format search results."""
        if not results:
            return "I couldn't find any relevant information."
        
        formatted = "Here's what I found:\n\n"
        for i, result in enumerate(results, 1):
            formatted += f"{i}. {result['title']}\n   {result['url']}\n"
        
        return formatted
    
    def _format_error_response(self, error: str) -> str:
        """Format an error response."""
        return f"I encountered an issue: {error}\n\nCould you try rephrasing your request or providing more information?"
    
    def _format_plan_response(self, plan: Dict[str, Any]) -> str:
        """Format a plan response."""
        formatted = f"I've created a plan to achieve your goal: {plan['goal']}\n\n"
        
        # Add summary
        summary = plan['summary']
        formatted += f"Progress: {summary['completed']}/{summary['total_tasks']} tasks completed\n\n"
        
        # Add tasks
        formatted += "Tasks:\n"
        for task_id, task in plan['tasks'].items():
            status_symbol = "✓" if task['status'] == "completed" else "⋯" if task['status'] == "in_progress" else "✗" if task['status'] == "failed" else "○"
            formatted += f"{status_symbol} {task['description']}\n"
            if task['result']:
                formatted += f"   Result: {task['result']}\n"
            if task['error']:
                formatted += f"   Error: {task['error']}\n"
        
        return formatted
    
    def _format_clarification_request(self, question: str) -> str:
        """Format a clarification request."""
        return f"To better assist you, I need some additional information:\n\n{question}"

# Example usage (assuming llm and memory_system are defined)
# interface = UserInteractionInterface(llm, memory_system)
# analysis = interface.process_user_input("Can you help me find information about renewable energy?")
# 
# if analysis["needs_clarification"]:
#     response = interface.request_clarification(analysis["clarification_question"])
# else:
#     # Simulate search results
#     search_results = [
#         {"title": "Renewable Energy Explained", "url": "https://example.com/renewable-energy"},
#         {"title": "Solar and Wind Power Basics", "url": "https://example.com/solar-wind"}
#     ]
#     response = interface.generate_response("search_results", search_results)
# 
# print(response)

Building Your First AI Agent

Let's put everything together to build a simple but functional AI agent that can help with research tasks.

Step 1: Define the Agent's Purpose and Capabilities

Research Assistant Agent Specification:

  • Purpose: Help users find, summarize, and synthesize information on specific topics
  • Capabilities:
    • Search for information online
    • Extract key points from articles and websites
    • Summarise findings in structured formats
    • Answer questions based on gathered information
  • Tools:
    • Web search
    • Web browsing
    • Text extraction
    • Summarisation

Step 2: Implement the Core Components

import json
from typing import List, Dict, Any, Optional, Type
from pydantic import BaseModel, Field
from datetime import datetime

# For simplicity, we'll use a mock LLM class
class MockLLM:
    def predict(self, prompt: str) -> str:
        """Simulate LLM prediction (a real implementation would call an actual LLM API)."""
        # Canned responses keyed off the prompt. When asked to pick an action, the
        # mock always chooses the search tool so the demo exercises the tool path.
        if "If a tool should be used" in prompt:
            return 'TOOL: search\nPARAMS: {"query": "artificial intelligence in healthcare", "num_results": 3}'
        elif "summarise" in prompt.lower():
            return "Here's a summary of the key points..."
        elif "analyze" in prompt.lower():
            return "Based on my analysis..."
        elif "search" in prompt.lower():
            return "I'll search for that information."
        else:
            return "I understand your request and will help with that."

# Tool definitions
class SearchParameters(BaseModel):
    query: str = Field(..., description="The search query")
    num_results: int = Field(5, description="Number of results to return")

def search_function(query: str, num_results: int = 5) -> List[Dict[str, str]]:
    """Simulated search function."""
    # In a real implementation, this would call a search API
    return [{"title": f"Result {i} for {query}", "url": f"https://example.com/{i}"} for i in range(num_results)]

class BrowseParameters(BaseModel):
    url: str = Field(..., description="The URL to browse")

def browse_function(url: str) -> str:
    """Simulated web browsing function."""
    # In a real implementation, this would fetch and parse a webpage
    return f"Content from {url}: This is simulated webpage content for demonstration purposes."

class SummariseParameters(BaseModel):
    text: str = Field(..., description="The text to summarise")
    max_length: int = Field(200, description="Maximum length of the summary")

def summarise_function(text: str, max_length: int = 200) -> str:
    """Simulated text summarisation function."""
    # In a real implementation, this would use an LLM to summarise text
    if len(text) > max_length:
        return text[:max_length] + "..."
    return text

# Tool class
class Tool:
    def __init__(self, name: str, description: str, function, schema: Optional[Type[BaseModel]] = None):
        self.name = name
        self.description = description
        self.function = function
        self.schema = schema
    
    def __call__(self, **kwargs):
        """Execute the tool with the provided arguments."""
        # Validate arguments if schema is provided
        if self.schema:
            validated_args = self.schema(**kwargs).dict()
            return self.function(**validated_args)
        return self.function(**kwargs)
    
    def get_schema(self) -> Dict[str, Any]:
        """Get the JSON schema for this tool."""
        if self.schema:
            schema_dict = self.schema.schema()
            return {
                "name": self.name,
                "description": self.description,
                "parameters": schema_dict
            }
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {"type": "object", "properties": {}}
        }

# Memory system
class MemorySystem:
    def __init__(self, max_short_term_items: int = 10):
        self.short_term_memory = []  # Recent interactions
        self.working_memory = {}     # Current task state
        self.long_term_memory = []   # Persistent knowledge
        self.max_short_term_items = max_short_term_items
    
    def add_to_short_term(self, item: Dict[str, Any]) -> None:
        """Add an item to short-term memory."""
        # Add timestamp if not present
        if "timestamp" not in item:
            item["timestamp"] = datetime.now().isoformat()
        
        self.short_term_memory.append(item)
        
        # Trim if exceeding max size
        if len(self.short_term_memory) > self.max_short_term_items:
            self.short_term_memory = self.short_term_memory[-self.max_short_term_items:]
    
    def update_working_memory(self, key: str, value: Any) -> None:
        """Update a value in working memory."""
        self.working_memory[key] = value
    
    def get_short_term_memory(self) -> List[Dict[str, Any]]:
        """Get all items in short-term memory."""
        return self.short_term_memory
    
    def get_working_memory(self) -> Dict[str, Any]:
        """Get all items in working memory."""
        return self.working_memory

# Research Assistant Agent
class ResearchAssistantAgent:
    def __init__(self):
        # Initialize LLM
        self.llm = MockLLM()
        
        # Initialize memory
        self.memory = MemorySystem()
        
        # Initialize tools
        self.tools = {
            "search": Tool(
                name="search",
                description="Search the web for information",
                function=search_function,
                schema=SearchParameters
            ),
            "browse": Tool(
                name="browse",
                description="Browse a specific webpage",
                function=browse_function,
                schema=BrowseParameters
            ),
            "summarise": Tool(
                name="summarise",
                description="Summarise a piece of text",
                function=summarise_function,
                schema=SummariseParameters
            )
        }
    
    def get_tool_descriptions(self) -> str:
        """Get formatted descriptions of all available tools."""
        descriptions = []
        for name, tool in self.tools.items():
            descriptions.append(f"{name}: {tool.description}")
        return "\n".join(descriptions)
    
    def process_user_input(self, user_input: str) -> str:
        """Process user input and generate a response."""
        # Store user input in memory
        self.memory.add_to_short_term({
            "type": "user_input",
            "content": user_input
        })
        
        # Determine the appropriate action
        action_prompt = f"""
        User input: "{user_input}"
        
        Based on this input, determine what action to take.
        
        Available tools:
        {self.get_tool_descriptions()}
        
        If a tool should be used, respond in this format:
        TOOL: [tool_name]
        PARAMS: [JSON formatted parameters]
        
        If no tool is needed, respond with:
        RESPONSE: [Your direct response to the user]
        """
        
        action_decision = self.llm.predict(action_prompt)
        
        # Parse the action decision
        if "TOOL:" in action_decision:
            # Extract tool name and parameters
            tool_name = None
            tool_params = {}
            
            lines = action_decision.split('\n')
            for line in lines:
                if line.startswith("TOOL:"):
                    tool_name = line.replace("TOOL:", "").strip()
                elif line.startswith("PARAMS:"):
                    params_text = line.replace("PARAMS:", "").strip()
                    try:
                        tool_params = json.loads(params_text)
                    except json.JSONDecodeError:
                        # If JSON parsing fails, use an empty dict
                        tool_params = {}
            
            # Execute the tool
            if tool_name in self.tools:
                try:
                    tool_result = self.tools[tool_name](**tool_params)
                    
                    # Store tool execution in memory
                    self.memory.add_to_short_term({
                        "type": "tool_execution",
                        "tool": tool_name,
                        "params": tool_params,
                        "result": tool_result
                    })
                    
                    # Generate response based on tool result
                    response_prompt = f"""
                    User input: "{user_input}"
                    
                    Tool used: {tool_name}
                    Tool result: {tool_result}
                    
                    Generate a helpful response to the user based on this information.
                    """
                    
                    response = self.llm.predict(response_prompt)
                    
                    # Store response in memory
                    self.memory.add_to_short_term({
                        "type": "agent_response",
                        "content": response
                    })
                    
                    return response
                except Exception as e:
                    error_message = f"Error executing tool {tool_name}: {e}"
                    
                    # Store error in memory
                    self.memory.add_to_short_term({
                        "type": "error",
                        "tool": tool_name,
                        "error": str(e)
                    })
                    
                    return error_message
            else:
                return f"Tool {tool_name} not found. Available tools: {', '.join(self.tools.keys())}"
        elif "RESPONSE:" in action_decision:
            # Extract direct response
            response_lines = action_decision.split("RESPONSE:")
            if len(response_lines) > 1:
                response = response_lines[1].strip()
                
                # Store response in memory
                self.memory.add_to_short_term({
                    "type": "agent_response",
                    "content": response
                })
                
                return response
        
        # Fallback response
        fallback_response = "I'm not sure how to help with that. Could you provide more details or rephrase your request?"
        
        # Store fallback response in memory
        self.memory.add_to_short_term({
            "type": "agent_response",
            "content": fallback_response
        })
        
        return fallback_response
    
    def get_conversation_history(self) -> List[Dict[str, Any]]:
        """Get the conversation history from memory."""
        return [item for item in self.memory.get_short_term_memory() 
                if item["type"] in ["user_input", "agent_response"]]

# Example usage
agent = ResearchAssistantAgent()

# Simulate a conversation
responses = []
responses.append(agent.process_user_input("I need to research the impact of artificial intelligence on healthcare"))
responses.append(agent.process_user_input("Can you find information about AI diagnostic tools?"))
responses.append(agent.process_user_input("Summarise the key benefits of AI in healthcare"))

# Print conversation history
print("Conversation History:")
for item in agent.get_conversation_history():
    if item["type"] == "user_input":
        print(f"User: {item['content']}")
    else:
        print(f"Agent: {item['content']}")

Step 3: Enhance with Advanced Features

Once you have a basic agent working, you can enhance it with more advanced features:

Advanced Features to Add:

  • Multi-step Planning: Break complex research tasks into steps
  • Source Tracking: Keep track of where information came from (a minimal sketch follows this list)
  • Fact Verification: Cross-check information across multiple sources
  • Personalisation: Remember user preferences and adapt accordingly
  • Visualisation Generation: Create charts or diagrams to illustrate findings
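As a starting point for the Source Tracking feature mentioned above, the sketch below extends the MemorySystem from the Memory Systems component earlier (the variant with long-term memory) so every stored fact carries its provenance. The SourceTrackingMemory class and its add_fact and facts_with_sources methods are illustrative additions, not part of the earlier code.

# Source-tracking sketch: each fact stored in long-term memory carries the URL
# (or other provenance) it came from, so later answers can cite their sources.
# Assumes the MemorySystem class with long-term memory defined earlier.
from typing import Any, Dict, List

class SourceTrackingMemory(MemorySystem):
    def add_fact(self, fact: str, source_url: str, topic: str = "general") -> None:
        """Store a learned fact together with where it came from."""
        self.add_to_long_term({
            "type": "learned_fact",
            "topic": topic,
            "fact": fact,
            "source": source_url,
        })

    def facts_with_sources(self, query: str, limit: int = 5) -> List[Dict[str, Any]]:
        """Return matching facts, keeping only those that carry a source."""
        return [item for item in self.search_long_term(query, limit) if item.get("source")]

# Example usage
memory = SourceTrackingMemory()
memory.add_fact(
    "Global sea levels rose about 8-9 inches since 1880.",
    source_url="https://example.com/sea-level-report",
    topic="climate_change",
)
for item in memory.facts_with_sources("sea level"):
    print(f"{item['fact']} (source: {item['source']})")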

Implementation Tips

  • Start Simple: Begin with core functionality and add features incrementally
  • Test Thoroughly: Test each component individually before integration
  • Monitor Performance: Track response quality, tool usage, and error rates
  • Gather Feedback: Use real user interactions to identify improvement areas
  • Iterate Rapidly: Continuously refine based on testing and feedback

Common AI Agent Patterns and Anti-Patterns

Understanding common patterns and anti-patterns will help you build more effective AI agents.

Effective Patterns

Successful AI Agent Patterns:

| Pattern | Description | Benefits |
|---|---|---|
| Separation of Concerns | Divide agent functionality into distinct components | Modularity, maintainability, easier testing |
| Progressive Disclosure | Reveal capabilities and options gradually as needed | Reduced cognitive load, focused interactions |
| Explicit Reasoning | Make reasoning process visible to users | Transparency, trust, easier debugging |
| Graceful Degradation | Maintain functionality when optimal resources unavailable | Reliability, consistent user experience |
| Contextual Memory | Maintain relevant context without overwhelming the system | Coherent conversations, personalized responses |
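Graceful Degradation can be as simple as a fallback chain of reasoning engines: try the preferred model first and drop to a cheaper or local one when a call fails. The sketch below assumes each engine exposes the predict(prompt) interface used throughout this section; the FallbackEngine class is illustrative, not a specific library API.

# Graceful Degradation sketch: try engines in order of preference and fall back
# to the next one when a call fails, so the agent keeps working at reduced quality.
from typing import List

class FallbackEngine:
    def __init__(self, engines: List):
        self.engines = engines  # ordered from most to least preferred

    def predict(self, prompt: str) -> str:
        last_error = None
        for engine in self.engines:
            try:
                return engine.predict(prompt)
            except Exception as e:  # e.g. rate limits, timeouts, provider outages
                last_error = e
        raise RuntimeError(f"All reasoning engines failed; last error: {last_error}")

# Usage: drop the chain in wherever a single llm object is expected, e.g.
# agent = AIAgent(llm=FallbackEngine([primary_engine, cheaper_engine, local_engine]), tools=tools)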

Anti-Patterns to Avoid

AI Agent Anti-Patterns:

| Anti-Pattern | Description | Consequences |
|---|---|---|
| Monolithic Design | Building the entire agent as a single, tightly-coupled system | Difficult to maintain, test, or extend |
| Capability Overload | Adding too many features without clear organization | User confusion, diluted effectiveness |
| Excessive Autonomy | Giving agents too much freedom without appropriate guardrails | Unpredictable behavior, potential harmful actions |
| Context Flooding | Providing too much context to the LLM | Token waste, diluted focus, increased costs |
| Tool Proliferation | Adding too many similar or overlapping tools | Decision paralysis, inefficient tool selection |

Next Steps in Your AI Journey

Now that you understand the fundamentals of AI agents, you're ready to explore more advanced agent design patterns and architectures.

Key Takeaways from This Section:

  • AI agents combine LLMs with tools, memory, and planning to create autonomous systems
  • The agent cognition loop (Observe, Orient, Decide, Act, Learn) drives agent behaviour
  • Different types of agents serve different purposes, from assistive to fully autonomous
  • Core components include the LLM, tools, memory, and planning module
  • Benefits include automation and efficiency, while challenges include reliability, control, and security
  • Ethical development requires transparency, accountability, bias mitigation, and user control

In the next section, we delve into specific Agentic Design Patterns, providing blueprints for constructing robust and effective AI agents for various applications.
