Introduction to AI Agents
Understand the fundamentals of AI agents and how they're transforming the way we interact with technology
Understanding AI Agents
AI agents represent the next evolution in artificial intelligence, moving beyond passive models that simply respond to prompts toward autonomous systems that can perceive, reason, plan, and act to achieve specific goals.
Key Insight
AI agents combine the reasoning capabilities of large language models with the ability to interact with the external world through tools, creating systems that can autonomously solve complex problems and perform tasks on behalf of users.
What Makes an AI Agent?
An AI agent is characterised by several key capabilities that distinguish it from traditional AI systems:
- Autonomy: Ability to operate independently with minimal human supervision
- Goal-Orientation: Working toward specific objectives rather than just responding to prompts
- Tool Use: Leveraging external tools and APIs to interact with the world
- Memory: Maintaining context and learning from past interactions
- Planning: Breaking down complex tasks into manageable steps
- Adaptability: Adjusting strategies based on feedback and changing conditions
The Evolution of AI Systems
| System Type | Characteristics | Examples |
|---|---|---|
| Traditional AI | Rule-based, narrow focus, explicit programming | Expert systems, chess engines, traditional chatbots |
| Machine Learning | Data-driven, pattern recognition, statistical models | Recommendation systems, image classifiers, predictive analytics |
| Large Language Models | Generative, broad knowledge, natural language understanding | ChatGPT, Claude, Llama, text generation systems |
| AI Agents | Autonomous, goal-oriented, tool-using, adaptive | Personal assistants, autonomous researchers, workflow automators |
The Agent Cognition Loop
At the core of every AI agent is the agent cognition loop—a continuous cycle of perception, reasoning, planning, and action that allows the agent to interact with its environment and work toward its goals.
The Basic Agent Loop:
- Observe: Gather information from the environment or user input
- Orient: Interpret the information and update internal state
- Decide: Determine the best course of action based on goals and current state
- Act: Execute the chosen action using available tools
- Learn: Update knowledge and strategies based on outcomes
Agent Loop Implementation
class AIAgent:
def __init__(self, llm, tools, memory=None):
self.llm = llm
self.tools = {tool.name: tool for tool in tools}
self.memory = memory or []
self.state = {}
def run(self, user_input):
"""Execute the agent cognition loop."""
# 1. Observe - gather input
observation = self._observe(user_input)
# 2. Orient - interpret and update state
self._orient(observation)
# Continue the loop until goal is reached or max iterations
max_iterations = 10
for _ in range(max_iterations):
# 3. Decide - determine next action
action = self._decide()
# 4. Act - execute the action
result = self._act(action)
            # 5. Learn - update knowledge based on result
            self._learn(action, result)
            # Update state and memory with the action's result
            self._orient(result)
            # Check if we've reached the goal
            if self._goal_achieved():
                break
return self._generate_response()
def _observe(self, user_input):
"""Gather information from user input."""
return {"type": "user_input", "content": user_input}
def _orient(self, observation):
"""Interpret the observation and update internal state."""
# Add observation to memory
self.memory.append(observation)
# Update state based on observation
if observation["type"] == "user_input":
self.state["current_goal"] = self._extract_goal(observation["content"])
elif observation["type"] == "tool_result":
self.state["last_tool_result"] = observation["content"]
def _decide(self):
"""Determine the next action based on current state."""
# Construct prompt with memory and state
prompt = self._construct_decision_prompt()
# Get decision from LLM
response = self.llm.predict(prompt)
# Parse the response to extract tool name and arguments
tool_name, tool_args = self._parse_tool_call(response)
return {"tool": tool_name, "args": tool_args}
def _act(self, action):
"""Execute the chosen action using available tools."""
tool_name = action["tool"]
tool_args = action["args"]
if tool_name in self.tools:
tool = self.tools[tool_name]
try:
result = tool(**tool_args)
return {"type": "tool_result", "tool": tool_name, "content": result}
except Exception as e:
return {"type": "error", "tool": tool_name, "content": str(e)}
else:
return {"type": "error", "content": f"Tool {tool_name} not found"}
    def _learn(self, action, result):
        """Update knowledge and strategies based on action results."""
        # Record the action taken; the result itself is stored when
        # _orient() processes it, so avoid appending it to memory twice
        self.memory.append({"type": "action", "content": action})
# Update state based on result
if result["type"] == "error":
self.state["last_error"] = result["content"]
# Could implement more sophisticated learning here
def _goal_achieved(self):
"""Check if the current goal has been achieved."""
# Construct prompt to check goal completion
prompt = self._construct_goal_check_prompt()
# Get assessment from LLM
response = self.llm.predict(prompt)
# Parse response to determine if goal is achieved
return "GOAL_ACHIEVED" in response
def _generate_response(self):
"""Generate a final response to the user."""
# Construct prompt for response generation
prompt = self._construct_response_prompt()
# Get response from LLM
return self.llm.predict(prompt)
# Helper methods
def _extract_goal(self, user_input):
"""Extract the user's goal from their input."""
prompt = f"Extract the user's goal from their input: {user_input}"
return self.llm.predict(prompt)
def _construct_decision_prompt(self):
"""Construct a prompt for the decision phase."""
# Implementation details omitted for brevity
pass
def _parse_tool_call(self, llm_response):
"""Parse the LLM's response to extract tool name and arguments."""
# Implementation details omitted for brevity
pass
def _construct_goal_check_prompt(self):
"""Construct a prompt to check if the goal has been achieved."""
# Implementation details omitted for brevity
pass
def _construct_response_prompt(self):
"""Construct a prompt for generating the final response."""
# Implementation details omitted for brevity
pass
Types of AI Agents
AI agents come in various forms, each designed for specific types of tasks and interaction patterns.
By Autonomy Level
Autonomy Spectrum:
| Agent Type | Autonomy Level | Human Involvement | Best For |
|---|---|---|---|
| Assistive Agents | Low | Frequent guidance and confirmation | High-stakes decisions, creative tasks, personalised assistance |
| Collaborative Agents | Medium | Occasional input and direction | Complex problem-solving, research, content creation |
| Autonomous Agents | High | Initial setup and periodic review | Routine tasks, monitoring, data processing, scheduled actions |
| Fully Autonomous Systems | Very High | Oversight only | Continuous operations, real-time responses, system management |
By Functional Role
Common Agent Roles:
- Personal Assistants: Help individuals with daily tasks, scheduling, information retrieval
- Research Agents: Gather, analyze, and synthesize information from multiple sources
- Creative Agents: Generate content, designs, or creative works based on specifications
- Workflow Agents: Automate business processes and coordinate tasks across systems
- Customer Service Agents: Handle inquiries, troubleshoot issues, and provide support
- Data Agents: Process, analyze, and extract insights from large datasets
- DevOps Agents: Monitor systems, detect issues, and manage infrastructure
- Learning Agents: Personalize educational content and provide tutoring
By Architecture
Agent Architectures:
| Architecture | Description | Advantages | Limitations |
|---|---|---|---|
| Single-LLM Agents | One LLM handles all reasoning and decision-making | Simple, coherent reasoning, easy to implement | Limited by context window, potential for hallucination |
| Multi-LLM Agents | Different LLMs handle specialized tasks | Specialized expertise, cost optimization | Coordination overhead, potential inconsistencies |
| Hierarchical Agents | Manager agents delegate to specialized sub-agents | Complex task handling, separation of concerns | Complex implementation, communication overhead |
| Multi-Agent Systems | Multiple agents collaborate to solve problems | Parallel processing, diverse perspectives | Coordination challenges, resource intensive |
| Hybrid Symbolic-Neural Agents | Combine LLMs with symbolic reasoning systems | Reliable reasoning, verifiable outputs | Implementation complexity, integration challenges |
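Of these architectures, the hierarchical pattern is the easiest to picture in code. The sketch below is illustrative only (the `ManagerAgent` and `SubAgent` names are assumptions, not from any specific framework): a manager looks up the specialist sub-agent that matches a subtask and delegates to it.

```python
class SubAgent:
    """A specialist sub-agent that handles one category of task."""
    def __init__(self, speciality: str):
        self.speciality = speciality

    def handle(self, task: str) -> str:
        # A real sub-agent would run its own cognition loop here
        return f"[{self.speciality}] completed: {task}"


class ManagerAgent:
    """Delegates each subtask to the sub-agent whose speciality matches."""
    def __init__(self, sub_agents):
        self.sub_agents = {agent.speciality: agent for agent in sub_agents}

    def delegate(self, speciality: str, task: str) -> str:
        if speciality not in self.sub_agents:
            return f"Error: no sub-agent for '{speciality}'"
        return self.sub_agents[speciality].handle(task)


manager = ManagerAgent([SubAgent("research"), SubAgent("writing")])
print(manager.delegate("research", "find sources on sea level rise"))
```

The separation of concerns in the table shows up directly: adding a new capability means adding a sub-agent, not changing the manager's routing logic.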
Core Components of AI Agents
Effective AI agents are built from several essential components that work together to enable autonomous operation.
1. Reasoning Engine
The reasoning engine is typically a large language model that provides the cognitive capabilities for the agent, including:
- Natural Language Understanding: Interpreting user requests and information
- Problem Solving: Breaking down complex tasks into steps
- Decision Making: Choosing appropriate actions based on context
- Explanation Generation: Providing rationales for decisions
Reasoning Engine Selection
Choose your reasoning engine based on the agent's requirements:
- GPT-4 or Claude 3 Opus: For complex reasoning and sophisticated agents
- GPT-3.5 or Claude 3 Sonnet: For simpler agents with good balance of cost and performance
- Llama 3 or Mistral: For locally-deployed agents with privacy requirements
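Whichever model you choose, the examples in this chapter only ever call a single `predict(prompt)` method on the reasoning engine. A thin interface like the following (a sketch; the `ReasoningEngine` and `StubEngine` names are assumptions, not a vendor SDK) keeps the model swappable behind that one method:

```python
from abc import ABC, abstractmethod


class ReasoningEngine(ABC):
    """The minimal interface the rest of the agent code relies on."""
    @abstractmethod
    def predict(self, prompt: str) -> str:
        ...


class StubEngine(ReasoningEngine):
    """Stand-in for local testing; a real engine would call a model API."""
    def predict(self, prompt: str) -> str:
        return f"(stub) responding to: {prompt[:40]}"


# Swapping models means swapping one object, not rewriting the agent
engine: ReasoningEngine = StubEngine()
print(engine.predict("Summarise the latest climate report"))
```

This is why the later examples can use a `MockLLM` for demonstration: any object exposing `predict` satisfies the contract.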
2. Tool Use Framework
The tool use framework enables agents to interact with external systems, APIs, and data sources:
- Tool Definition: Specifying available tools and their parameters
- Tool Selection: Choosing the right tool for a given task
- Parameter Preparation: Formatting inputs correctly for tools
- Result Handling: Processing and interpreting tool outputs
from typing import List, Dict, Any, Callable
from pydantic import BaseModel, Field
# Define a tool using Pydantic for parameter validation
class Tool:
def __init__(self, name: str, description: str, function: Callable, schema: BaseModel = None):
self.name = name
self.description = description
self.function = function
self.schema = schema
def __call__(self, **kwargs):
"""Execute the tool with the provided arguments."""
        # Validate arguments if a schema is provided
        # (Pydantic v1 API: .dict(); in Pydantic v2 this is .model_dump())
        if self.schema:
            validated_args = self.schema(**kwargs).dict()
return self.function(**validated_args)
return self.function(**kwargs)
def get_schema(self) -> Dict[str, Any]:
"""Get the JSON schema for this tool."""
if self.schema:
schema_dict = self.schema.schema()
return {
"name": self.name,
"description": self.description,
"parameters": schema_dict
}
return {
"name": self.name,
"description": self.description,
"parameters": {"type": "object", "properties": {}}
}
# Example tool definitions
class SearchParameters(BaseModel):
query: str = Field(..., description="The search query")
num_results: int = Field(5, description="Number of results to return")
def search_function(query: str, num_results: int = 5) -> List[Dict[str, str]]:
"""Simulated search function."""
# In a real implementation, this would call a search API
return [{"title": f"Result {i} for {query}", "url": f"https://example.com/{i}"} for i in range(num_results)]
search_tool = Tool(
name="search",
description="Search the web for information. Use this when you need to find facts or current information.",
function=search_function,
schema=SearchParameters
)
class CalculatorParameters(BaseModel):
expression: str = Field(..., description="Mathematical expression to evaluate")
def calculator_function(expression: str) -> Any:
    """Evaluate a mathematical expression.

    Warning: eval() on untrusted input is a security risk; passing empty
    globals/builtins limits, but does not eliminate, the exposure. Use a
    proper expression parser in production.
    """
    try:
        return eval(expression, {"__builtins__": {}}, {})
    except Exception as e:
        return f"Error: {str(e)}"
calculator_tool = Tool(
name="calculator",
description="Evaluate mathematical expressions. Use this for calculations.",
function=calculator_function,
schema=CalculatorParameters
)
# Tool use framework
class ToolUseFramework:
def __init__(self, tools: List[Tool]):
self.tools = {tool.name: tool for tool in tools}
def get_tool_descriptions(self) -> str:
"""Get formatted descriptions of all available tools."""
descriptions = []
for name, tool in self.tools.items():
descriptions.append(f"{name}: {tool.description}")
return "\n".join(descriptions)
def get_tool_schemas(self) -> List[Dict[str, Any]]:
"""Get JSON schemas for all tools."""
return [tool.get_schema() for tool in self.tools.values()]
def execute_tool(self, tool_name: str, **kwargs) -> Any:
"""Execute a tool with the provided arguments."""
if tool_name not in self.tools:
return f"Error: Tool '{tool_name}' not found. Available tools: {', '.join(self.tools.keys())}"
tool = self.tools[tool_name]
try:
return tool(**kwargs)
except Exception as e:
return f"Error executing {tool_name}: {str(e)}"
# Example usage
tools = [search_tool, calculator_tool]
tool_framework = ToolUseFramework(tools)
# Get tool descriptions for prompts
tool_descriptions = tool_framework.get_tool_descriptions()
print(tool_descriptions)
# Execute a tool
result = tool_framework.execute_tool("calculator", expression="2 + 2 * 3")
print(result) # Output: 8
3. Memory Systems
Memory systems allow agents to maintain context, learn from past interactions, and build knowledge over time:
- Short-term Memory: Recent conversation history and current context
- Working Memory: Active information needed for the current task
- Long-term Memory: Persistent knowledge and learned patterns
- Episodic Memory: Records of past interactions and their outcomes
from typing import List, Dict, Any, Optional
import json
from datetime import datetime
class MemorySystem:
def __init__(self, max_short_term_items: int = 10):
self.short_term_memory = [] # Recent interactions
self.working_memory = {} # Current task state
self.long_term_memory = [] # Persistent knowledge
self.max_short_term_items = max_short_term_items
def add_to_short_term(self, item: Dict[str, Any]) -> None:
"""Add an item to short-term memory."""
# Add timestamp if not present
if "timestamp" not in item:
item["timestamp"] = datetime.now().isoformat()
self.short_term_memory.append(item)
# Trim if exceeding max size
if len(self.short_term_memory) > self.max_short_term_items:
self.short_term_memory = self.short_term_memory[-self.max_short_term_items:]
def update_working_memory(self, key: str, value: Any) -> None:
"""Update a value in working memory."""
self.working_memory[key] = value
def clear_working_memory(self) -> None:
"""Clear working memory for a new task."""
self.working_memory = {}
def add_to_long_term(self, item: Dict[str, Any]) -> None:
"""Add an item to long-term memory."""
# Add timestamp if not present
if "timestamp" not in item:
item["timestamp"] = datetime.now().isoformat()
self.long_term_memory.append(item)
def search_long_term(self, query: str, limit: int = 5) -> List[Dict[str, Any]]:
"""Search long-term memory for relevant items.
In a real implementation, this would use embeddings and vector search.
This is a simplified version that does basic keyword matching.
"""
results = []
for item in self.long_term_memory:
# Simple keyword matching (would use embeddings in practice)
content = json.dumps(item).lower()
if query.lower() in content:
results.append(item)
if len(results) >= limit:
break
return results
def get_relevant_context(self, query: str) -> Dict[str, Any]:
"""Get all relevant context for a query."""
# Combine short-term and relevant long-term memory
long_term_results = self.search_long_term(query)
return {
"short_term": self.short_term_memory,
"working_memory": self.working_memory,
"relevant_long_term": long_term_results
}
def summarize_short_term(self) -> str:
"""Summarize short-term memory for long-term storage.
In a real implementation, this would use an LLM to generate a summary.
This is a simplified placeholder.
"""
# Placeholder for LLM-based summarization
return f"Summary of {len(self.short_term_memory)} recent interactions"
def commit_short_term_to_long_term(self) -> None:
"""Summarize short-term memory and commit to long-term memory."""
if not self.short_term_memory:
return
summary = self.summarize_short_term()
self.add_to_long_term({
"type": "conversation_summary",
"summary": summary,
"original_items": self.short_term_memory.copy()
})
# Example usage
memory = MemorySystem()
# Add user interaction to short-term memory
memory.add_to_short_term({
"type": "user_message",
"content": "I need to find information about climate change impacts."
})
# Update working memory with current task
memory.update_working_memory("current_task", "research_climate_change")
memory.update_working_memory("search_queries_used", ["climate change impacts", "sea level rise"])
# Add some knowledge to long-term memory
memory.add_to_long_term({
"type": "learned_fact",
"topic": "climate_change",
"fact": "Global sea levels rose about 8-9 inches since 1880."
})
# Get relevant context for a query
context = memory.get_relevant_context("climate change sea level")
print(json.dumps(context, indent=2))
# At the end of a session, commit short-term to long-term
memory.commit_short_term_to_long_term()
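As the docstring for `search_long_term` notes, production systems replace keyword matching with embedding-based vector search. The toy sketch below conveys the idea by substituting a bag-of-words count vector for a real embedding model (the `embed`, `cosine`, and `vector_search` helpers are illustrative, not a real library API):

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector. A real system would
    call an embedding model and store dense float vectors instead."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def vector_search(query: str, documents: list, limit: int = 2) -> list:
    """Rank stored items by similarity to the query vector."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:limit]


docs = [
    "Global sea levels rose about 8-9 inches since 1880.",
    "Solar panels convert sunlight into electricity.",
]
print(vector_search("sea level rise since 1880", docs, limit=1))
```

Swapping this logic into `search_long_term` (with real embeddings and a vector index) is what turns the memory system from exact-match recall into semantic recall.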
4. Planning and Execution
Planning and execution systems enable agents to break down complex tasks into manageable steps and carry them out effectively:
- Goal Decomposition: Breaking high-level goals into subgoals
- Step Sequencing: Determining the optimal order of operations
- Progress Tracking: Monitoring completion of steps
- Error Handling: Adapting to failures and unexpected outcomes
from typing import List, Dict, Any, Optional
from enum import Enum
from datetime import datetime
import json
import uuid
class TaskStatus(Enum):
PENDING = "pending"
IN_PROGRESS = "in_progress"
COMPLETED = "completed"
FAILED = "failed"
class Task:
def __init__(self, description: str, dependencies: List[str] = None):
self.id = str(uuid.uuid4())[:8]
self.description = description
self.dependencies = dependencies or []
self.status = TaskStatus.PENDING
self.result = None
self.error = None
def to_dict(self) -> Dict[str, Any]:
return {
"id": self.id,
"description": self.description,
"dependencies": self.dependencies,
"status": self.status.value,
"result": self.result,
"error": self.error
}
class Plan:
def __init__(self, goal: str):
self.id = str(uuid.uuid4())[:8]
self.goal = goal
self.tasks = {} # id -> Task
self.created_at = datetime.now().isoformat()
def add_task(self, description: str, dependencies: List[str] = None) -> str:
"""Add a task to the plan and return its ID."""
task = Task(description, dependencies)
self.tasks[task.id] = task
return task.id
def get_next_tasks(self) -> List[Task]:
"""Get tasks that are ready to be executed (all dependencies satisfied)."""
next_tasks = []
for task in self.tasks.values():
if task.status == TaskStatus.PENDING:
dependencies_met = True
for dep_id in task.dependencies:
if dep_id not in self.tasks or self.tasks[dep_id].status != TaskStatus.COMPLETED:
dependencies_met = False
break
if dependencies_met:
next_tasks.append(task)
return next_tasks
def update_task(self, task_id: str, status: TaskStatus, result: Any = None, error: str = None) -> None:
"""Update the status and result of a task."""
if task_id not in self.tasks:
raise ValueError(f"Task {task_id} not found in plan")
task = self.tasks[task_id]
task.status = status
task.result = result
task.error = error
def is_completed(self) -> bool:
"""Check if all tasks in the plan are completed."""
return all(task.status == TaskStatus.COMPLETED for task in self.tasks.values())
def has_failed_tasks(self) -> bool:
"""Check if any tasks have failed."""
return any(task.status == TaskStatus.FAILED for task in self.tasks.values())
def get_summary(self) -> Dict[str, Any]:
"""Get a summary of the plan's status."""
total = len(self.tasks)
completed = sum(1 for task in self.tasks.values() if task.status == TaskStatus.COMPLETED)
in_progress = sum(1 for task in self.tasks.values() if task.status == TaskStatus.IN_PROGRESS)
pending = sum(1 for task in self.tasks.values() if task.status == TaskStatus.PENDING)
failed = sum(1 for task in self.tasks.values() if task.status == TaskStatus.FAILED)
return {
"id": self.id,
"goal": self.goal,
"total_tasks": total,
"completed": completed,
"in_progress": in_progress,
"pending": pending,
"failed": failed,
"is_completed": self.is_completed(),
"has_failed": self.has_failed_tasks()
}
def to_dict(self) -> Dict[str, Any]:
"""Convert the plan to a dictionary."""
return {
"id": self.id,
"goal": self.goal,
"tasks": {task_id: task.to_dict() for task_id, task in self.tasks.items()},
"created_at": self.created_at,
"summary": self.get_summary()
}
class PlanningSystem:
def __init__(self, llm):
self.llm = llm
self.plans = {} # plan_id -> Plan
def create_plan(self, goal: str) -> str:
"""Create a new plan for a goal and return its ID."""
# Use LLM to break down the goal into tasks
planning_prompt = f"""
Create a step-by-step plan to achieve this goal: {goal}
For each step, consider:
1. What needs to be done
2. What information is needed
3. What dependencies exist (which steps must be completed first)
Format your response as a numbered list of steps.
"""
plan_text = self.llm.predict(planning_prompt)
# Create a new plan
plan = Plan(goal)
# Parse the plan text and add tasks
# This is a simplified parser that assumes a numbered list
lines = plan_text.strip().split('\n')
task_map = {} # step number -> task_id
for line in lines:
line = line.strip()
if not line:
continue
# Try to extract step number
parts = line.split('.', 1)
if len(parts) == 2 and parts[0].strip().isdigit():
step_num = int(parts[0].strip())
description = parts[1].strip()
# Assume dependencies are previous steps
dependencies = [task_map[i] for i in range(1, step_num) if i in task_map]
task_id = plan.add_task(description, dependencies)
task_map[step_num] = task_id
# Store the plan
self.plans[plan.id] = plan
return plan.id
def execute_plan(self, plan_id: str, tool_framework) -> Dict[str, Any]:
"""Execute a plan using the provided tool framework."""
if plan_id not in self.plans:
return {"error": f"Plan {plan_id} not found"}
plan = self.plans[plan_id]
# Continue executing until plan is completed or all available tasks are attempted
while not plan.is_completed() and not plan.has_failed_tasks():
next_tasks = plan.get_next_tasks()
if not next_tasks:
break
# Execute each ready task
for task in next_tasks:
# Update task status
plan.update_task(task.id, TaskStatus.IN_PROGRESS)
# Determine which tool to use and how to use it
tool_selection_prompt = f"""
Task: {task.description}
Available tools:
{tool_framework.get_tool_descriptions()}
Which tool should be used for this task? If no tool is needed, respond with "NONE".
If a tool is needed, specify the tool name and the parameters in this format:
TOOL: [tool_name]
PARAMS: [JSON formatted parameters]
"""
tool_response = self.llm.predict(tool_selection_prompt)
# Parse the tool response
tool_name = None
tool_params = {}
if "TOOL:" in tool_response:
tool_lines = tool_response.split('\n')
for line in tool_lines:
if line.startswith("TOOL:"):
tool_name = line.replace("TOOL:", "").strip()
elif line.startswith("PARAMS:"):
params_text = line.replace("PARAMS:", "").strip()
try:
tool_params = json.loads(params_text)
                            except json.JSONDecodeError:
                                # If JSON parsing fails, fall back to an empty dict
                                tool_params = {}
# Execute the tool if specified
if tool_name and tool_name != "NONE":
try:
result = tool_framework.execute_tool(tool_name, **tool_params)
plan.update_task(task.id, TaskStatus.COMPLETED, result=result)
except Exception as e:
plan.update_task(task.id, TaskStatus.FAILED, error=str(e))
else:
# No tool needed, mark as completed
plan.update_task(task.id, TaskStatus.COMPLETED)
return plan.to_dict()
def get_plan(self, plan_id: str) -> Optional[Dict[str, Any]]:
"""Get a plan by ID."""
if plan_id not in self.plans:
return None
return self.plans[plan_id].to_dict()
# Example usage (assuming llm and tool_framework are defined)
# planning_system = PlanningSystem(llm)
# plan_id = planning_system.create_plan("Research the impact of climate change on agriculture")
# plan_result = planning_system.execute_plan(plan_id, tool_framework)
# print(json.dumps(plan_result, indent=2))
5. User Interaction Interface
The user interaction interface enables communication between the agent and its users:
- Input Processing: Interpreting user requests and commands
- Output Generation: Creating clear, helpful responses
- Clarification Mechanisms: Asking for additional information when needed
- Progress Updates: Keeping users informed about task status
from typing import List, Dict, Any, Optional, Callable
import json
class UserInteractionInterface:
def __init__(self, llm, memory_system):
self.llm = llm
self.memory_system = memory_system
self.response_formatters = {
"text": self._format_text_response,
"search_results": self._format_search_results,
"error": self._format_error_response,
"plan": self._format_plan_response,
"clarification": self._format_clarification_request
}
def process_user_input(self, user_input: str) -> Dict[str, Any]:
"""Process user input and store in memory."""
# Store in memory
self.memory_system.add_to_short_term({
"type": "user_input",
"content": user_input
})
# Analyze the input
analysis_prompt = f"""
Analyze this user input: "{user_input}"
Determine:
1. The primary intent (question, command, clarification, etc.)
2. The main topic or subject
3. Any specific constraints or preferences mentioned
4. Whether additional information is needed from the user
Format your response as JSON with these fields:
{{
"intent": "string",
"topic": "string",
"constraints": ["string"],
"needs_clarification": boolean,
"clarification_question": "string" (if needs_clarification is true)
}}
"""
analysis_response = self.llm.predict(analysis_prompt)
# Parse the JSON response
try:
analysis = json.loads(analysis_response)
        except json.JSONDecodeError:
# Fallback if JSON parsing fails
analysis = {
"intent": "unknown",
"topic": "unknown",
"constraints": [],
"needs_clarification": False
}
        # Store analysis in working memory (use .get in case the model omitted keys)
        self.memory_system.update_working_memory("current_intent", analysis.get("intent", "unknown"))
        self.memory_system.update_working_memory("current_topic", analysis.get("topic", "unknown"))
return analysis
def generate_response(self, response_type: str, content: Any) -> str:
"""Generate a formatted response based on type and content."""
if response_type in self.response_formatters:
formatter = self.response_formatters[response_type]
response = formatter(content)
else:
# Default to text response
response = str(content)
# Store in memory
self.memory_system.add_to_short_term({
"type": "agent_response",
"response_type": response_type,
"content": response
})
return response
def request_clarification(self, question: str) -> str:
"""Generate a clarification request."""
return self.generate_response("clarification", question)
def _format_text_response(self, content: str) -> str:
"""Format a simple text response."""
return content
def _format_search_results(self, results: List[Dict[str, str]]) -> str:
"""Format search results."""
if not results:
return "I couldn't find any relevant information."
formatted = "Here's what I found:\n\n"
for i, result in enumerate(results, 1):
formatted += f"{i}. {result['title']}\n {result['url']}\n"
return formatted
def _format_error_response(self, error: str) -> str:
"""Format an error response."""
return f"I encountered an issue: {error}\n\nCould you try rephrasing your request or providing more information?"
def _format_plan_response(self, plan: Dict[str, Any]) -> str:
"""Format a plan response."""
formatted = f"I've created a plan to achieve your goal: {plan['goal']}\n\n"
# Add summary
summary = plan['summary']
formatted += f"Progress: {summary['completed']}/{summary['total_tasks']} tasks completed\n\n"
# Add tasks
formatted += "Tasks:\n"
for task_id, task in plan['tasks'].items():
            status_symbol = {"completed": "✓", "in_progress": "⋯", "failed": "✗"}.get(task['status'], "○")
formatted += f"{status_symbol} {task['description']}\n"
if task['result']:
formatted += f" Result: {task['result']}\n"
if task['error']:
formatted += f" Error: {task['error']}\n"
return formatted
def _format_clarification_request(self, question: str) -> str:
"""Format a clarification request."""
return f"To better assist you, I need some additional information:\n\n{question}"
# Example usage (assuming llm and memory_system are defined)
# interface = UserInteractionInterface(llm, memory_system)
# analysis = interface.process_user_input("Can you help me find information about renewable energy?")
#
# if analysis["needs_clarification"]:
# response = interface.request_clarification(analysis["clarification_question"])
# else:
# # Simulate search results
# search_results = [
# {"title": "Renewable Energy Explained", "url": "https://example.com/renewable-energy"},
# {"title": "Solar and Wind Power Basics", "url": "https://example.com/solar-wind"}
# ]
# response = interface.generate_response("search_results", search_results)
#
# print(response)
Building Your First AI Agent
Let's put everything together to build a simple but functional AI agent that can help with research tasks.
Step 1: Define the Agent's Purpose and Capabilities
Research Assistant Agent Specification:
- Purpose: Help users find, summarize, and synthesize information on specific topics
- Capabilities:
- Search for information online
- Extract key points from articles and websites
- Summarise findings in structured formats
- Answer questions based on gathered information
- Tools:
- Web search
- Web browsing
- Text extraction
- Summarisation
Step 2: Implement the Core Components
import os
import json
import requests
from typing import List, Dict, Any, Optional
from pydantic import BaseModel, Field
from datetime import datetime
# For simplicity, we'll use a mock LLM class
class MockLLM:
def predict(self, prompt: str) -> str:
"""Simulate LLM prediction (in a real implementation, this would call an actual LLM API)."""
# This is just a placeholder that returns a simple response based on the prompt
if "search" in prompt.lower():
return "I'll search for that information."
elif "summarise" in prompt.lower():
return "Here's a summary of the key points..."
elif "analyze" in prompt.lower():
return "Based on my analysis..."
else:
return "I understand your request and will help with that."
# Tool definitions
class SearchParameters(BaseModel):
query: str = Field(..., description="The search query")
num_results: int = Field(5, description="Number of results to return")
def search_function(query: str, num_results: int = 5) -> List[Dict[str, str]]:
"""Simulated search function."""
# In a real implementation, this would call a search API
return [{"title": f"Result {i} for {query}", "url": f"https://example.com/{i}"} for i in range(num_results)]
class BrowseParameters(BaseModel):
url: str = Field(..., description="The URL to browse")
def browse_function(url: str) -> str:
"""Simulated web browsing function."""
# In a real implementation, this would fetch and parse a webpage
return f"Content from {url}: This is simulated webpage content for demonstration purposes."
class SummariseParameters(BaseModel):
text: str = Field(..., description="The text to summarise")
max_length: int = Field(200, description="Maximum length of the summary")
def summarise_function(text: str, max_length: int = 200) -> str:
"""Simulated text summarisation function."""
# In a real implementation, this would use an LLM to summarise text
if len(text) > max_length:
return text[:max_length] + "..."
return text
# Tool class
class Tool:
    def __init__(self, name: str, description: str, function, schema: BaseModel = None):
        self.name = name
        self.description = description
        self.function = function
        self.schema = schema

    def __call__(self, **kwargs):
        """Execute the tool with the provided arguments."""
        # Validate arguments if schema is provided
        if self.schema:
            validated_args = self.schema(**kwargs).dict()
            return self.function(**validated_args)
        return self.function(**kwargs)

    def get_schema(self) -> Dict[str, Any]:
        """Get the JSON schema for this tool."""
        if self.schema:
            schema_dict = self.schema.schema()
            return {
                "name": self.name,
                "description": self.description,
                "parameters": schema_dict
            }
        return {
            "name": self.name,
            "description": self.description,
            "parameters": {"type": "object", "properties": {}}
        }
# Memory system
class MemorySystem:
    def __init__(self, max_short_term_items: int = 10):
        self.short_term_memory = []   # Recent interactions
        self.working_memory = {}      # Current task state
        self.long_term_memory = []    # Persistent knowledge
        self.max_short_term_items = max_short_term_items

    def add_to_short_term(self, item: Dict[str, Any]) -> None:
        """Add an item to short-term memory."""
        # Add timestamp if not present
        if "timestamp" not in item:
            item["timestamp"] = datetime.now().isoformat()
        self.short_term_memory.append(item)
        # Trim if exceeding max size
        if len(self.short_term_memory) > self.max_short_term_items:
            self.short_term_memory = self.short_term_memory[-self.max_short_term_items:]

    def update_working_memory(self, key: str, value: Any) -> None:
        """Update a value in working memory."""
        self.working_memory[key] = value

    def get_short_term_memory(self) -> List[Dict[str, Any]]:
        """Get all items in short-term memory."""
        return self.short_term_memory

    def get_working_memory(self) -> Dict[str, Any]:
        """Get all items in working memory."""
        return self.working_memory
# Research Assistant Agent
class ResearchAssistantAgent:
    def __init__(self):
        # Initialize LLM
        self.llm = MockLLM()
        # Initialize memory
        self.memory = MemorySystem()
        # Initialize tools
        self.tools = {
            "search": Tool(
                name="search",
                description="Search the web for information",
                function=search_function,
                schema=SearchParameters
            ),
            "browse": Tool(
                name="browse",
                description="Browse a specific webpage",
                function=browse_function,
                schema=BrowseParameters
            ),
            "summarise": Tool(
                name="summarise",
                description="Summarise a piece of text",
                function=summarise_function,
                schema=SummariseParameters
            )
        }

    def get_tool_descriptions(self) -> str:
        """Get formatted descriptions of all available tools."""
        descriptions = []
        for name, tool in self.tools.items():
            descriptions.append(f"{name}: {tool.description}")
        return "\n".join(descriptions)
    def process_user_input(self, user_input: str) -> str:
        """Process user input and generate a response."""
        # Store user input in memory
        self.memory.add_to_short_term({
            "type": "user_input",
            "content": user_input
        })

        # Determine the appropriate action
        action_prompt = f"""
User input: "{user_input}"

Based on this input, determine what action to take.

Available tools:
{self.get_tool_descriptions()}

If a tool should be used, respond in this format:
TOOL: [tool_name]
PARAMS: [JSON formatted parameters]

If no tool is needed, respond with:
RESPONSE: [Your direct response to the user]
"""
        action_decision = self.llm.predict(action_prompt)

        # Parse the action decision
        if "TOOL:" in action_decision:
            # Extract tool name and parameters
            tool_name = None
            tool_params = {}
            for line in action_decision.split('\n'):
                if line.startswith("TOOL:"):
                    tool_name = line.replace("TOOL:", "").strip()
                elif line.startswith("PARAMS:"):
                    params_text = line.replace("PARAMS:", "").strip()
                    try:
                        tool_params = json.loads(params_text)
                    except json.JSONDecodeError:
                        # If JSON parsing fails, use an empty dict
                        tool_params = {}

            # Execute the tool
            if tool_name in self.tools:
                try:
                    tool_result = self.tools[tool_name](**tool_params)
                    # Store tool execution in memory
                    self.memory.add_to_short_term({
                        "type": "tool_execution",
                        "tool": tool_name,
                        "params": tool_params,
                        "result": tool_result
                    })
                    # Generate response based on tool result
                    response_prompt = f"""
User input: "{user_input}"
Tool used: {tool_name}
Tool result: {tool_result}

Generate a helpful response to the user based on this information.
"""
                    response = self.llm.predict(response_prompt)
                    # Store response in memory
                    self.memory.add_to_short_term({
                        "type": "agent_response",
                        "content": response
                    })
                    return response
                except Exception as e:
                    error_message = f"Error executing tool {tool_name}: {e}"
                    # Store error in memory
                    self.memory.add_to_short_term({
                        "type": "error",
                        "tool": tool_name,
                        "error": str(e)
                    })
                    return error_message
            else:
                return f"Tool {tool_name} not found. Available tools: {', '.join(self.tools.keys())}"
        elif "RESPONSE:" in action_decision:
            # Extract direct response
            response_parts = action_decision.split("RESPONSE:")
            if len(response_parts) > 1:
                response = response_parts[1].strip()
                # Store response in memory
                self.memory.add_to_short_term({
                    "type": "agent_response",
                    "content": response
                })
                return response

        # Fallback response
        fallback_response = "I'm not sure how to help with that. Could you provide more details or rephrase your request?"
        # Store fallback response in memory
        self.memory.add_to_short_term({
            "type": "agent_response",
            "content": fallback_response
        })
        return fallback_response

    def get_conversation_history(self) -> List[Dict[str, Any]]:
        """Get the conversation history from memory."""
        return [item for item in self.memory.get_short_term_memory()
                if item["type"] in ["user_input", "agent_response"]]
# Example usage
agent = ResearchAssistantAgent()

# Simulate a conversation
responses = []
responses.append(agent.process_user_input("I need to research the impact of artificial intelligence on healthcare"))
responses.append(agent.process_user_input("Can you find information about AI diagnostic tools?"))
responses.append(agent.process_user_input("Summarise the key benefits of AI in healthcare"))

# Print conversation history
print("Conversation History:")
for item in agent.get_conversation_history():
    if item["type"] == "user_input":
        print(f"User: {item['content']}")
    else:
        print(f"Agent: {item['content']}")
Step 3: Enhance with Advanced Features
Once you have a basic agent working, you can enhance it with more advanced features:
Advanced Features to Add:
- Multi-step Planning: Break complex research tasks into steps
- Source Tracking: Keep track of where information came from
- Fact Verification: Cross-check information across multiple sources
- Personalisation: Remember user preferences and adapt accordingly
- Visualisation Generation: Create charts or diagrams to illustrate findings
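The first of these, multi-step planning, can be sketched as an ordered list of steps that the agent executes in sequence. The `PlanStep` structure and the fixed search → browse → summarise pipeline below are illustrative assumptions, not part of the agent above; a real planner would ask the LLM to generate the steps:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class PlanStep:
    tool: str            # which tool to invoke ("search", "browse", "summarise")
    params: dict         # arguments for the tool
    done: bool = False
    result: object = None

def plan_research(topic: str) -> List[PlanStep]:
    """Break a research topic into a fixed search -> browse -> summarise pipeline.
    A real planner would prompt the LLM to produce these steps dynamically."""
    return [
        PlanStep(tool="search", params={"query": topic, "num_results": 3}),
        PlanStep(tool="browse", params={"url": "https://example.com/0"}),
        PlanStep(tool="summarise", params={"text": "", "max_length": 200}),
    ]

def run_plan(steps: List[PlanStep], tools: Dict[str, callable]) -> dict:
    """Execute steps in order, collecting each result into a shared state dict."""
    state = {}
    for step in steps:
        step.result = tools[step.tool](**step.params)
        step.done = True
        state[step.tool] = step.result
    return state
```

Keeping results keyed by tool in a state dict also gives you a head start on source tracking: every entry records which tool produced it.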
Implementation Tips
- Start Simple: Begin with core functionality and add features incrementally
- Test Thoroughly: Test each component individually before integration
- Monitor Performance: Track response quality, tool usage, and error rates
- Gather Feedback: Use real user interactions to identify improvement areas
- Iterate Rapidly: Continuously refine based on testing and feedback
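The "Monitor Performance" tip can start very small: a counter of calls and failures per tool. The class below is a minimal sketch; the `AgentMetrics` name and `record` API are assumptions, not part of the agent above:

```python
from collections import Counter

class AgentMetrics:
    """Track tool usage and error rates across a session."""
    def __init__(self):
        self.tool_calls = Counter()
        self.tool_errors = Counter()

    def record(self, tool_name: str, ok: bool) -> None:
        """Record one tool invocation and whether it succeeded."""
        self.tool_calls[tool_name] += 1
        if not ok:
            self.tool_errors[tool_name] += 1

    def error_rate(self, tool_name: str) -> float:
        """Fraction of calls to this tool that failed (0.0 if never called)."""
        calls = self.tool_calls[tool_name]
        return self.tool_errors[tool_name] / calls if calls else 0.0
```

Calling `record(tool_name, ok=...)` from the agent's tool-execution branch is enough to surface which tools fail most often.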
Common AI Agent Patterns and Anti-Patterns
Understanding common patterns and anti-patterns will help you build more effective AI agents.
Effective Patterns
Successful AI Agent Patterns:
| Pattern | Description | Benefits |
|---|---|---|
| Separation of Concerns | Divide agent functionality into distinct components | Modularity, maintainability, easier testing |
| Progressive Disclosure | Reveal capabilities and options gradually as needed | Reduced cognitive load, focused interactions |
| Explicit Reasoning | Make reasoning process visible to users | Transparency, trust, easier debugging |
| Graceful Degradation | Maintain functionality when optimal resources unavailable | Reliability, consistent user experience |
| Contextual Memory | Maintain relevant context without overwhelming the system | Coherent conversations, personalised responses |
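As a concrete example of Graceful Degradation, an agent can wrap a preferred resource with ordered fallbacks and drop down the list on failure. This is a minimal sketch under assumed resource functions:

```python
def with_fallbacks(primary, *fallbacks):
    """Return a callable that tries primary, then each fallback in order."""
    def call(*args, **kwargs):
        for fn in (primary, *fallbacks):
            try:
                return fn(*args, **kwargs)
            except Exception:
                continue  # degrade to the next option
        raise RuntimeError("all options failed")
    return call
```

For instance, wrapping a live search API with a cached-results fallback lets the agent keep answering, with reduced freshness, when the API is down.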
Anti-Patterns to Avoid
AI Agent Anti-Patterns:
| Anti-Pattern | Description | Consequences |
|---|---|---|
| Monolithic Design | Building the entire agent as a single, tightly-coupled system | Difficult to maintain, test, or extend |
| Capability Overload | Adding too many features without clear organisation | User confusion, diluted effectiveness |
| Excessive Autonomy | Giving agents too much freedom without appropriate guardrails | Unpredictable behaviour, potentially harmful actions |
| Context Flooding | Providing too much context to the LLM | Token waste, diluted focus, increased costs |
| Tool Proliferation | Adding too many similar or overlapping tools | Decision paralysis, inefficient tool selection |
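Context Flooding in particular has a cheap mitigation: cap what you send to the LLM. The sketch below keeps only the most recent messages that fit a rough character budget (the four-characters-per-token estimate is a common heuristic, not an exact tokeniser):

```python
def trim_context(messages, max_tokens=1000, chars_per_token=4):
    """Keep the most recent messages that fit a rough token budget."""
    budget = max_tokens * chars_per_token
    kept, used = [], 0
    for msg in reversed(messages):       # newest first
        cost = len(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order
```

Applied to the agent's short-term memory before building `action_prompt`, this bounds token spend while keeping the freshest context.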
Next Steps in Your AI Journey
Now that you understand the fundamentals of AI agents, you're ready to explore more advanced agent design patterns and architectures.
Key Takeaways from This Section:
- AI agents combine LLMs with tools, memory, and planning to create autonomous systems
- The agent cognition loop (Observe, Orient, Decide, Act, Learn) drives agent behaviour
- Different types of agents serve different purposes, from assistive to fully autonomous
- Core components include the LLM, tools, memory, and planning module
- Benefits include automation and efficiency, while challenges include reliability, control, and security
- Ethical development requires transparency, accountability, bias mitigation, and user control
In the next section, we delve into specific Agentic Design Patterns, providing blueprints for constructing robust and effective AI agents for various applications.
Continue to Agentic Design Patterns →