Step 8: Test & Optimize
Now that you've built your value investing AI agent, the next step is to test and optimize it so that its recommendations are accurate, reliable, and useful. This step covers unit tests, backtesting with historical data, optimization strategies, and A/B testing of different agent versions.
Why Testing and Optimization Matter
Testing and optimization serve several important purposes for your value investing AI agent:
Key Testing Objectives
- Verify accuracy of financial data retrieval and calculations
- Validate investment recommendations against established value investing principles
- Ensure the agent performs well across different market sectors and conditions
- Identify and fix bugs, edge cases, and performance issues
Data Accuracy
Your agent's recommendations are only as good as the data it works with. Testing ensures that:
- Financial data is correctly retrieved from APIs and databases
- Calculations (ratios, scores, etc.) are mathematically correct
- Missing or anomalous data is handled appropriately
- Data updates are timely and reflect current market conditions
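The data checks above can be sketched as a small validation helper. This is only an illustration, not part of the agent built in earlier steps: the function name `validate_financial_data` and the plausibility bounds in `PLAUSIBLE_RANGES` are assumptions, and you would adapt both to your own data source.

```python
import math

# Hypothetical plausibility bounds (assumptions for illustration only)
PLAUSIBLE_RANGES = {
    'pe_ratio': (0, 500),
    'pb_ratio': (0, 100),
    'roe': (-1.0, 2.0),
    'debt_to_equity': (0, 20),
}

def validate_financial_data(data):
    """Return a list of human-readable issues found in a metrics dict."""
    issues = []
    for metric, (low, high) in PLAUSIBLE_RANGES.items():
        value = data.get(metric)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            issues.append(f"{metric}: missing")
        elif not low <= value <= high:
            issues.append(f"{metric}: {value} outside plausible range [{low}, {high}]")
    return issues

# A negative P/E (loss-making company) and a NaN ROE are both flagged
sample = {'pe_ratio': -4.2, 'pb_ratio': 1.5, 'roe': float('nan')}
for issue in validate_financial_data(sample):
    print(issue)
```

Running a check like this before scoring lets the agent report why a metric was skipped instead of silently producing a lower score.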
Recommendation Validity
Your agent should provide recommendations that align with value investing principles:
- Companies with strong fundamentals and low valuations should receive positive ratings
- Overvalued companies should receive appropriate caution signals
- Recommendations should be consistent with the criteria you defined in Step 1
- Explanations should be logical and help users understand the reasoning
Robustness Across Markets
Your agent should work well across different types of companies and market conditions:
- Different sectors (tech, finance, healthcare, etc.) have different typical financial profiles
- Companies of different sizes (small-cap, mid-cap, large-cap) should be analyzed appropriately
- The agent should adapt to different market cycles (bull markets, bear markets, etc.)
- International stocks may require different considerations than domestic ones
Technical Performance
Your agent should function reliably and efficiently:
- Response times should be reasonable, even when analyzing multiple stocks
- Error handling should be robust and provide useful feedback
- Edge cases (e.g., IPOs with limited history, companies with unusual financials) should be handled gracefully
- Resource usage (memory, CPU, API calls) should be optimized
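One way to check response times and error handling together is to wrap each per-stock analysis in a timer and an exception guard. The sketch below is a minimal illustration: `analyze_batch` and `fake_analyze` are hypothetical names, and `fake_analyze` stands in for whatever function your agent exposes for a single ticker.

```python
import time

def analyze_batch(tickers, analyze_one):
    """Run analyze_one on each ticker, isolating failures and timing each call."""
    results, errors, timings = {}, {}, {}
    for ticker in tickers:
        start = time.perf_counter()
        try:
            results[ticker] = analyze_one(ticker)
        except Exception as exc:  # keep going; record the failure instead of crashing
            errors[ticker] = f"{type(exc).__name__}: {exc}"
        finally:
            timings[ticker] = time.perf_counter() - start
    return results, errors, timings

# Stand-in for a real analysis function (an assumption for this demo)
def fake_analyze(ticker):
    if ticker == 'NEWIPO':
        raise ValueError("insufficient price history")
    return {'ticker': ticker, 'rating': 'Hold'}

results, errors, timings = analyze_batch(['AAA', 'NEWIPO'], fake_analyze)
print(errors)  # {'NEWIPO': 'ValueError: insufficient price history'}
```

The `timings` dictionary can then be compared against whatever response-time budget you consider reasonable for your use case.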
Testing Strategies
Let's explore different strategies for testing your value investing AI agent, starting with unit tests that run the scoring logic against known inputs:
# test_value_agent.py
import unittest
import pandas as pd
import numpy as np
from unittest.mock import patch, MagicMock

# Import your agent class
# In a real test, you would import your actual agent class
# For this example, we'll define a simplified version
class SimpleValueInvestingAgent:
    def __init__(self):
        self.criteria = {
            'pe_ratio': {'max': 15, 'weight': 0.15, 'better': 'lower'},
            'pb_ratio': {'max': 3, 'weight': 0.15, 'better': 'lower'},
            'roe': {'min': 0.15, 'weight': 0.15, 'better': 'higher'},
            'debt_to_equity': {'max': 1.0, 'weight': 0.1, 'better': 'lower'},
            'fcf_yield': {'min': 0.02, 'weight': 0.15, 'better': 'higher'},
            'dividend_yield': {'min': 0.01, 'weight': 0.1, 'better': 'higher'},
            'earnings_growth': {'min': 0.05, 'weight': 0.1, 'better': 'higher'},
            'margin_of_safety': {'min': 0.2, 'weight': 0.1, 'better': 'higher'}
        }

    def fetch_data(self, ticker):
        # This would normally call an API
        pass

    def analyze(self, financial_data):
        # Simplified analysis logic
        if not financial_data:
            return None
        results = {
            'company_name': financial_data.get('name', 'Unknown Company'),
            'ticker': financial_data.get('ticker', 'Unknown'),
            'total_score': 0,
            'max_possible_score': sum(criterion['weight'] for criterion in self.criteria.values()),
            'metric_scores': {},
            'explanations': []
        }
        # Calculate scores for each metric (missing or NaN metrics are skipped)
        for metric_name, criterion in self.criteria.items():
            if metric_name in financial_data and not pd.isna(financial_data[metric_name]):
                value = financial_data[metric_name]
                score = 0
                if criterion['better'] == 'lower' and 'max' in criterion:
                    # Negative ratios (e.g. a negative P/E from losses) earn no credit
                    if 0 <= value <= criterion['max']:
                        score = criterion['weight'] * (1 - value / criterion['max'])
                elif criterion['better'] == 'higher' and 'min' in criterion:
                    if value >= criterion['min']:
                        score = criterion['weight'] * min(1, (value - criterion['min']) / (criterion['min'] * 2))
                results['metric_scores'][metric_name] = score
                results['total_score'] += score
        # Calculate percentage score
        if results['max_possible_score'] > 0:
            results['percentage_score'] = (results['total_score'] / results['max_possible_score']) * 100
        else:
            results['percentage_score'] = 0
        # Generate recommendation
        score = results['percentage_score']
        if score >= 70:
            results['rating'] = "Strong Buy"
        elif score >= 60:
            results['rating'] = "Buy"
        elif score >= 40:
            results['rating'] = "Hold"
        elif score >= 30:
            results['rating'] = "Sell"
        else:
            results['rating'] = "Strong Sell"
        return results
class TestValueInvestingAgent(unittest.TestCase):
    """Test cases for the Value Investing Agent."""

    def setUp(self):
        """Set up test fixtures."""
        self.agent = SimpleValueInvestingAgent()
        # Sample test data for a value stock.
        # The values sit well inside every threshold: under the scoring rule above,
        # a stock that only just clears each threshold scores near 0%, so clearly
        # strong fundamentals are needed to reach a "Buy" rating (this set scores ~80%).
        self.value_stock_data = {
            'ticker': 'VALUE',
            'name': 'Value Company',
            'pe_ratio': 6.0,
            'pb_ratio': 0.9,
            'roe': 0.40,
            'debt_to_equity': 0.2,
            'fcf_yield': 0.06,
            'dividend_yield': 0.03,
            'earnings_growth': 0.15,
            'margin_of_safety': 0.40
        }
        # Sample test data for an overvalued stock
        self.overvalued_stock_data = {
            'ticker': 'OVER',
            'name': 'Overvalued Company',
            'pe_ratio': 50.0,
            'pb_ratio': 10.0,
            'roe': 0.10,
            'debt_to_equity': 2.0,
            'fcf_yield': 0.01,
            'dividend_yield': 0.005,
            'earnings_growth': 0.03,
            'margin_of_safety': 0.05
        }
        # Sample test data with missing values
        self.incomplete_stock_data = {
            'ticker': 'INCOMPLETE',
            'name': 'Incomplete Data Company',
            'pe_ratio': 12.0,
            'pb_ratio': np.nan,
            'roe': 0.18,
            'debt_to_equity': np.nan,
            'fcf_yield': 0.03,
            'dividend_yield': np.nan,
            'earnings_growth': np.nan,
            'margin_of_safety': 0.15
        }

    def test_analyze_value_stock(self):
        """Test that a value stock receives a positive rating."""
        result = self.agent.analyze(self.value_stock_data)
        # Check that the analysis was performed
        self.assertIsNotNone(result)
        # Check that the score is high (should be a "Buy" or "Strong Buy")
        self.assertGreaterEqual(result['percentage_score'], 60)
        self.assertIn(result['rating'], ["Buy", "Strong Buy"])
        # Check individual metric scores
        self.assertGreater(result['metric_scores']['pe_ratio'], 0)
        self.assertGreater(result['metric_scores']['roe'], 0)

    def test_analyze_overvalued_stock(self):
        """Test that an overvalued stock receives a negative rating."""
        result = self.agent.analyze(self.overvalued_stock_data)
        # Check that the analysis was performed
        self.assertIsNotNone(result)
        # Check that the score is low (should be a "Sell" or "Strong Sell")
        self.assertLessEqual(result['percentage_score'], 40)
        self.assertIn(result['rating'], ["Sell", "Strong Sell"])
        # Check individual metric scores
        self.assertEqual(result['metric_scores'].get('pe_ratio', 0), 0)
        self.assertEqual(result['metric_scores'].get('pb_ratio', 0), 0)

    def test_analyze_incomplete_data(self):
        """Test that the agent handles incomplete data gracefully."""
        result = self.agent.analyze(self.incomplete_stock_data)
        # Check that the analysis was performed despite missing data
        self.assertIsNotNone(result)
        # Check that only available metrics were scored
        self.assertIn('pe_ratio', result['metric_scores'])
        self.assertIn('roe', result['metric_scores'])
        self.assertIn('fcf_yield', result['metric_scores'])
        self.assertNotIn('pb_ratio', result['metric_scores'])
        self.assertNotIn('debt_to_equity', result['metric_scores'])

    def test_analyze_empty_data(self):
        """Test that the agent handles empty data gracefully."""
        result = self.agent.analyze({})
        # Check that the analysis returns None for empty data
        self.assertIsNone(result)

    def test_analyze_none_data(self):
        """Test that the agent handles None data gracefully."""
        result = self.agent.analyze(None)
        # Check that the analysis returns None for None data
        self.assertIsNone(result)

    # patch.object targets the class directly; a string target such as
    # 'SimpleValueInvestingAgent.fetch_data' would be read as a module path and fail
    @patch.object(SimpleValueInvestingAgent, 'fetch_data')
    def test_fetch_data(self, mock_fetch):
        """Test that the agent fetches data correctly."""
        # Mock the fetch_data method to return our test data
        mock_fetch.return_value = self.value_stock_data
        # Call the method
        data = self.agent.fetch_data('VALUE')
        # Check that the data was fetched
        self.assertEqual(data, self.value_stock_data)
        # Check that the method was called with the correct ticker
        mock_fetch.assert_called_once_with('VALUE')


if __name__ == '__main__':
    unittest.main()
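When choosing score thresholds for assertions like those above, it helps to work the scoring rule by hand. The sketch below restates the two formulas from `analyze` as standalone functions for non-negative inputs; the function names are illustrative and are not part of the agent.

```python
def lower_is_better_score(value, max_value, weight):
    """Mirror of the 'lower' branch: full weight at 0, no credit above max_value."""
    if value > max_value:
        return 0.0
    return weight * (1 - value / max_value)

def higher_is_better_score(value, min_value, weight):
    """Mirror of the 'higher' branch: no credit below min_value, capped at full weight."""
    if value < min_value:
        return 0.0
    return weight * min(1, (value - min_value) / (min_value * 2))

# A P/E of 10 against a maximum of 15 earns one third of the 0.15 weight
print(round(lower_is_better_score(10.0, 15, 0.15), 4))     # 0.05
# An ROE of 20% against a 15% minimum earns one sixth of the 0.15 weight
print(round(higher_is_better_score(0.20, 0.15, 0.15), 4))  # 0.025
```

Note the consequence of this rule: a "higher is better" metric sitting exactly at its minimum scores zero and reaches full weight only at three times the minimum, while a "lower is better" metric sitting exactly at its maximum also scores zero. A company that merely clears every threshold therefore lands far below a "Buy" rating.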
Backtesting with Historical Data
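One of the most important ways to test a value investing agent is to see how it would have performed in the past. This is called backtesting. Note that the example below generates synthetic fundamentals as a placeholder for real historical financial statements, so its output illustrates the mechanics of a backtest rather than real performance: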
# backtest_value_agent.py
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from datetime import datetime, timedelta
import os

# Import your agent class
# In a real test, you would import your actual agent class
# For this example, we'll define a simplified version
class SimpleValueInvestingAgent:
    def __init__(self):
        self.criteria = {
            'pe_ratio': {'max': 15, 'weight': 0.15, 'better': 'lower'},
            'pb_ratio': {'max': 3, 'weight': 0.15, 'better': 'lower'},
            'roe': {'min': 0.15, 'weight': 0.15, 'better': 'higher'},
            'debt_to_equity': {'max': 1.0, 'weight': 0.1, 'better': 'lower'},
            'fcf_yield': {'min': 0.02, 'weight': 0.15, 'better': 'higher'},
            'dividend_yield': {'min': 0.01, 'weight': 0.1, 'better': 'higher'},
            'earnings_growth': {'min': 0.05, 'weight': 0.1, 'better': 'higher'},
            'margin_of_safety': {'min': 0.2, 'weight': 0.1, 'better': 'higher'}
        }

    def analyze_historical(self, financial_data):
        """Analyze historical financial data."""
        if not financial_data:
            return None
        results = {
            'company_name': financial_data.get('name', 'Unknown Company'),
            'ticker': financial_data.get('ticker', 'Unknown'),
            'date': financial_data.get('date', 'Unknown'),
            'total_score': 0,
            'max_possible_score': sum(criterion['weight'] for criterion in self.criteria.values()),
            'metric_scores': {},
        }
        # Calculate scores for each metric (missing or NaN metrics are skipped)
        for metric_name, criterion in self.criteria.items():
            if metric_name in financial_data and not pd.isna(financial_data[metric_name]):
                value = financial_data[metric_name]
                score = 0
                if criterion['better'] == 'lower' and 'max' in criterion:
                    # Negative ratios (e.g. a negative P/E from losses) earn no credit
                    if 0 <= value <= criterion['max']:
                        score = criterion['weight'] * (1 - value / criterion['max'])
                elif criterion['better'] == 'higher' and 'min' in criterion:
                    if value >= criterion['min']:
                        score = criterion['weight'] * min(1, (value - criterion['min']) / (criterion['min'] * 2))
                results['metric_scores'][metric_name] = score
                results['total_score'] += score
        # Calculate percentage score
        if results['max_possible_score'] > 0:
            results['percentage_score'] = (results['total_score'] / results['max_possible_score']) * 100
        else:
            results['percentage_score'] = 0
        # Generate recommendation
        score = results['percentage_score']
        if score >= 70:
            results['rating'] = "Strong Buy"
        elif score >= 60:
            results['rating'] = "Buy"
        elif score >= 40:
            results['rating'] = "Hold"
        elif score >= 30:
            results['rating'] = "Sell"
        else:
            results['rating'] = "Strong Sell"
        return results
class BacktestEngine:
    """Engine for backtesting a value investing agent."""

    def __init__(self, agent, start_date, end_date, rebalance_period='quarterly'):
        """
        Initialize the backtest engine.

        Parameters:
        -----------
        agent : ValueInvestingAgent
            The value investing agent to backtest
        start_date : str
            Start date for the backtest (format: 'YYYY-MM-DD')
        end_date : str
            End date for the backtest (format: 'YYYY-MM-DD')
        rebalance_period : str, optional
            How often to rebalance the portfolio ('monthly', 'quarterly', 'annually')
        """
        self.agent = agent
        self.start_date = start_date
        self.end_date = end_date
        self.rebalance_period = rebalance_period
        # Define rebalance frequency in months
        if rebalance_period == 'monthly':
            self.rebalance_months = 1
        elif rebalance_period == 'quarterly':
            self.rebalance_months = 3
        elif rebalance_period == 'annually':
            self.rebalance_months = 12
        else:
            raise ValueError("rebalance_period must be 'monthly', 'quarterly', or 'annually'")
    def get_historical_financial_data(self, ticker, date):
        """
        Get historical financial data for a ticker at a specific date.

        In a real implementation, this would fetch historical financial statements.
        For this example, we'll use a simplified approach with random data.
        """
        # This is a placeholder. In a real implementation, you would:
        # 1. Fetch historical financial statements from a database or API
        # 2. Calculate the financial metrics as they were at that point in time
        # For demonstration, we'll generate synthetic data
        as_of = datetime.strptime(date, '%Y-%m-%d')
        # Seed from both the date and the ticker so each stock gets its own
        # reproducible values; a date-only seed would give every ticker
        # identical fundamentals on a given rebalance date
        ticker_offset = sum((i + 1) * ord(c) for i, c in enumerate(ticker))
        np.random.seed((int(as_of.timestamp()) + ticker_offset) % (2 ** 32))
        # Base values that change over time
        base_pe = 15 + np.random.normal(0, 3)
        base_pb = 2 + np.random.normal(0, 0.5)
        base_roe = 0.15 + np.random.normal(0, 0.03)
        base_de = 0.8 + np.random.normal(0, 0.2)
        base_fcf = 0.03 + np.random.normal(0, 0.01)
        base_div = 0.02 + np.random.normal(0, 0.005)
        base_growth = 0.06 + np.random.normal(0, 0.02)
        # Get stock price at that date
        try:
            stock = yf.Ticker(ticker)
            hist = stock.history(start=date, end=(as_of + timedelta(days=5)).strftime('%Y-%m-%d'))
            if not hist.empty:
                price = hist.iloc[0]['Close']
                # Use an explicit one-year window ending at `date` so that no
                # prices from after the analysis date leak into the calculation
                year_ago = (as_of - timedelta(days=365)).strftime('%Y-%m-%d')
                high_52w = stock.history(start=year_ago, end=date)['High'].max()
                # Discount from the 52-week high, used here as a crude stand-in
                # for a margin of safety
                margin_of_safety = (high_52w - price) / high_52w
            else:
                price = 100
                margin_of_safety = 0.1
        except Exception:
            # Fall back to neutral defaults if the price lookup fails
            price = 100
            margin_of_safety = 0.1
        return {
            'ticker': ticker,
            'name': f"{ticker} Inc.",
            'date': date,
            'price': price,
            'pe_ratio': base_pe,
            'pb_ratio': base_pb,
            'roe': base_roe,
            'debt_to_equity': base_de,
            'fcf_yield': base_fcf,
            'dividend_yield': base_div,
            'earnings_growth': base_growth,
            'margin_of_safety': margin_of_safety
        }
    def get_stock_returns(self, ticker, start_date, end_date):
        """Get stock returns between two dates."""
        try:
            stock = yf.Ticker(ticker)
            # auto_adjust=False keeps 'Close' as the unadjusted price; with
            # dividend-adjusted prices, adding dividends below would count them twice
            hist = stock.history(start=start_date, end=end_date, auto_adjust=False)
            if hist.empty:
                return 0
            start_price = hist.iloc[0]['Close']
            end_price = hist.iloc[-1]['Close']
            # Price return over the period
            total_return = (end_price / start_price) - 1
            # Add dividend returns
            dividends = hist['Dividends'].sum()
            if dividends > 0:
                dividend_return = dividends / start_price
                total_return += dividend_return
            return total_return
        except Exception as e:
            print(f"Error getting returns for {ticker}: {e}")
            return 0
    def generate_rebalance_dates(self):
        """Generate dates for portfolio rebalancing."""
        start = datetime.strptime(self.start_date, '%Y-%m-%d')
        end = datetime.strptime(self.end_date, '%Y-%m-%d')
        dates = []
        current = start
        while current <= end:
            dates.append(current.strftime('%Y-%m-%d'))
            # Move to next rebalance date
            year = current.year + ((current.month - 1 + self.rebalance_months) // 12)
            month = ((current.month - 1 + self.rebalance_months) % 12) + 1
            current = datetime(year, month, min(current.day, 28))
        return dates
    def run_backtest(self, universe, top_n=5):
        """
        Run the backtest.

        Parameters:
        -----------
        universe : list
            List of ticker symbols to consider for the portfolio
        top_n : int, optional
            Number of top-rated stocks to include in the portfolio

        Returns:
        --------
        dict
            Backtest results
        """
        # Generate rebalance dates
        rebalance_dates = self.generate_rebalance_dates()
        # Initialize results
        portfolio_values = [1.0]  # Start with $1
        benchmark_values = [1.0]  # Start with $1
        dates = [self.start_date]
        holdings = []
        # Run the backtest
        for i in range(len(rebalance_dates) - 1):
            current_date = rebalance_dates[i]
            next_date = rebalance_dates[i + 1]
            print(f"Analyzing period: {current_date} to {next_date}")
            # Analyze each stock in the universe
            stock_analyses = []
            for ticker in universe:
                financial_data = self.get_historical_financial_data(ticker, current_date)
                analysis = self.agent.analyze_historical(financial_data)
                if analysis:
                    stock_analyses.append(analysis)
            # Sort by value score
            stock_analyses.sort(key=lambda x: x['percentage_score'], reverse=True)
            # Select top N stocks
            selected_stocks = stock_analyses[:top_n]
            # Record holdings
            holdings.append({
                'date': current_date,
                'stocks': [{'ticker': s['ticker'], 'rating': s['rating'], 'score': s['percentage_score']} for s in selected_stocks]
            })
            # Calculate returns for the period
            portfolio_return = 0
            for stock in selected_stocks:
                stock_return = self.get_stock_returns(stock['ticker'], current_date, next_date)
                portfolio_return += stock_return / len(selected_stocks)  # Equal weighting
            # Calculate benchmark return (S&P 500)
            benchmark_return = self.get_stock_returns('SPY', current_date, next_date)
            # Update portfolio and benchmark values
            portfolio_values.append(portfolio_values[-1] * (1 + portfolio_return))
            benchmark_values.append(benchmark_values[-1] * (1 + benchmark_return))
            dates.append(next_date)
            print(f"Period return: Portfolio: {portfolio_return:.2%}, Benchmark: {benchmark_return:.2%}")
        # Calculate performance metrics
        total_portfolio_return = portfolio_values[-1] - 1
        total_benchmark_return = benchmark_values[-1] - 1
        # Calculate annualized returns
        years = (datetime.strptime(self.end_date, '%Y-%m-%d') - datetime.strptime(self.start_date, '%Y-%m-%d')).days / 365.25
        annualized_portfolio_return = (1 + total_portfolio_return) ** (1 / years) - 1
        annualized_benchmark_return = (1 + total_benchmark_return) ** (1 / years) - 1
        # Calculate excess return
        excess_return = annualized_portfolio_return - annualized_benchmark_return
        # Calculate drawdowns
        portfolio_drawdowns = []
        benchmark_drawdowns = []
        portfolio_peak = portfolio_values[0]
        benchmark_peak = benchmark_values[0]
        for i in range(len(portfolio_values)):
            if portfolio_values[i] > portfolio_peak:
                portfolio_peak = portfolio_values[i]
            if benchmark_values[i] > benchmark_peak:
                benchmark_peak = benchmark_values[i]
            portfolio_drawdown = (portfolio_values[i] - portfolio_peak) / portfolio_peak
            benchmark_drawdown = (benchmark_values[i] - benchmark_peak) / benchmark_peak
            portfolio_drawdowns.append(portfolio_drawdown)
            benchmark_drawdowns.append(benchmark_drawdown)
        max_portfolio_drawdown = min(portfolio_drawdowns)
        max_benchmark_drawdown = min(benchmark_drawdowns)
        # Compile results
        results = {
            'dates': dates,
            'portfolio_values': portfolio_values,
            'benchmark_values': benchmark_values,
            'holdings': holdings,
            'total_portfolio_return': total_portfolio_return,
            'total_benchmark_return': total_benchmark_return,
            'annualized_portfolio_return': annualized_portfolio_return,
            'annualized_benchmark_return': annualized_benchmark_return,
            'excess_return': excess_return,
            'max_portfolio_drawdown': max_portfolio_drawdown,
            'max_benchmark_drawdown': max_benchmark_drawdown
        }
        return results
    def plot_results(self, results):
        """Plot backtest results."""
        plt.figure(figsize=(12, 8))
        # Plot portfolio vs benchmark
        plt.subplot(2, 1, 1)
        plt.plot(results['dates'], results['portfolio_values'], label='Value Portfolio')
        plt.plot(results['dates'], results['benchmark_values'], label='S&P 500')
        plt.title('Portfolio Performance')
        plt.xlabel('Date')
        plt.ylabel('Value ($)')
        plt.legend()
        plt.grid(True)
        # Add performance metrics as text
        plt.figtext(0.15, 0.85, f"Total Return: {results['total_portfolio_return']:.2%} vs {results['total_benchmark_return']:.2%} (S&P 500)", fontsize=12)
        plt.figtext(0.15, 0.82, f"Annualized Return: {results['annualized_portfolio_return']:.2%} vs {results['annualized_benchmark_return']:.2%} (S&P 500)", fontsize=12)
        plt.figtext(0.15, 0.79, f"Excess Return: {results['excess_return']:.2%}", fontsize=12)
        plt.figtext(0.15, 0.76, f"Max Drawdown: {results['max_portfolio_drawdown']:.2%} vs {results['max_benchmark_drawdown']:.2%} (S&P 500)", fontsize=12)
        # Plot holdings over time
        plt.subplot(2, 1, 2)
        # Extract holdings data
        holding_dates = [h['date'] for h in results['holdings']]
        tickers = set()
        for h in results['holdings']:
            for s in h['stocks']:
                tickers.add(s['ticker'])
        # Create a matrix of holdings (rows: rebalance dates, columns: tickers)
        tickers = sorted(list(tickers))
        holdings_matrix = np.zeros((len(holding_dates), len(tickers)))
        for i, h in enumerate(results['holdings']):
            for s in h['stocks']:
                if s['ticker'] in tickers:
                    j = tickers.index(s['ticker'])
                    holdings_matrix[i, j] = 1
        plt.imshow(holdings_matrix, aspect='auto', cmap='Blues')
        plt.yticks(range(len(holding_dates)), holding_dates)
        plt.xticks(range(len(tickers)), tickers, rotation=90)
        plt.title('Portfolio Holdings Over Time')
        plt.xlabel('Stock')
        plt.ylabel('Rebalance Date')
        plt.colorbar(label='Holding Weight')
        plt.tight_layout()
        plt.savefig('backtest_results.png')
        plt.close()
        print("Backtest results plot saved as 'backtest_results.png'")
# Example usage
if __name__ == "__main__":
    # Create the agent
    agent = SimpleValueInvestingAgent()
    # Create the backtest engine
    backtest = BacktestEngine(
        agent=agent,
        start_date='2018-01-01',
        end_date='2023-01-01',
        rebalance_period='quarterly'
    )
    # Define universe of stocks to consider
    universe = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'META', 'BRK-B', 'JNJ', 'JPM', 'V', 'PG',
                'UNH', 'HD', 'BAC', 'XOM', 'NVDA', 'DIS', 'ADBE', 'CRM', 'NFLX', 'CSCO']
    # Run the backtest
    results = backtest.run_backtest(universe=universe, top_n=5)
    # Plot the results
    backtest.plot_results(results)
    # Print summary
    print("\nBacktest Summary:")
    print(f"Period: {backtest.start_date} to {backtest.end_date}")
    print(f"Rebalance Frequency: {backtest.rebalance_period}")
    print(f"Total Return: {results['total_portfolio_return']:.2%} vs {results['total_benchmark_return']:.2%} (S&P 500)")
    print(f"Annualized Return: {results['annualized_portfolio_return']:.2%} vs {results['annualized_benchmark_return']:.2%} (S&P 500)")
    print(f"Excess Return: {results['excess_return']:.2%}")
    print(f"Max Drawdown: {results['max_portfolio_drawdown']:.2%} vs {results['max_benchmark_drawdown']:.2%} (S&P 500)")
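The two summary metrics that `run_backtest` reports, annualized return and maximum drawdown, are easy to verify by hand on a short series. The sketch below restates them as standalone helpers; the function names and the sample values are illustrative assumptions, not output from the backtest.

```python
def annualized_return(total_return, years):
    """Convert a cumulative return into a compound annual growth rate."""
    return (1 + total_return) ** (1 / years) - 1

def max_drawdown(values):
    """Largest peak-to-trough decline, expressed as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # highest value seen so far
        worst = min(worst, (v - peak) / peak)  # decline from that peak
    return worst

# Hypothetical portfolio values over two years
values = [1.0, 1.2, 0.9, 1.21]
print(f"{annualized_return(values[-1] - 1, 2):.2%}")  # 10.00%
print(f"{max_drawdown(values):.2%}")                  # -25.00%
```

Here a 21% total return over two years compounds to 10% per year, and the fall from 1.2 to 0.9 is a 25% drawdown even though the series ends at a new high.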
Optimizing Your Agent
Based on your testing results, you'll likely want to optimize your value investing agent. Here are some approaches to optimization:
Optimization Strategies
Consider these strategies for improving your value investing AI agent:
1. Criteria Tuning
Adjust your value investing criteria based on backtest results:
# Example of criteria tuning
def tune_criteria(agent, universe, start_date, end_date):
    """Tune criteria weights to optimize performance."""
    best_return = -float('inf')
    best_weights = None
    # Define weight combinations to test
    weight_options = [0.05, 0.1, 0.15, 0.2, 0.25]
    # Test different weight combinations
    for pe_weight in weight_options:
        for pb_weight in weight_options:
            for roe_weight in weight_options:
                # Ensure weights sum to 1.0: split what is left over the other five criteria
                remaining_weight = 1.0 - (pe_weight + pb_weight + roe_weight)
                if remaining_weight <= 0:
                    continue
                # Update agent criteria weights
                agent.criteria['pe_ratio']['weight'] = pe_weight
                agent.criteria['pb_ratio']['weight'] = pb_weight
                agent.criteria['roe']['weight'] = roe_weight
                agent.criteria['debt_to_equity']['weight'] = remaining_weight / 5
                agent.criteria['fcf_yield']['weight'] = remaining_weight / 5
                agent.criteria['dividend_yield']['weight'] = remaining_weight / 5
                agent.criteria['earnings_growth']['weight'] = remaining_weight / 5
                agent.criteria['margin_of_safety']['weight'] = remaining_weight / 5
                # Run backtest with these weights
                backtest = BacktestEngine(agent, start_date, end_date)
                results = backtest.run_backtest(universe)
                # Check if this is the best performance so far
                if results['excess_return'] > best_return:
                    best_return = results['excess_return']
                    best_weights = {
                        'pe_ratio': pe_weight,
                        'pb_ratio': pb_weight,
                        'roe': roe_weight,
                        'debt_to_equity': remaining_weight / 5,
                        'fcf_yield': remaining_weight / 5,
                        'dividend_yield': remaining_weight / 5,
                        'earnings_growth': remaining_weight / 5,
                        'margin_of_safety': remaining_weight / 5
                    }
    return best_weights, best_return
2. Threshold Optimization
Fine-tune the thresholds for each criterion:
# Example of threshold optimization
def optimize_thresholds(agent, universe, start_date, end_date):
    """Optimize criteria thresholds."""
    best_return = -float('inf')
    best_thresholds = None
    # Define threshold options to test
    pe_options = [10, 15, 20, 25]
    pb_options = [1, 2, 3, 4]
    roe_options = [0.1, 0.15, 0.2, 0.25]
    # Test different threshold combinations
    for pe_max in pe_options:
        for pb_max in pb_options:
            for roe_min in roe_options:
                # Update agent criteria thresholds
                agent.criteria['pe_ratio']['max'] = pe_max
                agent.criteria['pb_ratio']['max'] = pb_max
                agent.criteria['roe']['min'] = roe_min
                # Run backtest with these thresholds
                backtest = BacktestEngine(agent, start_date, end_date)
                results = backtest.run_backtest(universe)
                # Check if this is the best performance so far
                if results['excess_return'] > best_return:
                    best_return = results['excess_return']
                    best_thresholds = {
                        'pe_ratio_max': pe_max,
                        'pb_ratio_max': pb_max,
                        'roe_min': roe_min
                    }
    return best_thresholds, best_return
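The nested loops in both tuning functions follow the same grid-search pattern, which can be expressed once with `itertools.product`. This is a generic sketch rather than a replacement for the functions above: `grid_search` and `toy_objective` are hypothetical names, and the toy objective merely stands in for "apply these parameters to the agent, run a backtest, and return the excess return".

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float('-inf')
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Stand-in objective (an assumption for this demo): peaks at pe_max=20, pb_max=3
def toy_objective(params):
    return -abs(params['pe_max'] - 20) - abs(params['pb_max'] - 3)

grid = {'pe_max': [10, 15, 20, 25], 'pb_max': [1, 2, 3, 4]}
print(grid_search(grid, toy_objective))  # ({'pe_max': 20, 'pb_max': 3}, 0)
```

Keep the grid size in mind: the threshold grid above has 4 × 4 × 4 = 64 combinations and the weight grid has 5 × 5 × 5 = 125, and each combination requires a full backtest.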
3. Sector-Specific Adjustments
Customize criteria for different market sectors:
# Example of sector-specific criteria
sector_criteria = {
    'Technology': {
        'pe_ratio': {'max': 25, 'weight': 0.15},  # Higher P/E acceptable for tech
        'pb_ratio': {'max': 5, 'weight': 0.15},   # Higher P/B acceptable for tech
        'roe': {'min': 0.2, 'weight': 0.2},       # Higher ROE expected for tech
        # Other criteria...
    },
    'Financial': {
        'pe_ratio': {'max': 12, 'weight': 0.1},   # Lower P/E expected for financials
        'pb_ratio': {'max': 1.5, 'weight': 0.2},  # P/B more important for financials
        'roe': {'min': 0.12, 'weight': 0.15},     # Different ROE expectations
        # Other criteria...
    },
    # Other sectors...
}

# Modify agent to use sector-specific criteria
def analyze_with_sector(financial_data, default_criteria):
    sector = financial_data.get('sector', 'Unknown')
    # Fall back to the default criteria for sectors without an override
    criteria = sector_criteria.get(sector, default_criteria)
    # Proceed with analysis using the selected criteria (scoring omitted here)
    return criteria
4. Performance Optimization
Improve the technical performance of your agent:
- Caching: Cache API responses to reduce redundant calls
- Parallel Processing: Use multiprocessing for analyzing multiple stocks
- Batch Processing: Process stocks in batches to optimize API usage
- Error Handling: Improve robustness with better error handling
# Example of implementing caching
import functools
import time

# Simple time-based cache decorator
def cache_with_timeout(timeout_seconds=3600):
    """Cache function results with a timeout."""
    cache = {}

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = str(args) + str(kwargs)
            current_time = time.time()
            # Check if result is in cache and not expired
            if key in cache and current_time - cache[key]['timestamp'] < timeout_seconds:
                return cache[key]['result']
            # Call the function and cache the result
            result = func(*args, **kwargs)
            cache[key] = {
                'result': result,
                'timestamp': current_time
            }
            return result
        return wrapper
    return decorator

# Apply to data fetching method
@cache_with_timeout(timeout_seconds=3600)  # Cache for 1 hour
def fetch_data(ticker):
    # Fetch data from API...
    pass
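The parallel-processing item in the list above can be sketched in a similar way. Note that this example uses a thread pool from `concurrent.futures` rather than the `multiprocessing` module: fetching data is usually dominated by waiting on network calls, where threads are typically sufficient. `analyze_in_parallel` and `fake_analyze` are hypothetical names, and `fake_analyze` stands in for your agent's real per-ticker fetch-and-analyze function.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_in_parallel(tickers, analyze_one, max_workers=4):
    """Analyze tickers concurrently; results are keyed by ticker in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input tickers
        return dict(zip(tickers, pool.map(analyze_one, tickers)))

# Stand-in for a real fetch-and-analyze function (an assumption for this demo)
def fake_analyze(ticker):
    return {'ticker': ticker, 'score': len(ticker) * 10}

print(analyze_in_parallel(['AAPL', 'JPM', 'V'], fake_analyze))
```

Keep `max_workers` modest if your data provider enforces rate limits, and combine this with caching so that repeated runs do not refetch the same data.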
A/B Testing Different Versions
Once you've developed multiple versions of your agent through optimization, you can compare them using A/B testing:
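# ab_test_agents.py
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import yfinance as yf
# Reuse the engine and scoring logic from the backtesting script above
# (assumes it is saved as backtest_value_agent.py in the same directory)
from backtest_value_agent import BacktestEngine, SimpleValueInvestingAgent

# Define different agent versions
class BaseValueAgent(SimpleValueInvestingAgent):
    """Base value investing agent with original criteria."""

    def __init__(self):
        super().__init__()
        self.name = "Base Agent"
        self.criteria = {
            'pe_ratio': {'max': 15, 'weight': 0.15, 'better': 'lower'},
            'pb_ratio': {'max': 3, 'weight': 0.15, 'better': 'lower'},
            'roe': {'min': 0.15, 'weight': 0.15, 'better': 'higher'},
            'debt_to_equity': {'max': 1.0, 'weight': 0.1, 'better': 'lower'},
            'fcf_yield': {'min': 0.02, 'weight': 0.15, 'better': 'higher'},
            'dividend_yield': {'min': 0.01, 'weight': 0.1, 'better': 'higher'},
            'earnings_growth': {'min': 0.05, 'weight': 0.1, 'better': 'higher'},
            'margin_of_safety': {'min': 0.2, 'weight': 0.1, 'better': 'higher'}
        }

    def analyze(self, financial_data):
        """Analyze company data using value investing criteria."""
        # Implementation as before: delegate to the scoring logic inherited
        # from SimpleValueInvestingAgent, which reads self.criteria
        return super().analyze_historical(financial_data)

    def analyze_historical(self, financial_data):
        """Entry point called by BacktestEngine; routes through analyze() so subclass overrides apply."""
        return self.analyze(financial_data)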
class OptimizedWeightsAgent(BaseValueAgent):
    """Agent with optimized criteria weights."""

    def __init__(self):
        super().__init__()
        self.name = "Optimized Weights Agent"
        # Update weights based on optimization results
        self.criteria['pe_ratio']['weight'] = 0.2
        self.criteria['pb_ratio']['weight'] = 0.1
        self.criteria['roe']['weight'] = 0.2
        self.criteria['debt_to_equity']['weight'] = 0.05
        self.criteria['fcf_yield']['weight'] = 0.2
        self.criteria['dividend_yield']['weight'] = 0.05
        self.criteria['earnings_growth']['weight'] = 0.1
        self.criteria['margin_of_safety']['weight'] = 0.1


class OptimizedThresholdsAgent(BaseValueAgent):
    """Agent with optimized criteria thresholds."""

    def __init__(self):
        super().__init__()
        self.name = "Optimized Thresholds Agent"
        # Update thresholds based on optimization results
        self.criteria['pe_ratio']['max'] = 20
        self.criteria['pb_ratio']['max'] = 4
        self.criteria['roe']['min'] = 0.12
        self.criteria['debt_to_equity']['max'] = 1.2
        self.criteria['fcf_yield']['min'] = 0.015
        self.criteria['dividend_yield']['min'] = 0.005
        self.criteria['earnings_growth']['min'] = 0.04
        self.criteria['margin_of_safety']['min'] = 0.15
class SectorSpecificAgent(BaseValueAgent):
    """Agent with sector-specific criteria."""

    def __init__(self):
        super().__init__()
        self.name = "Sector-Specific Agent"
        # Define sector-specific criteria
        self.sector_criteria = {
            'Technology': {
                'pe_ratio': {'max': 25, 'weight': 0.15, 'better': 'lower'},
                'pb_ratio': {'max': 5, 'weight': 0.15, 'better': 'lower'},
                'roe': {'min': 0.2, 'weight': 0.2, 'better': 'higher'},
                'debt_to_equity': {'max': 1.2, 'weight': 0.05, 'better': 'lower'},
                'fcf_yield': {'min': 0.015, 'weight': 0.2, 'better': 'higher'},
                'dividend_yield': {'min': 0.005, 'weight': 0.05, 'better': 'higher'},
                'earnings_growth': {'min': 0.08, 'weight': 0.1, 'better': 'higher'},
                'margin_of_safety': {'min': 0.15, 'weight': 0.1, 'better': 'higher'}
            },
            'Financial': {
                'pe_ratio': {'max': 12, 'weight': 0.1, 'better': 'lower'},
                'pb_ratio': {'max': 1.5, 'weight': 0.2, 'better': 'lower'},
                'roe': {'min': 0.12, 'weight': 0.15, 'better': 'higher'},
                'debt_to_equity': {'max': 5.0, 'weight': 0.05, 'better': 'lower'},
                'fcf_yield': {'min': 0.03, 'weight': 0.15, 'better': 'higher'},
                'dividend_yield': {'min': 0.02, 'weight': 0.15, 'better': 'higher'},
                'earnings_growth': {'min': 0.04, 'weight': 0.1, 'better': 'higher'},
                'margin_of_safety': {'min': 0.2, 'weight': 0.1, 'better': 'higher'}
            },
            # Other sectors...
        }

    def analyze(self, financial_data):
        """Analyze company data using sector-specific criteria."""
        # Tolerate None or empty input; the base analysis handles it
        sector = (financial_data or {}).get('sector', 'Unknown')
        # Use sector-specific criteria if available, otherwise use default
        if sector in self.sector_criteria:
            original_criteria = self.criteria
            self.criteria = self.sector_criteria[sector]
            try:
                return super().analyze(financial_data)
            finally:
                # Always restore the default criteria, even if the analysis raises
                self.criteria = original_criteria
        else:
            return super().analyze(financial_data)
def run_ab_test(agents, universe, start_date, end_date, rebalance_period='quarterly', top_n=5):
    """
    Run A/B test comparing multiple agent versions.

    Parameters:
    -----------
    agents : list
        List of agent instances to compare
    universe : list
        List of ticker symbols to consider
    start_date : str
        Start date for the test
    end_date : str
        End date for the test
    rebalance_period : str, optional
        Rebalance frequency
    top_n : int, optional
        Number of stocks to include in each portfolio

    Returns:
    --------
    dict
        Test results
    """
    results = {}
    for agent in agents:
        print(f"\nTesting {agent.name}...")
        # Create backtest engine for this agent
        backtest = BacktestEngine(
            agent=agent,
            start_date=start_date,
            end_date=end_date,
            rebalance_period=rebalance_period
        )
        # Run backtest
        agent_results = backtest.run_backtest(universe=universe, top_n=top_n)
        # Store results
        results[agent.name] = agent_results
    return results
def plot_ab_test_results(results, start_date, end_date):
    """Plot A/B test results."""
    plt.figure(figsize=(12, 10))
    # Plot portfolio values
    plt.subplot(2, 1, 1)
    # Get the first agent's dates for x-axis
    first_agent = list(results.keys())[0]
    dates = results[first_agent]['dates']
    # Plot each agent's portfolio value
    for agent_name, agent_results in results.items():
        plt.plot(dates, agent_results['portfolio_values'], label=agent_name)
    # Plot benchmark (S&P 500)
    plt.plot(dates, results[first_agent]['benchmark_values'], label='S&P 500', linestyle='--')
    plt.title('Portfolio Performance Comparison')
    plt.xlabel('Date')
    plt.ylabel('Value ($)')
    plt.legend()
    plt.grid(True)
    # Plot performance metrics
    plt.subplot(2, 1, 2)
    # Extract metrics
    agent_names = list(results.keys())
    total_returns = [results[name]['total_portfolio_return'] for name in agent_names]
    annualized_returns = [results[name]['annualized_portfolio_return'] for name in agent_names]
    excess_returns = [results[name]['excess_return'] for name in agent_names]
    max_drawdowns = [results[name]['max_portfolio_drawdown'] for name in agent_names]
    # Add benchmark
    agent_names.append('S&P 500')
    total_returns.append(results[first_agent]['total_benchmark_return'])
    annualized_returns.append(results[first_agent]['annualized_benchmark_return'])
    excess_returns.append(0)  # Benchmark excess return is 0 by definition
    max_drawdowns.append(results[first_agent]['max_benchmark_drawdown'])
    # Create bar chart
    x = np.arange(len(agent_names))
    width = 0.2
    plt.bar(x - width*1.5, [r*100 for r in total_returns], width, label='Total Return (%)')
    plt.bar(x - width/2, [r*100 for r in annualized_returns], width, label='Annualized Return (%)')
    plt.bar(x + width/2, [r*100 for r in excess_returns], width, label='Excess Return (%)')
    plt.bar(x + width*1.5, [r*100 for r in max_drawdowns], width, label='Max Drawdown (%)')
    plt.xlabel('Agent')
    plt.ylabel('Percentage (%)')
    plt.title('Performance Metrics Comparison')
    plt.xticks(x, agent_names, rotation=45)
    plt.legend()
    plt.grid(True, axis='y')
    plt.tight_layout()
    plt.savefig('ab_test_results.png')
    plt.close()
    print("A/B test results plot saved as 'ab_test_results.png'")
    # Create a summary table
    summary = pd.DataFrame({
        'Agent': agent_names,
        'Total Return (%)': [r*100 for r in total_returns],
        'Annualized Return (%)': [r*100 for r in annualized_returns],
        'Excess Return (%)': [r*100 for r in excess_returns],
        'Max Drawdown (%)': [r*100 for r in max_drawdowns]
    })
    # Sort by annualized return
    summary = summary.sort_values('Annualized Return (%)', ascending=False)
    # Save to CSV
    summary.to_csv('ab_test_summary.csv', index=False)
    print("A/B test summary saved as 'ab_test_summary.csv'")
    return summary
# Example usage
if __name__ == "__main__":
    # Create agent instances
    agents = [
        BaseValueAgent(),
        OptimizedWeightsAgent(),
        OptimizedThresholdsAgent(),
        SectorSpecificAgent()
    ]
    # Define universe of stocks
    universe = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'META', 'BRK-B', 'JNJ', 'JPM', 'V', 'PG',
                'UNH', 'HD', 'BAC', 'XOM', 'NVDA', 'DIS', 'ADBE', 'CRM', 'NFLX', 'CSCO']
    # Run A/B test
    results = run_ab_test(
        agents=agents,
        universe=universe,
        start_date='2018-01-01',
        end_date='2023-01-01',
        rebalance_period='quarterly',
        top_n=5
    )
    # Plot and summarize results
    summary = plot_ab_test_results(results, '2018-01-01', '2023-01-01')
    # Print the winner (exclude the benchmark row, which is not an agent)
    agent_rows = summary[summary['Agent'] != 'S&P 500']
    winner = agent_rows.iloc[0]['Agent']
    winner_return = agent_rows.iloc[0]['Annualized Return (%)']
    print(f"\nThe best performing agent is: {winner} with an annualized return of {winner_return:.2f}%")
Knowledge Check
What is the primary purpose of backtesting a value investing AI agent?
Which of the following is NOT a common approach to optimizing a value investing AI agent?