How to Build a Multi-Agent System with Amazon Bedrock Agents in 2026
TL;DR
- Build a three-agent system where a supervisor coordinates specialized research and writing agents
- Deploy agents on AWS Lambda using Amazon Bedrock with Claude 3.5 Sonnet, called directly through the Bedrock runtime InvokeModel API
- Implement agent-to-agent communication with shared state via DynamoDB
- By the end, you’ll have a working system that researches topics and generates reports automatically
Prerequisites
Before starting, ensure you have:
Required:
- AWS account with Bedrock access enabled in us-east-1 or us-west-2
- AWS CLI v2.15+ configured with credentials (aws configure)
- Python 3.11+ installed locally
- boto3 1.34.50+ (pip install "boto3>=1.34.50"; the quotes stop the shell from treating >= as a redirect)
- Model access to Claude 3.5 Sonnet in the Bedrock console (request if needed)
Knowledge:
- Basic Python and async programming
- Familiarity with AWS Lambda and IAM roles
- Understanding of REST APIs and JSON
Time: ~45 minutes
Cost: ~$2-5 for testing (Bedrock on-demand pricing, Lambda free tier eligible)
What We’re Building
We’re creating a multi-agent content generation system with three specialized agents:
- Supervisor Agent: Routes requests and coordinates other agents
- Research Agent: Searches information and compiles findings
- Writer Agent: Generates polished content from research
Architecture flow:
User Request → Supervisor Agent → Research Agent → Writer Agent → Final Output
                      ↓                  ↓               ↓
               DynamoDB (shared state and conversation history)
This pattern breaks a complex task into specialized subtasks, which can improve output quality and reliability compared to a single-agent approach. The same supervisor-and-workers structure is a common starting point for larger autonomous agent systems.
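The flow can be tried locally before any AWS resources exist. The following is a minimal simulation, assuming stub functions in place of the Bedrock-backed agents and an in-memory list in place of DynamoDB. The names run_pipeline, research_stub, and writer_stub are hypothetical and are not part of the code deployed later.

```python
import uuid


def research_stub(query):
    # Stand-in for the Research Agent: returns structured findings.
    return {"topic": query, "key_findings": [f"finding about {query}"]}


def writer_stub(findings):
    # Stand-in for the Writer Agent: turns findings into text.
    return f"Report on {findings['topic']}: " + "; ".join(findings["key_findings"])


def run_pipeline(query, state):
    """Supervisor role: call research, then writer, logging each step to shared state."""
    conversation_id = str(uuid.uuid4())
    state.append({"conversationId": conversation_id, "agent": "supervisor", "content": query})
    findings = research_stub(query)
    state.append({"conversationId": conversation_id, "agent": "research", "content": findings})
    report = writer_stub(findings)
    state.append({"conversationId": conversation_id, "agent": "writer", "content": report})
    return conversation_id, report


shared_state = []  # plays the role of the DynamoDB table
cid, report = run_pipeline("vector databases", shared_state)
print(report)
print([entry["agent"] for entry in shared_state])
```

Each stub can later be swapped for a real Lambda invocation without changing the supervisor's control flow.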
Step 1: Set Up Your AWS Environment
We need IAM roles and policies that allow our agents to invoke each other and access Bedrock models.
Create a new directory and set up the project:
mkdir bedrock-multi-agent
cd bedrock-multi-agent
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install boto3 python-dotenv
Create an IAM policy document for our Lambda functions. Save this as agent-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeAgent",
        "bedrock:CreateAgent",
        "bedrock:GetAgent"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:PutItem",
        "dynamodb:GetItem",
        "dynamodb:UpdateItem",
        "dynamodb:Query"
      ],
      "Resource": "arn:aws:dynamodb:*:*:table/AgentConversations"
    },
    {
      "Effect": "Allow",
      "Action": "lambda:InvokeFunction",
      "Resource": "arn:aws:lambda:*:*:function:*Agent"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
Create the IAM role:
aws iam create-role \
--role-name BedrockAgentRole \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": ["lambda.amazonaws.com", "bedrock.amazonaws.com"]},
"Action": "sts:AssumeRole"
}]
}'
aws iam put-role-policy \
--role-name BedrockAgentRole \
--policy-name BedrockAgentPolicy \
--policy-document file://agent-policy.json
Expected output: You’ll see JSON confirming role creation with an ARN like arn:aws:iam::123456789012:role/BedrockAgentRole. Save this ARN for later.
Step 2: Create the Shared State Table
Our agents communicate through DynamoDB. Create a table to store conversation state:
aws dynamodb create-table \
--table-name AgentConversations \
--attribute-definitions \
AttributeName=conversationId,AttributeType=S \
AttributeName=timestamp,AttributeType=N \
--key-schema \
AttributeName=conversationId,KeyType=HASH \
AttributeName=timestamp,KeyType=RANGE \
--billing-mode PAY_PER_REQUEST \
--region us-east-1
What this does: Creates a table with conversationId as the partition key and timestamp as the sort key. This allows multiple agents to append messages to the same conversation thread in order.
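One consequence of this key schema is worth spelling out: put_item replaces any existing item with the same partition and sort key, so two messages written to the same conversation in the same millisecond would overwrite each other. The sketch below shows one way to keep timestamps strictly increasing inside a single process. The class name MonotonicMillis and the helper build_item are hypothetical, and this does not protect against collisions between separate Lambda functions.

```python
import threading
import time


class MonotonicMillis:
    """Returns strictly increasing millisecond timestamps within one process."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._last = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            now = int(self._clock() * 1000)
            # If the clock has not advanced, bump by 1 ms to keep keys unique.
            self._last = max(now, self._last + 1)
            return self._last


def build_item(conversation_id, role, content, agent, ts):
    # Same attribute names as the items written in this tutorial.
    return {"conversationId": conversation_id, "timestamp": ts,
            "role": role, "content": content, "agent": agent}


stamps = MonotonicMillis(clock=lambda: 1700000000.0)  # frozen clock for the demo
items = [build_item("c1", "user", f"msg {i}", "research", stamps.next())
         for i in range(3)]
print([item["timestamp"] for item in items])
```

Even with a frozen clock, the three items receive distinct sort keys, so none of them would replace another.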
Verify the table is active:
aws dynamodb describe-table --table-name AgentConversations --query 'Table.TableStatus'
Expected output: "ACTIVE"
Step 3: Build the Research Agent
The Research Agent searches for information and returns structured findings. Create research_agent.py:
import json
import boto3
import time
from datetime import datetime

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('AgentConversations')


def lambda_handler(event, context):
    """
    Research Agent: Searches and compiles information on a given topic.
    """
    conversation_id = event['conversationId']
    query = event['query']

    # Store incoming request
    store_message(conversation_id, 'user', query, 'research')

    # Build research prompt
    system_prompt = """You are a research specialist. Your job is to:
1. Analyze the research query
2. Identify 3-5 key aspects to investigate
3. Provide structured findings in JSON format
Return your response as valid JSON with this structure:
{
  "topic": "main topic",
  "key_findings": ["finding 1", "finding 2", ...],
  "sources_needed": ["type of source 1", "type of source 2"],
  "confidence": "high|medium|low"
}"""
    prompt = f"Research the following topic thoroughly:\n\n{query}"

    # Invoke Bedrock
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-5-sonnet-20241022-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 2000,
            "system": system_prompt,
            "messages": [
                {
                    "role": "user",
                    "content": prompt
                }
            ]
        })
    )
    response_body = json.loads(response['body'].read())
    research_findings = response_body['content'][0]['text']

    # Store research results
    store_message(conversation_id, 'assistant', research_findings, 'research')

    return {
        'statusCode': 200,
        'conversationId': conversation_id,
        'agent': 'research',
        'findings': research_findings
    }


def store_message(conversation_id, role, content, agent_type):
    """Store message in DynamoDB for inter-agent communication."""
    table.put_item(
        Item={
            'conversationId': conversation_id,
            'timestamp': int(time.time() * 1000),
            'role': role,
            'content': content,
            'agent': agent_type,
            'created_at': datetime.utcnow().isoformat()
        }
    )
Key points:
- The system prompt constrains the agent to structured JSON output, making it easier for other agents to parse
- We use invoke_model directly for fine-grained control (vs. the Agents API for simpler cases)
- The store_message function creates a conversation log that the supervisor can query
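A system prompt asks for JSON but cannot guarantee it: the model may wrap the object in a sentence or two. A downstream agent that wants structured data could parse defensively, roughly as sketched here. The helper name parse_findings is hypothetical and is not used elsewhere in this tutorial, which passes the findings along as plain text.

```python
import json


def parse_findings(text):
    """Return the first JSON object found in model output, or None if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever text follows it.
            obj, _ = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        start = text.find("{", start + 1)
    return None


sample = (
    "Here are the findings:\n"
    '{"topic": "RAG", "key_findings": ["a", "b"], "confidence": "high"}\n'
    "Let me know if you need more."
)
findings = parse_findings(sample)
print(findings["confidence"])
print(parse_findings("no structured data here"))
```

Returning None rather than raising lets the caller decide whether to retry the model call or fall back to the raw text.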
Package this as a Lambda function:
zip research_agent.zip research_agent.py
aws lambda create-function \
--function-name ResearchAgent \
--runtime python3.11 \
--role arn:aws:iam::YOUR_ACCOUNT_ID:role/BedrockAgentRole \
--handler research_agent.lambda_handler \
--zip-file fileb://research_agent.zip \
--timeout 60 \
--memory-size 512 \
--region us-east-1
Replace YOUR_ACCOUNT_ID with your actual AWS account ID from Step 1.
Step 4: Build the Writer Agent
The Writer Agent takes research findings and generates polished content. Create writer_agent.py:
import json
import boto3
import time
from datetime import datetime

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('AgentConversations')


def lambda_handler(event, context):
    """
    Writer Agent: Transforms research findings into polished content.
    """
    conversation_id = event['conversationId']
    research_data = event['researchData']
    content_type = event.get('contentType', 'blog_post')

    # Store incoming request
    store_message(conversation_id, 'system',
                  f"Writing {content_type} from research", 'writer')

    # Build writing prompt
    system_prompt = f"""You are an expert content writer. Your job is to:
1. Review the research findings provided
2. Create engaging, well-structured {content_type}
3. Include relevant details from the research
4. Write in a clear, professional tone
Format your output with:
- A compelling title
- An introduction
- 3-4 main sections
- A conclusion"""
    prompt = f"""Using the following research findings, write a comprehensive {content_type}:
{research_data}
Make it informative and engaging for a technical audience."""

    # Invoke Bedrock with longer output
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-5-sonnet-20241022-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 4000,
            "system": system_prompt,
            "messages": [
                {
                    "role": "user",
                    "content": prompt
                }
            ]
        })
    )
    response_body = json.loads(response['body'].read())
    written_content = response_body['content'][0]['text']

    # Store final content
    store_message(conversation_id, 'assistant', written_content, 'writer')

    return {
        'statusCode': 200,
        'conversationId': conversation_id,
        'agent': 'writer',
        'content': written_content,
        'word_count': len(written_content.split())
    }


def store_message(conversation_id, role, content, agent_type):
    """Store message in DynamoDB for inter-agent communication."""
    table.put_item(
        Item={
            'conversationId': conversation_id,
            'timestamp': int(time.time() * 1000),
            'role': role,
            'content': content,
            'agent': agent_type,
            'created_at': datetime.utcnow().isoformat()
        }
    )
What’s different:
- Higher max_tokens (4000) for longer-form content
- The system prompt emphasizes structure and tone
- We pass contentType to allow different output formats (blog, report, summary)
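Because contentType is interpolated straight into the writer's prompt, an unexpected value flows into the model unchanged. One possible guard is to normalize it against an allowlist first. The sketch below assumes the four types that this tutorial's supervisor prompt names (blog_post, technical_report, summary, article); the helper normalize_content_type is hypothetical.

```python
ALLOWED_CONTENT_TYPES = {"blog_post", "technical_report", "summary", "article"}


def normalize_content_type(raw, default="blog_post"):
    """Map free-form input onto a known content type, falling back to a default."""
    cleaned = (raw or "").strip().lower().replace(" ", "_")
    return cleaned if cleaned in ALLOWED_CONTENT_TYPES else default


print(normalize_content_type(" Technical Report "))
print(normalize_content_type("poem"))
print(normalize_content_type(None))
```

The same function could be applied to the classifier's answer before it is passed on, since a model asked for one word may still return something else.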
Deploy the Writer Agent:
zip writer_agent.zip writer_agent.py
aws lambda create-function \
--function-name WriterAgent \
--runtime python3.11 \
--role arn:aws:iam::YOUR_ACCOUNT_ID:role/BedrockAgentRole \
--handler writer_agent.lambda_handler \
--zip-file fileb://writer_agent.zip \
--timeout 90 \
--memory-size 512 \
--region us-east-1
Step 5: Build the Supervisor Agent
The Supervisor orchestrates the workflow, deciding which agent to call and when. Create supervisor_agent.py:
import json
import boto3
import time
import uuid
from datetime import datetime

lambda_client = boto3.client('lambda', region_name='us-east-1')
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('AgentConversations')


def lambda_handler(event, context):
    """
    Supervisor Agent: Orchestrates multi-agent workflow.
    """
    user_query = event['query']
    conversation_id = event.get('conversationId', str(uuid.uuid4()))
    print(f"Starting conversation: {conversation_id}")

    # Step 1: Analyze the request
    task_type = analyze_request(user_query, conversation_id)

    # Step 2: Execute research phase
    print("Invoking Research Agent...")
    research_response = invoke_agent(
        'ResearchAgent',
        {
            'conversationId': conversation_id,
            'query': user_query
        }
    )
    research_findings = json.loads(research_response['Payload'].read())
    print(f"Research complete: {research_findings['agent']}")

    # Step 3: Execute writing phase
    print("Invoking Writer Agent...")
    writer_response = invoke_agent(
        'WriterAgent',
        {
            'conversationId': conversation_id,
            'researchData': research_findings['findings'],
            'contentType': task_type
        }
    )
    writer_output = json.loads(writer_response['Payload'].read())
    print(f"Writing complete: {writer_output['word_count']} words")

    # Step 4: Store final result
    store_message(
        conversation_id,
        'system',
        f"Workflow complete: {task_type}",
        'supervisor'
    )

    return {
        'statusCode': 200,
        'conversationId': conversation_id,
        'workflow': 'complete',
        'content': writer_output['content'],
        'metadata': {
            'task_type': task_type,
            'word_count': writer_output['word_count'],
            'agents_used': ['research', 'writer']
        }
    }


def analyze_request(query, conversation_id):
    """
    Use Claude to determine the type of content to generate.
    """
    system_prompt = """Analyze the user's request and determine what type of content they need.
Respond with ONLY one word: blog_post, technical_report, summary, or article."""
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-5-sonnet-20241022-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 50,
            "system": system_prompt,
            "messages": [{"role": "user", "content": query}]
        })
    )
    response_body = json.loads(response['body'].read())
    task_type = response_body['content'][0]['text'].strip().lower()
    store_message(conversation_id, 'system',
                  f"Task classified as: {task_type}", 'supervisor')
    return task_type


def invoke_agent(function_name, payload):
    """Invoke another Lambda agent synchronously."""
    return lambda_client.invoke(
        FunctionName=function_name,
        InvocationType='RequestResponse',
        Payload=json.dumps(payload)
    )


def store_message(conversation_id, role, content, agent_type):
    """Store message in DynamoDB."""
    table.put_item(
        Item={
            'conversationId': conversation_id,
            'timestamp': int(time.time() * 1000),
            'role': role,
            'content': content,
            'agent': agent_type,
            'created_at': datetime.utcnow().isoformat()
        }
    )
Key orchestration logic:
- analyze_request uses Claude to classify the task type (adding intelligence to routing)
- Sequential invocation: Research → Writer. The Writer needs the research findings as input, so these two steps must run in order; only tasks that do not depend on each other could be parallelized
- The supervisor maintains the conversation thread by passing the same conversationId to every agent
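One detail of synchronous Lambda invocation affects this orchestration: when the invoked function raises an exception, the invoke call itself still succeeds, and the response carries a FunctionError field with the error details in the payload. A supervisor that reads the payload without checking would fail later with a confusing KeyError. The sketch below shows a possible check, using in-memory byte streams to simulate responses. The names parse_agent_response and AgentInvocationError are hypothetical.

```python
import io
import json


class AgentInvocationError(Exception):
    """Raised when a downstream agent Lambda reports a function error."""


def parse_agent_response(response):
    """Decode a Lambda invoke() response, raising if the function itself failed."""
    payload = json.loads(response["Payload"].read())
    if response.get("FunctionError"):
        if isinstance(payload, dict):
            message = payload.get("errorMessage", "unknown error")
        else:
            message = str(payload)
        raise AgentInvocationError(message)
    return payload


# Simulated responses shaped like the dict that invoke() returns.
ok = {"StatusCode": 200,
      "Payload": io.BytesIO(b'{"agent": "research", "findings": "..."}')}
failed = {"StatusCode": 200, "FunctionError": "Unhandled",
          "Payload": io.BytesIO(b'{"errorMessage": "KeyError: query"}')}

print(parse_agent_response(ok)["agent"])
try:
    parse_agent_response(failed)
except AgentInvocationError as exc:
    print(f"agent failed: {exc}")
```

In the supervisor, this helper would take the place of the bare json.loads(...['Payload'].read()) calls.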
Deploy the Supervisor:
zip supervisor_agent.zip supervisor_agent.py
aws lambda create-function \
--function-name SupervisorAgent \
--runtime python3.11 \
--role arn:aws:iam::YOUR_ACCOUNT_ID:role/BedrockAgentRole \
--handler supervisor_agent.lambda_handler \
--zip-file fileb://supervisor_agent.zip \
--timeout 120 \
--memory-size 512 \
--region us-east-1
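Note the higher timeout (120s), since this agent waits for two other agents to complete. The Research (60s) and Writer (90s) timeouts add up to 150s, so raise the supervisor's timeout further if both agents regularly run close to their limits.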
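The relationship between the three timeouts is simple arithmetic: the supervisor calls the other agents one after the other, so its budget should cover their combined worst case plus its own work. A small sketch follows, assuming a 30-second allowance for the supervisor's own classification call (an illustrative figure, not a measured one) and Lambda's 900-second maximum timeout. The function name supervisor_timeout is hypothetical.

```python
LAMBDA_MAX_TIMEOUT = 900  # seconds; the upper limit Lambda allows


def supervisor_timeout(downstream_timeouts, overhead_seconds=30):
    """Sum the sequential downstream timeouts, add overhead, cap at the Lambda limit."""
    total = sum(downstream_timeouts) + overhead_seconds
    return min(total, LAMBDA_MAX_TIMEOUT)


# Research (60s) and Writer (90s) as configured in this tutorial.
print(supervisor_timeout([60, 90]))
```

With the timeouts used in this tutorial, the result is 180 seconds.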
Step 6: Create the Client Interface
Create a simple Python client to invoke the multi-agent system. Save as client.py:
import boto3
import json
import sys

lambda_client = boto3.client('lambda', region_name='us-east-1')


def run_multi_agent_workflow(query):
    """
    Invoke the supervisor agent with a user query.
    """
    print(f"\n🚀 Starting multi-agent workflow...")
    print(f"Query: {query}\n")

    response = lambda_client.invoke(
        FunctionName='SupervisorAgent',
        InvocationType='RequestResponse',
        Payload=json.dumps({
            'query': query
        })
    )
    result = json.loads(response['Payload'].read())

    if result['statusCode'] == 200:
        print(f"✅ Workflow Complete!")
        print(f"Conversation ID: {result['conversationId']}")
        print(f"Task Type: {result['metadata']['task_type']}")
        print(f"Word Count: {result['metadata']['word_count']}")
        print(f"\n{'='*60}")
        print(f"GENERATED CONTENT:")
        print(f"{'='*60}\n")
        print(result['content'])
        return result
    else:
        print(f"❌ Error: {result}")
        return None


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python client.py 'Your research query here'")
        sys.exit(1)
    query = ' '.join(sys.argv[1:])
    run_multi_agent_workflow(query)
Testing Your Implementation
Now we test the complete multi-agent system:
python client.py "Explain how vector databases improve RAG systems for large language models"
Expected output:
🚀 Starting multi-agent workflow...
Query: Explain how vector databases improve RAG systems for large language models
✅ Workflow Complete!
Conversation ID: 8f3e9a2c-1b4d-4f8e-9c3a-7d6e5f4a3b2c
Task Type: technical_report
Word Count: 847
============================================================
GENERATED CONTENT:
============================================================
# Vector Databases: The Backbone of Effective RAG Systems
## Introduction
Retrieval-Augmented Generation (RAG) has emerged as a crucial pattern...
[full generated content follows]
Test with different query types:
# Should trigger "blog_post" classification
python client.py "Write about the latest trends in AI agent development"
# Should trigger "summary" classification
python client.py "Summarize the key benefits of Amazon Bedrock for enterprises"
Verify the conversation history in DynamoDB:
aws dynamodb query \
--table-name AgentConversations \
--key-condition-expression "conversationId = :cid" \
--expression-attribute-values '{":cid":{"S":"YOUR_CONVERSATION_ID"}}' \
--region us-east-1
Replace YOUR_CONVERSATION_ID with the ID from the output. You should see entries from all three agents in chronological order.
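The CLI prints items in DynamoDB's typed JSON, where every value is wrapped in a type tag such as {"S": "..."} or {"N": "..."}. The following sketch converts that shape into plain Python values for a handful of common tags. It is a simplified illustration with a hypothetical function name; boto3 ships its own TypeDeserializer in boto3.dynamodb.types for complete coverage.

```python
from decimal import Decimal


def from_dynamodb_json(value):
    """Convert one typed DynamoDB value such as {"S": "abc"} to a plain Python value."""
    tag, inner = next(iter(value.items()))
    if tag == "S":
        return inner
    if tag == "N":
        return Decimal(inner)  # numbers arrive as strings
    if tag == "BOOL":
        return inner
    if tag == "NULL":
        return None
    if tag == "L":
        return [from_dynamodb_json(v) for v in inner]
    if tag == "M":
        return {k: from_dynamodb_json(v) for k, v in inner.items()}
    raise ValueError(f"unsupported DynamoDB type tag: {tag}")


# One item as it would appear in the CLI output (values are illustrative).
raw_item = {
    "conversationId": {"S": "abc"},
    "timestamp": {"N": "1700000000000"},
    "agent": {"S": "research"},
}
plain = {key: from_dynamodb_json(val) for key, val in raw_item.items()}
print(plain["agent"], plain["timestamp"])
```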
Step 7: Add Conversation Retrieval
To query conversation history programmatically, create get_conversation.py:
import boto3
import json
import sys
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('AgentConversations')


def get_conversation(conversation_id):
    """
    Retrieve full conversation history for analysis.
    """
    response = table.query(
        KeyConditionExpression=Key('conversationId').eq(conversation_id),
        ScanIndexForward=True  # Sort by timestamp ascending
    )

    print(f"\n📜 Conversation History: {conversation_id}\n")
    print(f"{'='*80}\n")

    for item in response['Items']:
        agent = item['agent']
        role = item['role']
        content = item['content'][:100] + "..." if len(item['content']) > 100 else item['content']
        print(f"[{agent.upper()}] {role}:")
        print(f"  {content}\n")

    return response['Items']


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python get_conversation.py CONVERSATION_ID")
        sys.exit(1)
    conversation_id = sys.argv[1]
    get_conversation(conversation_id)
Usage:
python get_conversation.py 8f3e9a2c-1b4d-4f8e-9c3a-7d6e5f4a3b2c
This shows the inter-agent communication flow, useful for debugging and understanding agent behavior.
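A single Query call returns at most 1 MB of data. When more remains, the response includes a LastEvaluatedKey that must be passed back as ExclusiveStartKey to fetch the next page. For long conversations, a loop like the one sketched here would collect every page. The helper query_all_items is hypothetical, and the demo uses a fake query function so it runs without AWS access.

```python
def query_all_items(query_fn, **query_kwargs):
    """Call query_fn repeatedly, following LastEvaluatedKey until no pages remain."""
    items = []
    while True:
        page = query_fn(**query_kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        query_kwargs["ExclusiveStartKey"] = last_key


def fake_query(**kwargs):
    # Stand-in for table.query that serves two pages of results.
    start = kwargs.get("ExclusiveStartKey")
    if start is None:
        return {"Items": [{"timestamp": 1}], "LastEvaluatedKey": {"timestamp": 1}}
    return {"Items": [{"timestamp": 2}]}


print(query_all_items(fake_query))
```

Against the real table, the call would look like query_all_items(table.query, KeyConditionExpression=Key('conversationId').eq(conversation_id), ScanIndexForward=True).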
Common Issues & Fixes
Problem: ModelNotFoundException when invoking Bedrock
Cause: Claude 3.5 Sonnet access not enabled in your AWS region
Fix: Navigate to Bedrock console → Model access → Request access for anthropic.claude-3-5-sonnet-20241022-v2:0. Wait 5-10 minutes for activation.
Problem: Lambda times out with “Task timed out after 60.00 seconds”
Cause: Bedrock responses can be slow, especially for long content
Fix: Increase timeout for all functions:
aws lambda update-function-configuration \
--function-name ResearchAgent \
--timeout 90
aws lambda update-function-configuration \
--function-name WriterAgent \
--timeout 120
aws lambda update-function-configuration \
--function-name SupervisorAgent \
--timeout 180
Problem: AccessDeniedException when Lambda invokes another Lambda
Cause: IAM role missing lambda:InvokeFunction permission
Fix: Save the following policy as lambda-invoke-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "lambda:InvokeFunction",
      "Resource": "arn:aws:lambda:*:*:function:*Agent"
    }
  ]
}
Then update the role:
aws iam put-role-policy \
--role-name BedrockAgentRole \
--policy-name LambdaInvokePolicy \
--policy-document file://lambda-invoke-policy.json
Problem: DynamoDB returns empty results when querying conversations
Cause: The query uses the wrong key attribute, or the conversation ID doesn’t exist
Fix: List all conversations to verify:
aws dynamodb scan \
--table-name AgentConversations \
--projection-expression "conversationId" \
--region us-east-1 | grep conversationId
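Problem: High latency (>30s) for simple queries
Cause: Cold start penalty for Lambda functions
Fix: Enable provisioned concurrency for frequently used agents. Provisioned concurrency cannot be configured on $LATEST, so publish a version first and use its number as the qualifier (1 below assumes this is the first published version):
aws lambda publish-version \
--function-name SupervisorAgent
aws lambda put-provisioned-concurrency-config \
--function-name SupervisorAgent \
--provisioned-concurrent-executions 1 \
--qualifier 1
Callers must then invoke that version (or an alias pointing to it) to benefit from the warm instances.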
Warning: Provisioned concurrency adds cost (~$13/month per function). Use only for production workloads.
Next Steps
Add a Code Execution Agent: Create a fourth agent that can run Python code and return results. Useful for data analysis tasks:
- Install a sandboxed Python environment in Lambda
- Parse code from LLM output
- Execute and capture output
- Return results to the supervisor for further processing
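The "parse code from LLM output" step can be sketched with a regular expression that pulls fenced Python blocks out of a response. This is a minimal illustration with a hypothetical helper name, assuming the model wraps code in triple-backtick fences. It only extracts text; running that text safely still requires the sandbox mentioned in the list.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep this example readable
CODE_BLOCK = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(text):
    """Return the contents of every fenced code block found in the text."""
    return [match.strip() for match in CODE_BLOCK.findall(text)]


llm_output = (
    "Here is the code:\n"
    + FENCE + "python\nprint(1 + 1)\n" + FENCE
    + "\nThis prints the sum."
)
print(extract_code_blocks(llm_output))
print(extract_code_blocks("No code in this answer."))
```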
Implement Parallel Agent Execution: Modify the supervisor to invoke Research and Writer agents simultaneously when tasks are independent:
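import asyncio
import json
import boto3

lambda_client = boto3.client('lambda', region_name='us-east-1')


async def invoke_agent_async(function_name, payload):
    # boto3 calls block, so run each one in a worker thread
    return await asyncio.to_thread(
        lambda_client.invoke, FunctionName=function_name,
        InvocationType='RequestResponse', Payload=json.dumps(payload))


async def invoke_agents_parallel(payload1, payload2):
    tasks = [
        invoke_agent_async('ResearchAgent', payload1),
        invoke_agent_async('WriterAgent', payload2)
    ]
    return await asyncio.gather(*tasks)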
Add Human-in-the-Loop Approval: Before the Writer agent runs, send research findings to an SNS topic for human review:
- Create an SNS topic
- Publish research results
- Wait for approval via API Gateway endpoint
- Only proceed to writing after approval
Monitor with CloudWatch: Set up dashboards tracking:
- Agent invocation counts
- Average latency per agent
- Token usage and costs
- Error rates
Use this AWS CLI command to create a custom metric:
aws cloudwatch put-metric-data \
--namespace MultiAgentSystem \
--metric-name AgentInvocations \
--value 1 \
--dimensions Agent=Research
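The same metric can be published from inside an agent with boto3's put_metric_data. The sketch below only builds the MetricData payload, so it runs without AWS access. The helper name build_metric_data and the AgentLatency metric are hypothetical additions; the namespace, the AgentInvocations metric name, and the Agent dimension follow the CLI command above.

```python
def build_metric_data(agent, latency_ms):
    """Build a MetricData list recording one invocation and its latency."""
    dimensions = [{"Name": "Agent", "Value": agent}]
    return [
        {
            "MetricName": "AgentInvocations",
            "Dimensions": dimensions,
            "Value": 1,
            "Unit": "Count",
        },
        {
            "MetricName": "AgentLatency",  # hypothetical metric name
            "Dimensions": dimensions,
            "Value": latency_ms,
            "Unit": "Milliseconds",
        },
    ]


metric_data = build_metric_data("Research", 1234.5)
for metric in metric_data:
    print(metric["MetricName"], metric["Value"], metric["Unit"])
```

Publishing would then be a single call: boto3.client('cloudwatch').put_metric_data(Namespace='MultiAgentSystem', MetricData=metric_data).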
Challenge: Extend this system to handle image generation by adding a DALL-E or Stable Diffusion agent that creates visuals based on the written content. The supervisor should coordinate: Research → Writer → Image Generator → Final assembly.
You now have a working multi-agent system that demonstrates agent specialization, orchestration, and state management: three building blocks of scalable AI agent architectures.