Optimizing AI Agent Tool Selection with Amazon S3 Vectors
By Braincuber Team
Published on February 12, 2026
The promise of Agentic AI is an assistant that can do anything—from resetting passwords to spinning up Kubernetes clusters. But as you add more capabilities ("tools"), you hit a wall.
If your agent has access to 500 different API functions, you can't shove all 500 definitions into the LLM's context window every time a user says "Hello." It's slow, expensive, and confuses the model. The solution? Semantic Tool Retrieval.
In this tutorial, we'll build a "SaaS Control Plane" agent for a fictional platform, CloudOrbit. We'll use the new Amazon S3 Vectors integration with Bedrock Knowledge Bases to dynamically fetch only the relevant tools for the job.
The Problem: Context Pollution
- Latency: Processing 500+ tool schemas (JSON) takes seconds before the LLM even "thinks."
- Cost: You pay for input tokens. Sending 100KB of unused tool definitions on every turn burns budget.
- Accuracy: "Distractor" tools (e.g., DeleteUser vs. DisableUser) increase hallucination risks.
The Architecture: Retrieval-Augmented Tool Usage
Instead of hardcoding tool definitions, we store them as vectors.
| Step | Action | System |
|---|---|---|
| 1. Ingest | Convert 500+ Tool JSON definitions into embeddings. | Bedrock KB + S3 Vectors |
| 2. Query | User asks: "Why is the database slow?" | Agent (Orchestrator) |
| 3. Retrieve | Search vector DB for tools related to "slow database". | Bedrock Retrieve API |
| 4. Select | Top 5 tools (e.g., check_db_metrics, list_slow_queries) are sent to the LLM. | Claude 3.5 Sonnet |
Step 1: Ingesting Tools into S3 Vectors
First, we format our tools as text documents. S3 Vectors is serverless, so we just drop the files in S3 and point Bedrock Knowledge Base to it.
```json
{
  "tool_name": "analyze_rds_performance_insights",
  "description": "Retrieves performance metrics for an RDS database instance. Use this to diagnose high CPU, slow queries, or locking issues.",
  "parameters": {
    "type": "object",
    "properties": {
      "instance_id": {"type": "string", "description": "The DB identifier"},
      "lookback_minutes": {"type": "integer", "description": "Minutes of history to analyze"}
    },
    "required": ["instance_id"]
  }
}
```
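One straightforward layout (an assumption for this tutorial; the `tools/` prefix and bucket name are illustrative) is one JSON object per tool, so the Knowledge Base embeds each document whole and the agent can `json.loads` it back intact after retrieval:

```python
import json

def tool_object(tool: dict) -> tuple[str, str]:
    """Build an S3 object key and JSON body for one tool definition.

    Each tool becomes its own document so retrieval returns complete,
    parseable schemas rather than fragments of a larger file.
    """
    key = f"tools/{tool['tool_name']}.json"
    return key, json.dumps(tool, indent=2)

key, body = tool_object({
    "tool_name": "analyze_rds_performance_insights",
    "description": "Retrieves performance metrics for an RDS database instance.",
    "parameters": {"type": "object", "properties": {}, "required": []},
})

# The upload itself is a standard S3 call, e.g.:
# boto3.client("s3").put_object(Bucket="cloudorbit-tools", Key=key,
#                               Body=body, ContentType="application/json")
```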
Step 2: Semantic Retrieval Logic
Here is the Python logic that sits inside your agent's "Thought Process". It takes the user's messy request and retrieves the clean internal tools.
```python
import boto3
import json

bedrock_agent = boto3.client('bedrock-agent-runtime')
bedrock = boto3.client('bedrock-runtime')

def get_relevant_tools(user_query, kb_id, top_k=5):
    """
    Asks Bedrock KB to find tools semantically related to the query.
    """
    response = bedrock_agent.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={'text': user_query},
        retrievalConfiguration={
            'vectorSearchConfiguration': {'numberOfResults': top_k}
        }
    )

    # Parse the retrieved tool definitions
    tools = []
    for result in response['retrievalResults']:
        # Assuming the tool JSON is stored in the content
        tool_def = json.loads(result['content']['text'])
        tools.append(tool_def)
    return tools

def agent_execution_step(user_input):
    # 1. Retrieve only relevant tools
    related_tools = get_relevant_tools(user_input, "KB_ID_12345", top_k=5)

    # 2. Construct prompt with ONLY those 5 tools
    system_prompt = f"""
    You are a DevOps assistant. Use the following tools if needed:
    {json.dumps(related_tools, indent=2)}
    """

    # 3. Invoke Claude to select the tool
    # ... (Standard Bedrock Converse API call)
```
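For that final Converse call, Bedrock expects each tool wrapped in a `toolSpec` with its JSON schema under `inputSchema.json`. A minimal adapter from the storage format used in Step 1 to that shape might look like this (the model ID in the comment is illustrative):

```python
def to_converse_tool_config(tools: list[dict]) -> dict:
    """Map our stored tool JSON into the Converse API's toolConfig shape."""
    return {
        "tools": [
            {
                "toolSpec": {
                    "name": t["tool_name"],
                    "description": t["description"],
                    "inputSchema": {"json": t["parameters"]},
                }
            }
            for t in tools
        ]
    }

config = to_converse_tool_config([{
    "tool_name": "check_db_metrics",
    "description": "Fetch CloudWatch metrics for a database.",
    "parameters": {"type": "object", "properties": {}, "required": []},
}])

# response = bedrock.converse(
#     modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
#     messages=[{"role": "user", "content": [{"text": user_input}]}],
#     toolConfig=config,
# )
```

Passing native `toolConfig` (rather than pasting schemas into the system prompt) lets the model return structured `toolUse` blocks you can dispatch on directly.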
Results: Cost vs. Performance
By moving from a "brute force" context (500 tools) to "semantic retrieval" (5 tools), the metrics for CloudOrbit's agent improved dramatically:
- Token Savings: 92% reduction in input tokens per turn (saving ~$0.18 per query on Claude 3.5 Sonnet).
- Latency: 21% faster time-to-first-token because the LLM processes less input.
- Accuracy: Tool selection accuracy increased from 75% to 82% because the "noise" of irrelevant tools was removed.
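The token math behind these numbers is easy to sanity-check. The figures below are illustrative assumptions (roughly 120 tokens per tool schema, a 5,000-token base prompt, and a $3-per-million input-token rate), not measurements from CloudOrbit:

```python
TOKENS_PER_TOOL = 120             # assumed average size of one tool schema
BASE_PROMPT = 5_000               # assumed system prompt + conversation history
PRICE_PER_TOKEN = 3 / 1_000_000   # assumed $3 per million input tokens

before = BASE_PROMPT + 500 * TOKENS_PER_TOOL   # all 500 tools: 65,000 tokens
after = BASE_PROMPT + 5 * TOKENS_PER_TOOL      # top 5 tools: 5,600 tokens

reduction = 1 - after / before                 # ≈ 0.91
saving = (before - after) * PRICE_PER_TOKEN    # ≈ $0.18 per query
```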
Conclusion
S3 Vectors makes vector storage trivial. You don't need to spin up a dedicated vector DB instance; you pay per query. For Agentic AI, this pattern—retrieving tools dynamically—is the key to unlocking "Super Agents" that can wield thousands of tools without breaking the bank (or their context window).
Scaling Your AI Agents?
Don't let context limits hold you back. Let our team architect a scalable Tool Retrieval system for your enterprise agents.
