The Agent Stack #006 — Monday Build
Chrome 146 just shipped WebMCP support. That means any website can now expose tools for your AI agents to use.
The real story isn’t the API—it’s what happens when agents start hitting paid services at scale. Most builders are flying blind on costs until their OpenAI bill arrives.
Build: Cost estimation before deployment
The Agent-Audit repo tackles this head-on. It’s a linter that estimates costs before your agent runs wild.
Here’s how to build your own cost estimator:
import tiktoken
from typing import Dict

USD_PER_GBP = 1.27  # assumed exchange rate; update before trusting the output


class AgentCostEstimator:
    def __init__(self):
        # USD per 1K tokens; check current provider pricing before use
        self.pricing = {
            'gpt-4o': {'input': 0.0025, 'output': 0.010},
            'claude-3.5-sonnet': {'input': 0.003, 'output': 0.015},
        }
        # o200k_base is the GPT-4o tokenizer; counts for Claude models are approximate
        self.enc = tiktoken.get_encoding("o200k_base")

    def estimate_workflow(self, workflow: Dict) -> float:
        """Return the estimated cost of one workflow run in GBP."""
        total_usd = 0.0
        for step in workflow['steps']:
            tokens_in = len(self.enc.encode(step['prompt']))
            expected_out = step.get('max_tokens', 1000)  # worst case: full output budget
            model = step.get('model', 'gpt-4o')
            rates = self.pricing[model]
            step_cost = (tokens_in * rates['input'] +
                         expected_out * rates['output']) / 1000
            # Factor in retry attempts
            retries = step.get('max_retries', 3)
            step_cost *= (1 + retries * 0.3)  # rough assumption: each allowed retry fires 30% of the time
            total_usd += step_cost
        return total_usd / USD_PER_GBP  # prices are in USD, so divide to get GBP
The key insight: model your agent’s decision tree and trace token usage through each branch. Factor in retries, tool calls that fail, and multi-step reasoning loops.
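One way to trace cost through branches is to treat the workflow as a tree: each node carries the cost of its own step, and each edge carries the probability of taking that branch. The sketch below is only an illustration. Node, expected_cost, and worst_case_cost are hypothetical names, and the probabilities and dollar figures are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One agent step. `cost` is the estimated spend for this step alone."""
    name: str
    cost: float
    # Each child is (probability of taking that branch, child node).
    children: List[Tuple[float, "Node"]] = field(default_factory=list)


def expected_cost(node: Node) -> float:
    """Probability-weighted cost of this node plus everything below it."""
    return node.cost + sum(p * expected_cost(child) for p, child in node.children)


def worst_case_cost(node: Node) -> float:
    """Cost of the most expensive root-to-leaf path."""
    if not node.children:
        return node.cost
    return node.cost + max(worst_case_cost(child) for _, child in node.children)


# Illustrative numbers only: a planner that either answers directly
# or falls into a search-then-summarise branch.
answer = Node("answer", 0.02)
summarise = Node("summarise", 0.05)
search = Node("search", 0.10, [(1.0, summarise)])
plan = Node("plan", 0.01, [(0.7, answer), (0.3, search)])

print(f"expected:   ${expected_cost(plan):.4f}")
print(f"worst case: ${worst_case_cost(plan):.4f}")
```

The gap between the two numbers is the useful part: a low expected cost with a high worst case points at a branch worth capping.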
Agent-Audit goes further—it parses your MCP configurations and warns about expensive tool combinations. If your agent can call both web search AND code execution in a loop, that’s a cost explosion waiting to happen.
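To get a feel for that kind of check, here is a minimal sketch of a tool-combination warning. It does not reproduce Agent-Audit's rules or read a real MCP configuration file. The config shape (a "tools" list plus an optional "max_iterations"), the RISKY_PAIRS table, and audit_tools are all assumptions made up for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

# Hypothetical rule table: pairs of tool categories that tend to get costly
# when an agent can chain them in a loop. Not taken from Agent-Audit.
RISKY_PAIRS: Dict[FrozenSet[str], str] = {
    frozenset({"web_search", "code_execution"}): "search results can trigger repeated code runs",
}


def audit_tools(config: Dict) -> List[str]:
    """Return human-readable warnings for a simplified agent config."""
    warnings = []
    tools = sorted(set(config.get("tools", [])))
    max_iter = config.get("max_iterations")  # None means the loop is unbounded
    for a, b in combinations(tools, 2):
        reason = RISKY_PAIRS.get(frozenset({a, b}))
        if reason is None:
            continue
        bound = "no iteration cap" if max_iter is None else f"cap of {max_iter} iterations"
        warnings.append(f"{a} + {b} ({bound}): {reason}")
    return warnings


config = {"tools": ["web_search", "code_execution", "calculator"]}
for line in audit_tools(config):
    print("WARNING:", line)
```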
Add budget limits directly to your agent runtime:
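class BudgetGuard:
    def __init__(self, daily_limit_gbp: float):
        self.limit = daily_limit_gbp
        self.spent = 0.0

    def check_step(self, estimated_cost: float) -> bool:
        """Approve a step and record its cost, or raise if it would exceed the limit."""
        projected = self.spent + estimated_cost
        if projected > self.limit:
            raise RuntimeError(f"Budget exceeded: £{projected:.2f} > £{self.limit:.2f}")
        self.spent = projected  # record the spend so later checks see it
        return True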
A guard like this is meant to stop the £2,800 Polymarket loss scenario, where an agent burns through funds in 30 minutes. It only works if every paid step passes through the check before it runs.
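The guard above takes a daily limit, so it also needs a reset when the day rolls over. Here is one possible sketch of that, wired into a simple step loop. DailyBudget, try_spend, and run_steps are hypothetical names, the injectable today callable exists only to make the reset testable, and the step costs are invented.

```python
from datetime import date
from typing import Callable, List


class DailyBudget:
    """Tracks spend per calendar day and refuses steps that would exceed the limit."""

    def __init__(self, daily_limit_gbp: float, today: Callable[[], date] = date.today):
        self.limit = daily_limit_gbp
        self.today = today
        self.day = today()
        self.spent = 0.0

    def try_spend(self, estimated_cost: float) -> bool:
        # Start a fresh tally when the calendar day changes.
        if self.today() != self.day:
            self.day, self.spent = self.today(), 0.0
        if self.spent + estimated_cost > self.limit:
            return False
        self.spent += estimated_cost
        return True


def run_steps(budget: DailyBudget, step_costs: List[float]) -> int:
    """Run steps in order and stop at the first one the budget rejects."""
    completed = 0
    for cost in step_costs:
        if not budget.try_spend(cost):
            break
        completed += 1  # a real agent would execute the step here
    return completed


budget = DailyBudget(daily_limit_gbp=1.00)
print(run_steps(budget, [0.40, 0.40, 0.40]))  # third step would exceed £1.00
```

Returning False instead of raising lets the loop wind down cleanly. Whether to raise or return is a design choice that depends on how your runtime handles aborted steps.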
Quick Hits
• Shannon Entropy for agent decisions: PicoAgents uses information theory to decide when to act vs ask for guidance. Only 2 dependencies.
• eBPF agent auditing: Logira records what your agent actually did at the OS level, not just what it claims it did. Essential for production.
• Isolated Chrome sessions: Chromectl gives agents their own browser session—no risk of wandering into your banking tab.
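The entropy idea from the first Quick Hit is easy to try on its own. The sketch below is not PicoAgents' implementation. It assumes the agent can produce a probability for each candidate action, and both the 1-bit threshold and the example distributions are arbitrary choices for illustration.

```python
import math
from typing import Dict


def shannon_entropy(probs: Dict[str, float]) -> float:
    """Entropy in bits of a distribution over candidate actions."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def should_ask(probs: Dict[str, float], threshold_bits: float = 1.0) -> bool:
    """Ask for guidance when the agent is too uncertain about its next action."""
    return shannon_entropy(probs) > threshold_bits


confident = {"send_email": 0.9, "search": 0.05, "wait": 0.05}
unsure = {"send_email": 0.4, "search": 0.3, "wait": 0.3}

print(should_ask(confident))  # low entropy: act
print(should_ask(unsure))     # high entropy: ask
```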
One Thing to Try
Run Agent-Audit on your existing workflow. Install with pip install agent-audit, point it at your configuration file, and see the estimated monthly cost. Most builders are shocked—agents that feel “cheap” during testing cost hundreds per month in production.
Next Monday: Building reliable tool calling with the new WebMCP standard.