The Agent Stack #019 — Monday Build
The OpenClaw drama last week wasn’t just about Anthropic flexing their pricing muscles. It highlighted the biggest pain point for AI agent builders: your brilliant agent becomes useless when it hits rate limits or gets temporarily banned.
Here’s how to build agents that keep working when APIs fail.
The Problem with Single-Provider Agents
Most developers build agents that depend entirely on one LLM provider. When Claude goes down or your API key gets throttled, everything stops. OpenClaw’s creator learned this the hard way when Anthropic temporarily cut off access.
The solution? Build multi-provider failover into your agent architecture from day one.
Architecture: The Provider Router Pattern
Instead of calling Claude directly, route requests through a provider abstraction layer:
import os
import time
from dataclasses import dataclass

# Exceptions our provider wrappers raise; real SDKs have their own equivalents.
class RateLimitError(Exception): pass
class APIError(Exception): pass

@dataclass
class Provider:
    name: str
    api_key: str
    rate_limit: int        # requests per minute
    cost_per_token: float
    priority: int          # lower = higher priority

class ProviderRouter:
    def __init__(self):
        self.providers = [
            Provider("claude-3-5-sonnet", os.getenv("ANTHROPIC_KEY"), 60, 0.000015, 1),
            Provider("gpt-4", os.getenv("OPENAI_KEY"), 100, 0.00003, 2),
            Provider("gemini-pro", os.getenv("GOOGLE_KEY"), 120, 0.000001, 3),
        ]
        # provider name -> timestamp when it becomes healthy again
        self.circuit_breakers = {}

    async def complete(self, prompt: str) -> str:
        for provider in sorted(self.providers, key=lambda p: p.priority):
            if not self._is_healthy(provider.name):
                continue
            try:
                return await self._call_provider(provider, prompt)
            except RateLimitError:
                self._mark_unhealthy(provider.name, 60)   # 60-second cooldown
            except APIError:
                self._mark_unhealthy(provider.name, 300)  # 5-minute cooldown
        raise RuntimeError("All providers failed")

    def _is_healthy(self, name: str) -> bool:
        return time.time() >= self.circuit_breakers.get(name, 0)

    def _mark_unhealthy(self, name: str, cooldown: int) -> None:
        self.circuit_breakers[name] = time.time() + cooldown

    # _call_provider wraps the actual SDK call for each provider.
Smart Degradation Strategies
When your primary provider fails, don’t just switch to a backup. Implement smart degradation:
- Task Routing: Send complex reasoning to Claude, simple tasks to cheaper models
- Quality Fallbacks: If GPT-4 fails, try GPT-3.5 with modified prompts
- Cost Optimisation: Automatically route to cheapest available provider for non-critical tasks
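The task-routing rule in the first bullet can be sketched as a tiny heuristic. The model names, length threshold, and keyword list below are illustrative assumptions, not fixed values:

```python
# Hypothetical task router: send reasoning-heavy prompts to a stronger model,
# everything else to a cheaper one. Thresholds and names are illustrative.

def pick_model(task: str) -> str:
    reasoning_markers = ("why", "prove", "plan", "analyse", "compare")
    complex_task = len(task) > 500 or any(m in task.lower() for m in reasoning_markers)
    return "claude-3-5-sonnet" if complex_task else "gemini-pro"
```

In practice you would feed this decision into the router's priority ordering rather than hard-coding model names.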
Store provider performance metrics in Redis. Track success rates, average response times, and costs per provider over rolling windows.
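A minimal in-memory sketch of that rolling window (the article suggests Redis; plain Python structures stand in here, and all names are mine):

```python
import time
from collections import defaultdict, deque
from typing import Optional

class ProviderMetrics:
    """In-memory stand-in for a Redis-backed metrics store:
    per-provider success/latency/cost samples over a rolling window."""

    def __init__(self, window_seconds: int = 300):
        self.window = window_seconds
        # provider -> deque of (timestamp, ok, latency_seconds, cost_dollars)
        self.samples = defaultdict(deque)

    def record(self, provider: str, ok: bool, latency: float, cost: float,
               now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        q = self.samples[provider]
        q.append((now, ok, latency, cost))
        # Drop samples that have aged out of the window.
        while q and now - q[0][0] > self.window:
            q.popleft()

    def success_rate(self, provider: str) -> float:
        q = self.samples[provider]
        return sum(ok for _, ok, _, _ in q) / len(q) if q else 0.0
```

A Redis version would use a sorted set keyed by timestamp with `ZREMRANGEBYSCORE` for pruning, so the window survives restarts and is shared across workers.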
Implementation: Rate Limit Respect
Build rate limiting directly into your router:
import asyncio
import time
from collections import defaultdict

class RateLimiter:
    def __init__(self):
        self.calls = defaultdict(list)  # provider name -> recent call timestamps

    async def wait_if_needed(self, provider: str, limit: int):
        now = time.time()
        # Drop calls older than the 60-second window.
        self.calls[provider] = [t for t in self.calls[provider] if now - t < 60]
        if len(self.calls[provider]) >= limit:
            # Sleep until the oldest call falls out of the window.
            sleep_time = 60 - (now - self.calls[provider][0])
            if sleep_time > 0:
                await asyncio.sleep(sleep_time)
        self.calls[provider].append(time.time())
This prevents your agent from hammering APIs and getting banned.
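The sleep calculation is easy to get wrong, so it helps to check it in isolation. Here is a pure-function restatement of the same window logic (the function name and signature are mine):

```python
def time_to_wait(call_times: list, limit: int, now: float, window: float = 60.0) -> float:
    """Seconds to sleep before the next call fits in the rolling window.
    Mirrors the logic inside RateLimiter.wait_if_needed."""
    recent = [t for t in call_times if now - t < window]
    if len(recent) < limit:
        return 0.0
    # Wait until the oldest in-window call ages out.
    return window - (now - recent[0])
```

Because it takes `now` as a parameter, you can unit-test the arithmetic without sleeping or mocking the clock.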
Quick Hits
• Monitor provider health: Set up alerts when any provider’s error rate exceeds 10% over 5 minutes
• Cost tracking: Log every API call with provider, tokens, and cost to optimise routing decisions
• Graceful failures: Always return partial results rather than complete failures when possible
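The cost-tracking bullet can be as simple as one structured log line per call. A hedged sketch, with field names and format that are purely illustrative:

```python
import json
import time

def log_call(provider: str, tokens: int, cost_per_token: float) -> str:
    """Emit one structured JSON log line per API call (illustrative schema)."""
    record = {
        "ts": round(time.time(), 3),
        "provider": provider,
        "tokens": tokens,
        "cost": round(tokens * cost_per_token, 6),
    }
    return json.dumps(record)
```

Pipe these lines into whatever log aggregator you already run; the per-provider cost totals fall out of a simple group-by.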
One Thing to Try
Implement a simple provider router for your current agent this week. Start with just two providers (Claude and GPT-4) and basic failover logic. You’ll be amazed how much more reliable your agent becomes.
The best agents aren’t the smartest ones. They’re the ones that keep working when everything else breaks.