The Agent Stack #019 — Monday Build
The OpenClaw drama last week wasn’t just about Anthropic flexing their pricing muscles. It highlighted the biggest pain point for AI agent builders: your brilliant agent becomes useless when it hits rate limits or gets temporarily banned.
Here’s how to build agents that keep working when APIs fail.
The Problem with Single-Provider Agents
Most developers build agents that depend entirely on one LLM provider. When Claude goes down or your API key gets throttled, everything stops. OpenClaw’s creator learned this the hard way when Anthropic temporarily cut off access.
The solution? Build multi-provider failover into your agent architecture from day one.
Architecture: The Provider Router Pattern
Instead of calling Claude directly, route requests through a provider abstraction layer:
import os
import time
from dataclasses import dataclass

# Exceptions our provider wrappers raise; real SDKs have their own equivalents.
class RateLimitError(Exception): pass
class APIError(Exception): pass

@dataclass
class Provider:
    name: str
    api_key: str
    rate_limit: int        # requests per minute
    cost_per_token: float
    priority: int          # lower = higher priority

class ProviderRouter:
    def __init__(self):
        self.providers = [
            Provider("claude-3-5-sonnet", os.getenv("ANTHROPIC_KEY"), 60, 0.000015, 1),
            Provider("gpt-4", os.getenv("OPENAI_KEY"), 100, 0.00003, 2),
            Provider("gemini-pro", os.getenv("GOOGLE_KEY"), 120, 0.000001, 3),
        ]
        # provider name -> timestamp when it becomes healthy again
        self.circuit_breakers = {}

    async def complete(self, prompt: str) -> str:
        for provider in sorted(self.providers, key=lambda p: p.priority):
            if not self._is_healthy(provider.name):
                continue
            try:
                return await self._call_provider(provider, prompt)
            except RateLimitError:
                self._mark_unhealthy(provider.name, 60)   # 60-second cooldown
            except APIError:
                self._mark_unhealthy(provider.name, 300)  # 5-minute cooldown
        raise RuntimeError("All providers failed")

    def _is_healthy(self, name: str) -> bool:
        return time.time() >= self.circuit_breakers.get(name, 0)

    def _mark_unhealthy(self, name: str, cooldown: int) -> None:
        self.circuit_breakers[name] = time.time() + cooldown

    # _call_provider wraps the actual SDK call for each provider.
Smart Degradation Strategies
When your primary provider fails, don’t just switch to a backup. Implement smart degradation:
- Task Routing: Send complex reasoning to Claude, simple tasks to cheaper models
- Quality Fallbacks: If GPT-4 fails, try GPT-3.5 with modified prompts
- Cost Optimisation: Automatically route to cheapest available provider for non-critical tasks
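The task-routing rule in the first bullet can be sketched as a tiny heuristic. The model names, length threshold, and keyword list below are illustrative assumptions, not fixed values:

```python
# Hypothetical task router: send reasoning-heavy prompts to a stronger model,
# everything else to a cheaper one. Thresholds and names are illustrative.

def pick_model(task: str) -> str:
    reasoning_markers = ("why", "prove", "plan", "analyse", "compare")
    complex_task = len(task) > 500 or any(m in task.lower() for m in reasoning_markers)
    return "claude-3-5-sonnet" if complex_task else "gemini-pro"
```

In practice you would feed this decision into the router's priority ordering rather than hard-coding model names.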
Store provider performance metrics in Redis. Track success rates, average response times, and costs per provider over rolling windows.
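A minimal in-memory sketch of that rolling window (the article suggests Redis; plain Python structures stand in here, and all names are mine):

```python
import time
from collections import defaultdict, deque
from typing import Optional

class ProviderMetrics:
    """In-memory stand-in for a Redis-backed metrics store:
    per-provider success/latency/cost samples over a rolling window."""

    def __init__(self, window_seconds: int = 300):
        self.window = window_seconds
        # provider -> deque of (timestamp, ok, latency_seconds, cost_dollars)
        self.samples = defaultdict(deque)

    def record(self, provider: str, ok: bool, latency: float, cost: float,
               now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        q = self.samples[provider]
        q.append((now, ok, latency, cost))
        # Drop samples that have aged out of the window.
        while q and now - q[0][0] > self.window:
            q.popleft()

    def success_rate(self, provider: str) -> float:
        q = self.samples[provider]
        return sum(ok for _, ok, _, _ in q) / len(q) if q else 0.0
```

A Redis version would use a sorted set keyed by timestamp with `ZREMRANGEBYSCORE` for pruning, so the window survives restarts and is shared across workers.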
Implementation: Rate Limit Respect
Build rate limiting directly into your router:
import asyncio
import time
from collections import defaultdict

class RateLimiter:
    def __init__(self):
        self.calls = defaultdict(list)  # provider name -> recent call timestamps

    async def wait_if_needed(self, provider: str, limit: int):
        now = time.time()
        # Drop calls older than the 60-second window.
        self.calls[provider] = [t for t in self.calls[provider] if now - t < 60]
        if len(self.calls[provider]) >= limit:
            # Sleep until the oldest call falls out of the window.
            sleep_time = 60 - (now - self.calls[provider][0])
            if sleep_time > 0:
                await asyncio.sleep(sleep_time)
        self.calls[provider].append(time.time())
This prevents your agent from hammering APIs and getting banned.
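The sleep calculation is easy to get wrong, so it helps to check it in isolation. Here is a pure-function restatement of the same window logic (the function name and signature are mine):

```python
def time_to_wait(call_times: list, limit: int, now: float, window: float = 60.0) -> float:
    """Seconds to sleep before the next call fits in the rolling window.
    Mirrors the logic inside RateLimiter.wait_if_needed."""
    recent = [t for t in call_times if now - t < window]
    if len(recent) < limit:
        return 0.0
    # Wait until the oldest in-window call ages out.
    return window - (now - recent[0])
```

Because it takes `now` as a parameter, you can unit-test the arithmetic without sleeping or mocking the clock.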
Quick Hits
• Monitor provider health: Set up alerts when any provider’s error rate exceeds 10% over 5 minutes
• Cost tracking: Log every API call with provider, tokens, and cost to optimise routing decisions
• Graceful failures: Always return partial results rather than complete failures when possible
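The cost-tracking bullet can be as simple as one structured log line per call. A hedged sketch, with field names and format that are purely illustrative:

```python
import json
import time

def log_call(provider: str, tokens: int, cost_per_token: float) -> str:
    """Emit one structured JSON log line per API call (illustrative schema)."""
    record = {
        "ts": round(time.time(), 3),
        "provider": provider,
        "tokens": tokens,
        "cost": round(tokens * cost_per_token, 6),
    }
    return json.dumps(record)
```

Pipe these lines into whatever log aggregator you already run; the per-provider cost totals fall out of a simple group-by.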
One Thing to Try
Implement a simple provider router for your current agent this week. Start with just two providers (Claude and GPT-4) and basic failover logic. You’ll be amazed how much more reliable your agent becomes.
The best agents aren’t the smartest ones. They’re the ones that keep working when everything else breaks.