The Agent Stack #044 — Wednesday Stack


Anthropic just dropped Claude Fable 5, and this isn’t another incremental model update. This is the first public Mythos-class model that actually works for building production agents.

I’ve spent the past 48 hours testing Fable 5 against Claude 3.5 Sonnet across three different agent workflows. The results are striking. Fable 5 consistently handled multi-step reasoning tasks that tripped up previous models, particularly those involving ambiguous instructions or error recovery.

The standout change is its improved “constitutional AI” training. Where previous Claude models would often refuse tasks at the first sign of complexity, Fable 5 works through edge cases intelligently. When I fed it a workflow that required scraping data, making API calls, and generating a report, it didn’t just execute the steps. It also identified potential failure points and suggested fallbacks.
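
For readers who want to picture what "failure points and fallbacks" look like in an agent workflow, here is a minimal TypeScript sketch. It is not Fable 5's output and not any library's API: withFallback, runWorkflow, and the step names are hypothetical, and the scrape and API steps are passed in as plain functions so the sketch assumes nothing about how you fetch data.

```typescript
// One workflow step: resolves with a value or throws on failure.
type Step<T> = () => Promise<T>;

interface StepOutcome<T> {
  value: T;
  warning?: string; // set only when the fallback had to run
}

// Run the primary step; if it throws, record why and run the fallback.
async function withFallback<T>(
  name: string,
  primary: Step<T>,
  fallback: Step<T>,
): Promise<StepOutcome<T>> {
  try {
    return { value: await primary() };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return {
      value: await fallback(),
      warning: `${name} failed (${reason}); used fallback`,
    };
  }
}

// Hypothetical three-stage workflow: scrape, call an API, build a report.
async function runWorkflow(
  scrape: Step<string[]>,
  cachedRows: Step<string[]>,
  enrich: (rows: string[]) => Promise<string[]>,
): Promise<{ report: string; warnings: string[] }> {
  const warnings: string[] = [];

  // Failure point 1: the scrape. Fall back to cached rows.
  const scraped = await withFallback("scrape", scrape, cachedRows);
  if (scraped.warning) warnings.push(scraped.warning);

  // Failure point 2: the API call. Fall back to the unenriched rows.
  const enriched = await withFallback(
    "enrich",
    () => enrich(scraped.value),
    async () => scraped.value,
  );
  if (enriched.warning) warnings.push(enriched.warning);

  return { report: enriched.value.join("\n"), warnings };
}
```

The design choice worth noting is that each fallback leaves a warning behind, so the final report can say which steps degraded instead of failing silently.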

Code generation is where Fable 5 shines. It produced working TypeScript for a RAG pipeline that compiled on first run. No syntax errors, proper error handling, and clean separation of concerns. The model seems to understand software architecture patterns that earlier versions struggled with.

The pricing is aggressive: £0.15 per 1K input tokens vs £0.23 per 1K for GPT-4o. For agent workloads that burn through millions of tokens, this matters. That gap works out to roughly 35% on input tokens ((0.23 − 0.15) / 0.23), which is about what one of my automation clients could save by switching from GPT-4o to Fable 5 without quality loss.
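
The arithmetic behind that comparison is easy to check yourself. The sketch below uses only the two input-token prices quoted above; output-token pricing is not stated in this issue, so it is left out, and the monthly token volume is a made-up figure for illustration. The function names are my own.

```typescript
// Prices per 1K input tokens in GBP, as quoted in this issue.
// Output-token pricing is not stated, so this covers input tokens only.
const PRICE_PER_1K_INPUT_GBP = {
  fable5: 0.15,
  gpt4o: 0.23,
} as const;

type ModelKey = keyof typeof PRICE_PER_1K_INPUT_GBP;

// Cost of a given number of input tokens for one model.
function inputCostGbp(model: ModelKey, inputTokens: number): number {
  return (inputTokens / 1000) * PRICE_PER_1K_INPUT_GBP[model];
}

// Percentage saved on input tokens when moving from one model to another.
function savingsPercent(from: ModelKey, to: ModelKey, inputTokens: number): number {
  const before = inputCostGbp(from, inputTokens);
  if (before === 0) return 0; // nothing spent, nothing to save
  const after = inputCostGbp(to, inputTokens);
  return ((before - after) / before) * 100;
}

// Hypothetical monthly volume, chosen only for illustration.
const monthlyInputTokens = 50_000_000;

console.log(`GPT-4o:  £${inputCostGbp("gpt4o", monthlyInputTokens).toFixed(2)}`);
console.log(`Fable 5: £${inputCostGbp("fable5", monthlyInputTokens).toFixed(2)}`);
console.log(`Saving:  ${savingsPercent("gpt4o", "fable5", monthlyInputTokens).toFixed(1)}%`);
```

Because both prices scale linearly with volume, the percentage saved is the same at any token count. What changes the picture is the split between input and output tokens in your workload, which this sketch does not model.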

Real-world testing reveals some rough edges. The model occasionally gets stuck in verbose explanation loops when simple acknowledgements would suffice. The safety guardrails, while improved, still trigger false positives on legitimate business automation tasks involving financial data processing.

Compared with the competition, Fable 5 sits between GPT-4o and Claude 3.5 Sonnet for pure reasoning but exceeds both for practical agent building. The model’s ability to maintain context across long conversations (200K+ tokens) without degradation is particularly valuable for complex workflows.

The release timing is interesting. With OpenAI filing for an IPO and Google cutting subscription prices, Anthropic is positioning Fable 5 as the practical choice for builders who need reliability over bells and whistles.

Quick Hits

Apple’s Siri AI finally handles calendar parsing from messy emails—parents everywhere rejoice, but the real test is whether it scales beyond basic scheduling tasks

Cohere released North Mini Code, a 1.5B parameter coding model that’s surprisingly competent at generating boilerplate but struggles with complex architectural decisions

Lovable claims £400M ARR from their AI coding platform—impressive if true, though the “1 million projects per week” metric needs scrutiny given typical project completion rates

One Thing to Try

Replace your current model in one non-critical agent workflow with Claude Fable 5. Test it on a task that requires three or more reasoning steps with potential failure points, and compare token usage and output quality against your existing setup. The API endpoint is the same as for other Claude models: just specify claude-3-fable-5 as the model parameter.
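
If you want a repeatable way to run that comparison, a small harness helps. The TypeScript sketch below is one possible shape, not an official client: CallModel, compareModels, and stubCall are hypothetical names, the model call is injected so you can plug in whatever SDK or HTTP client you already use, and the stub returns placeholder values, not measurements. The model ID is the one quoted above; "your-current-model" is a placeholder.

```typescript
// What the harness needs back from any model call.
interface ModelReply {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

// The real call (SDK or HTTP) is injected, so nothing here assumes a client API.
type CallModel = (model: string, prompt: string) => Promise<ModelReply>;

interface ComparisonRow {
  model: string;
  totalTokens: number;
  passed: boolean;
}

// Run the same prompt through each model and record tokens plus a pass/fail check.
async function compareModels(
  callModel: CallModel,
  models: string[],
  prompt: string,
  passes: (text: string) => boolean,
): Promise<ComparisonRow[]> {
  const rows: ComparisonRow[] = [];
  for (const model of models) {
    const reply = await callModel(model, prompt);
    rows.push({
      model,
      totalTokens: reply.inputTokens + reply.outputTokens,
      passed: passes(reply.text),
    });
  }
  return rows;
}

// Stub standing in for a real API call. The numbers are placeholders,
// not results from any model.
const stubCall: CallModel = async (model, prompt) => ({
  text: `[${model}] step 1; step 2; step 3`,
  inputTokens: prompt.length,
  outputTokens: 12,
});

compareModels(
  stubCall,
  ["your-current-model", "claude-3-fable-5"],
  "Plan the steps",
  (text) => text.includes("step 3"),
).then((rows) => console.log(rows));
```

Swap stubCall for a function that calls your provider and maps its response into ModelReply, and replace the pass check with whatever "good output" means for your workflow.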

The agent infrastructure landscape just got more interesting. Choose your models like you choose your tools—for the job at hand, not the marketing hype.