Red Hat's Tank OS makes AI agents actually safe

The Agent Stack #026: Wednesday Stack

Tank OS just landed, and it’s the first enterprise-grade solution I’ve seen that actually addresses the “Claude deleted our database” problem. After watching five documented agent failures in 36 days (with zero self-detection), Red Hat’s new containerisation approach for OpenClaw deployments isn’t just timely; it’s essential. I’ve been testing Tank OS for three weeks in our staging environment. Here’s what actually works and what doesn’t. ...

April 29, 2026 · 3 min · Rob Taylor

Ravix agent runs on Claude subscriptions, no API keys

The Agent Stack #023: Wednesday Stack

The agent infrastructure game just shifted. While everyone’s building agents that burn through API credits faster than a Formula 1 car burns fuel, Ravix took a different approach.

Subscription-Based Agent Infrastructure

Ravix runs on your existing Claude subscription instead of requiring separate API keys. Setup takes 60 seconds with a single command. The agent gets its own email address and starts listening for work from your Gmail immediately. ...

April 22, 2026 · 2 min · Rob Taylor

Chrome Skills turn prompts into production workflows

The Agent Stack #020: Wednesday Stack

Google just shipped Chrome Skills, and it’s the first browser-native agent tool that actually works in production. After testing it against 47 different workflows, I can tell you why this matters more than the flashier agent frameworks getting all the attention.

Chrome Skills: The Agent Runtime We’ve Been Waiting For

Chrome Skills lets you save any Gemini prompt as a reusable “Skill” that runs across multiple tabs. Sounds simple. The implementation is brilliant. ...

April 15, 2026 · 3 min · Rob Taylor

Anthropic's Mythos finds bugs everywhere

The Agent Stack #017: Wednesday Stack

Anthropic just dropped their most aggressive AI model yet. Mythos isn’t for chatting about your weekend plans. It’s designed to break things. And it’s terrifyingly good at it.

The Glasswing Project Reality Check

Anthropic partnered with Nvidia, Google, AWS, Apple, and Microsoft for Project Glasswing. The pitch? Use Mythos to find security vulnerabilities before the bad actors do. Early results are sobering. The model found exploitable bugs “in every major operating system and web browser” during initial testing. That’s Windows, macOS, Linux, Chrome, Safari, Firefox: the lot. ...

April 8, 2026 · 3 min · Rob Taylor

LiteLLM's security disaster exposes AI supply chain risks

The Agent Stack #014: Wednesday Stack

The AI gateway everyone trusts just got compromised. LiteLLM, used by thousands of developers to manage model API calls, fell victim to credential-stealing malware via its security compliance partner Delve. This isn’t just another breach story; it’s a wake-up call about the fragile infrastructure we’re building AI agents on.

What Actually Happened

LiteLLM serves as a proxy layer between your applications and model providers like OpenAI, Anthropic, and Cohere. Think of it as the plumbing that routes your API calls, handles rate limiting, and logs usage. The ishaan-jaff/litellm repo has 13.7k stars and gets downloaded millions of times monthly. ...

April 1, 2026 · 3 min · Rob Taylor

ToolGuard: Testing AI Agent Functions Before They Fail

The Agent Stack #009: Wednesday Stack

Building reliable AI agents means your tools can’t crash when the LLM does something unexpected. Which happens constantly. I’ve been testing ToolGuard, the “pytest for AI agent tool calls” that launched this week. The premise is simple: fuzz your Python tool functions with edge cases before your agent calls them in production. Missing JSON keys, type mismatches, 10MB payloads, null values: all the creative ways LLMs break your assumptions. ...

March 18, 2026 · 2 min · Rob Taylor

Anthropic's enterprise agent push hits production reality

The Agent Stack #004: Wednesday Stack

Anthropic just shipped Claude Cowork plugins for finance, engineering, and design work. This isn’t another AI assistant announcement. It’s the first serious attempt to replace actual SaaS workflows with agents. I’ve been testing the Google Workspace integration for three days. The promise is simple: tell Claude to “analyse Q4 expenses and create a budget proposal”, and it connects to Sheets, pulls data, runs calculations, and drafts documents. In practice, it’s more like having a very capable intern who needs constant supervision. ...

February 25, 2026 · 3 min · Rob Taylor