The Weapon You Only Give to the Good Guys

Commentary · 5 min read · Published 2026-02-21 · AI Primer

Source: Anthropic

AI Security · AI in Practice · Critical Thinking

Anthropic announced Claude Code Security yesterday — a tool built into Claude Code that scans codebases for vulnerabilities and suggests patches. Not pattern-matching against known bug catalogues, which is what most static analysis tools do. Actually reading the code, reasoning about how components interact, and finding the kind of logic-level flaws that get exploited in real breaches. Their team used it to find over 500 vulnerabilities in production open-source projects — bugs that survived decades of expert review.

That number, if it holds up through responsible disclosure, is not marketing. It's a genuine shift in what automated security tooling can do.

The architecture is right: multi-stage self-verification, confidence ratings, human approval before anything gets patched. This is how you build a security tool that security teams will actually trust rather than mute after week two.
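That workflow — model proposes, a verification pass scores, a human approves before anything lands — can be sketched as a simple gate. Everything below is hypothetical (the `Finding` shape, thresholds, and severity labels are illustrative assumptions, not Anthropic's actual design):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    file: str
    description: str
    severity: str      # e.g. "high", "medium", "low"
    confidence: float  # 0.0-1.0, assigned by a second verification pass

def triage(findings: list[Finding], min_confidence: float = 0.7) -> list[Finding]:
    """Drop low-confidence findings before they reach a human reviewer."""
    return [f for f in findings if f.confidence >= min_confidence]

def review_queue(findings: list[Finding]) -> list[Finding]:
    """Order surviving findings for human review: highest severity first,
    then highest confidence. No patch is applied without approval."""
    rank = {"high": 0, "medium": 1, "low": 2}
    return sorted(triage(findings), key=lambda f: (rank[f.severity], -f.confidence))
```

The point of the gate is the trust dynamic the paragraph describes: a tool that floods reviewers with unranked findings gets muted; one that filters and orders them gets used.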

Here's where it gets interesting. Anthropic frames this as putting frontier defensive capability "squarely in the hands of defenders." The unstated problem: attackers don't need this tool. They need the underlying reasoning capability, which ships in every frontier model from every major lab. You can gate access to a product. You cannot gate access to a capability class once it exists in the wild.

The announcement also stays quiet on the number that matters most to any security team evaluating this: false positive rates. If you're finding a broader class of vulnerability — which they are, by design — you're also generating more findings that need triage. Severity ratings and confidence scores are helpful. But "we surface more stuff" is only an improvement if the signal-to-noise ratio actually gets better, not just different.
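The arithmetic behind that point is worth making explicit. Using entirely made-up numbers (no false-positive figures have been published), a tool can surface more real bugs while making each individual finding less trustworthy:

```python
def precision(true_positives: int, total_findings: int) -> float:
    """Fraction of surfaced findings that are real vulnerabilities."""
    return true_positives / total_findings if total_findings else 0.0

# Hypothetical comparison: a narrow scanner vs. a broader AI-assisted one.
old_tool = precision(true_positives=8, total_findings=40)    # 0.2
new_tool = precision(true_positives=20, total_findings=200)  # 0.1

# The broader tool finds 2.5x more real bugs, but the triage team now
# wades through 10x the findings at half the per-finding precision.
```

That's why "we surface more stuff" is not automatically an improvement: the metric a security team lives with is the cost of triaging each finding, not the raw count of bugs found.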

And then there's the access question. Limited research preview, Enterprise and Team customers only. Open-source maintainers get expedited free access, which is good. But the overall dynamic is familiar: the organisations with the biggest security budgets get the new defensive tools first, while the organisations most vulnerable to AI-enabled attacks — small companies, startups, under-resourced teams — wait. The defenders who need the biggest head start get it last.

None of this makes the product unimpressive. It's the opposite — it's impressive enough that the distribution question actually matters.

The real takeaway isn't about Anthropic. It's about the clock. AI-assisted vulnerability scanning is going to become the baseline expectation for competent security practice within two years. Your code is about to be measured against what AI can find, not what your current tools catch. The gap between those two things is the 500 bugs that just got surfaced in projects everyone assumed were clean.

Anthropic built a good tool and wrapped it in the language of public good. Both things are true. The language of public good is also, not coincidentally, the language you use when you're selling a security product to enterprises. That's fine. What matters is whether the tool actually works, and early evidence says it does.

The arms race in AI-assisted security is now officially on. The question was never whether it would start. The question is whether defenders can stay ahead of attackers who don't need an Enterprise licence to use a frontier model. Gating access to one product doesn't answer that. Shipping the capability faster to more defenders might.
