Anthropic Releases Claude Sonnet 4.6: Opus-Level Intelligence at Sonnet Prices

The real story in Anthropic's Sonnet 4.6 launch isn't the model — it's the price tier.

Anthropic's headline is that Sonnet 4.6 approaches Opus-level intelligence at Sonnet pricing ($3/$15 per million tokens). Users in early access preferred it to the flagship Opus 4.5 model 59% of the time. That's a striking claim: the mid-range model beating the premium one on user preference.

What's notable

The computer use trajectory is the thing worth paying attention to. OSWorld scores have climbed steadily across sixteen months of Sonnet releases — from 14.9% with Sonnet 3.5 in October 2024, through 28.0% (Sonnet 3.7), 42.2% (Sonnet 4), and 61.4% (Sonnet 4.5), to 72.5% now. Anthropic describes early users seeing "human-level capability" on tasks like navigating complex spreadsheets and multi-step web forms. That's a meaningful shift — not because the benchmark number is magic, but because it moves computer use from "interesting demo" toward "actually deployable for boring office work."

The vending machine business simulation is a nice touch too: Sonnet 4.6 apparently figured out a spend-then-pivot strategy on Vending-Bench Arena, investing heavily in capacity for the first ten simulated months and then sharply pivoting to profitability in the final stretch, outperforming competitors. It's a controlled environment, but it gestures at the kind of multi-step planning that matters for real agentic work.

On coding, the reported improvements are practical rather than flashy: reading context before modifying code, consolidating shared logic instead of duplicating it, fewer false claims of success, and better follow-through on multi-step tasks. Developers with early Claude Code access said it was less frustrating over long sessions than earlier models. That sort of unglamorous reliability is arguably more valuable than a benchmark jump.

What the announcement leaves out

Worth noticing what's doing heavy lifting and what's buried mid-paragraph.

The "users preferred Sonnet 4.6 to Opus 4.5" framing is a specific context — preference in Claude Code coding sessions — not a general intelligence comparison. Anthropic still positions Opus 4.6 as the stronger choice for "the deepest reasoning," codebase refactoring, and multi-agent coordination. If your workload lives in that territory, Sonnet isn't replacing your Opus spend just yet.

The 1M token context window is "in beta," which in practice means you should test it thoroughly before trusting it with your entire codebase. Paired with context compaction — automatic summarisation of older context as conversations approach limits — it extends effective working memory considerably, but beta is beta.

And the computer use capability, while improving fast, still "lags behind the most skilled humans" — a caveat that deserves more prominence than it gets in the announcement. The trajectory is impressive. The destination isn't here yet.

Worth watching

Frontier capability is migrating down the price curve faster than most enterprise buyers have priced into their budgets. Anthropic shipped Opus 4.6 on 5 February and Sonnet 4.6 on 17 February — two major releases in under a fortnight, on the back of a $30 billion funding round at a $380 billion valuation. The pace is not slowing down.

If your AI strategy assumes you need the most expensive model for serious work, that assumption has a shorter shelf life than you think. The practical move is not to chase every release, but to revisit cost-benefit calculations quarterly. Workflows that were uneconomical six months ago may now pencil out — and the ones that pencil out today will likely get cheaper again by summer.

The model ID for API users is claude-sonnet-4-6. Available now across all Claude plans, the API, and major cloud platforms.

What's notable

What the announcement leaves out

Worth watching

More from the blog

Stay current weekly