GPT Proto
Schuyler Stacy, 2026-07-01

Claude Sonnet 5: What's New, What It Costs, and How It Compares to Sonnet 4.6 (2026 Guide)

Claude Sonnet 5 guide: what's new vs Sonnet 4.6, benchmarks, $2/$10 pricing, the tokenizer catch that changes your bill, and runnable API code.

Anthropic shipped Claude Sonnet 5 on June 30, 2026, and within an hour my feeds were full of the same two questions: should I move off Sonnet 4.6, and is the "same list price as 4.6" claim actually true once you run real traffic through it? Those are the questions I care about too, so this guide is built around them rather than around the launch-post highlight reel. Everything technical here traces back to Anthropic's own launch post and docs; where a number comes from press reporting rather than a page I could read directly, I say so.

One thing up front, because a lot of early write-ups get it wrong: Sonnet 5's predecessor is Sonnet 4.6 (released February 2026), not Sonnet 4.5. If you're comparing against 4.5 or an even older 3.5, you're comparing across two upgrades, and the numbers below won't line up. I'll come back to that.

What Is Claude Sonnet 5?

Sonnet 5 is Anthropic's newest mid-tier model, pitched as its most agentic Sonnet so far: it plans, drives browsers and terminals, and runs multi-step tasks on its own. The framing Anthropic leads with is a cost-performance one — performance close to Opus 4.8, at Sonnet prices.

Why that framing matters before we get to mechanics: for most of the last year, the biggest agentic gains landed in the Opus tier, not Sonnet. So teams running high-volume agent work were stuck paying Opus rates for capability they needed only some of the time. Sonnet 5 is Anthropic's attempt to pull a chunk of that capability down a price tier. That's the "why," and it explains most of the design choices that follow.

The mechanics, from Anthropic's docs: it's a drop-in replacement for Sonnet 4.6, uses the API model ID claude-sonnet-5, ships with a 1M-token context window as both the default and the maximum, and supports up to 128k output tokens. Text and image in, text out.

| | Claude Sonnet 5 |
| --- | --- |
| Released | June 30, 2026 |
| API model ID | claude-sonnet-5 |
| Context window | 1M tokens (default and max) |
| Max output | 128k tokens |
| Input / output | Text + image in, text out |
| Relationship to 4.6 | Drop-in replacement |

What's New vs Sonnet 4.6

If you already run Sonnet 4.6, the upgrade isn't only a weights bump. Three behavior changes will touch your code, per Anthropic's docs:

  • Adaptive thinking is on by default.
  • Manual extended thinking now returns a 400 error (it was already deprecated on 4.6).
  • Setting sampling parameters — temperature, top_p, top_k — to non-default values returns a 400 error.

Two more changes matter operationally. Sonnet 5 exposes effort levels (low, medium, high, and xhigh), where higher effort spends more tokens on reasoning to buy more quality, and more cost. And it's the first Sonnet-tier model with real-time cyber safeguards enabled by default. Those safeguards have a quirk worth knowing before you wire this into an agent: when the model refuses a flagged request, the refusal comes back as a successful HTTP 200 with stop_reason: "refusal", not as an error. If your error handling assumes a refusal throws, it won't.
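
A minimal sketch of what that means for error handling, assuming the response body follows the Messages shape used elsewhere in this guide (a top-level stop_reason and a content list of typed blocks). The function name and the returned dictionary layout are my own, for illustration, not part of Anthropic's API.

```python
def interpret_response(status_code: int, data: dict) -> dict:
    """Classify a Messages-style response, treating refusals as their own outcome.

    `data` is the parsed JSON body. The returned dict layout is hypothetical.
    """
    if status_code != 200:
        # Genuine errors (for example the 400s described above) land here.
        return {"outcome": "error", "text": ""}

    # Collect only text blocks; other block types may be present in the list.
    text = "".join(
        block.get("text", "")
        for block in data.get("content", [])
        if block.get("type") == "text"
    )

    if data.get("stop_reason") == "refusal":
        # A refusal is a successful HTTP call, so it has to be checked explicitly.
        return {"outcome": "refused", "text": text}

    return {"outcome": "ok", "text": text}


refused = interpret_response(200, {"stop_reason": "refusal", "content": []})
answered = interpret_response(
    200,
    {"stop_reason": "end_turn", "content": [{"type": "text", "text": "Hello"}]},
)
print(refused["outcome"], answered["outcome"])
```

The point of the design is that the refusal check sits on the success path: a try/except around the HTTP call alone would never see it.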

The change with the widest blast radius, though, is one nobody puts in a headline: the tokenizer. I'm giving it its own section because it changes the pricing math.

Key Features and Performance

Anthropic positions Sonnet 5 as a strict improvement over 4.6 across reasoning, tool use, coding, and knowledge work, and as close enough to Opus 4.8 that you can pick your point on the cost-performance curve with the effort dial. The specific benchmark figures below are as reported from Anthropic's launch materials and day-one coverage; treat the exact numbers as Anthropic-reported until you've read them in the System Card yourself.

Benchmark Sonnet 5 Sonnet 4.6 Opus 4.8
SWE-bench Pro (agentic coding) 63.2% 58.1% 69.2%
OSWorld-Verified (computer use) 81.2% 78.5%
Terminal-Bench 2.1 80.4% 67.0%
Humanity's Last Exam (with tools) 57.4% 46.8% 57.9%
GDPval-AA v2 (knowledge work) 1,618 1,615

Figures as reported from Anthropic's June 30, 2026 launch; dashes are cells I couldn't confirm.

Read that table honestly and two things stand out. On coding, Sonnet 5 clears 4.6 by a real margin but still sits below Opus 4.8: it narrows the gap but doesn't close it. On knowledge work, it edges Opus 4.8 by a hair (1,618 vs 1,615), which is the source of the "sometimes beats Opus" line you'll see repeated everywhere; it's true, but it's a hair, on one benchmark.

Here's the cost the highlight reel skips. Anthropic's own charts show Sonnet 5's best value lands at low and medium effort. Push it to xhigh, and reporting from launch day says the cost can climb past Opus 4.8 for similar quality. So "cheaper than Opus" is a claim with an asterisk: cheaper at the effort levels where you'd actually reach for a Sonnet, not necessarily when you crank it to the top. My take: if a task genuinely needs xhigh reasoning, price out Opus 4.8 before defaulting to Sonnet 5 — the intuition that the smaller model is always the cheaper call breaks down at the top of the dial.

On safety, Anthropic reports lower rates of hallucination, sycophancy, and undesirable behavior than 4.6, plus better resistance to malicious requests and prompt-injection hijacks. The caveat they publish themselves: Sonnet 5 still shows a higher rate of misaligned behavior than Opus 4.8 on their internal audit. And on cyber, it never produced a full working exploit in testing — deliberately low capability there, which is a feature if you're shipping a customer-facing agent and a limitation if sanctioned security work is your use case.

Claude Sonnet 5 Pricing: Is It Actually Cheaper?

The sticker prices — Anthropic direct, plus GPT Proto's current rate:

| Model / tier | Input / 1M | Output / 1M |
| --- | --- | --- |
| Sonnet 5, introductory (through Aug 31, 2026) | $2 | $10 |
| Sonnet 5, standard (from Sep 1, 2026) | $3 | $15 |
| Sonnet 5, via GPT Proto (current) | $1.6 | $8 |
| Sonnet 4.6 | $3 | $15 |
| Opus 4.8 | $5 | $25 |

At standard pricing, Sonnet 5 costs the same per token as 4.6 and comes in well under Opus 4.8. Straightforward. Except the per-token price isn't the whole bill.

Sonnet 5 ships with a new tokenizer, the same one Anthropic introduced with Opus 4.7, and the same text now maps to roughly 1.0 to 1.35× as many tokens, depending on content type. Anthropic is upfront about this in a footnote, and about the consequence: the introductory pricing is set so the move from 4.6 is roughly cost-neutral during the intro window. Read that carefully. It means the $2/$10 launch price isn't pure discount; part of it is offsetting the extra tokens. And it means that on September 1, when the price moves to the standard $3/$15 (identical to 4.6 on paper), an equivalent request can cost you more than 4.6 did, because it's billing more tokens for the same text.

So, is Sonnet 5 cheaper than 4.6? The honest answer is: it depends on when and how hard you run it. Through August, effectively yes. After that, for the same workload, plan for a per-request cost that's flat to modestly higher, not lower — and measure your own prompts under the new tokenizer instead of trusting the per-token line.
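
To put rough numbers on that, here is a small sketch of the arithmetic. The prices come from the table above and the 1.0 to 1.35 range from Anthropic's footnote as quoted here; the function names, the sample workload of 10,000 input and 2,000 output tokens, and the choice to apply the same multiplier to output tokens are all my assumptions, so swap in your own measured counts.

```python
# Per-1M-token prices in USD (input, output), taken from the pricing table above.
PRICES = {
    "sonnet-4.6": (3.00, 15.00),
    "sonnet-5-intro": (2.00, 10.00),
    "sonnet-5-standard": (3.00, 15.00),
}


def request_cost(price_key: str, input_tokens: float, output_tokens: float) -> float:
    """Cost in USD of one request at the given tier's per-1M prices."""
    price_in, price_out = PRICES[price_key]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def compare(input_tokens: int, output_tokens: int, inflation: float) -> dict:
    """Compare one workload across tiers.

    Token counts are as measured under the Sonnet 4.6 tokenizer. `inflation`
    is the assumed multiplier for the new tokenizer; applying it to output
    tokens as well as input tokens is an assumption of this sketch.
    """
    new_in = input_tokens * inflation
    new_out = output_tokens * inflation
    return {
        "sonnet-4.6": request_cost("sonnet-4.6", input_tokens, output_tokens),
        "sonnet-5-intro": request_cost("sonnet-5-intro", new_in, new_out),
        "sonnet-5-standard": request_cost("sonnet-5-standard", new_in, new_out),
    }


for factor in (1.0, 1.2, 1.35):
    costs = compare(input_tokens=10_000, output_tokens=2_000, inflation=factor)
    print(factor, {name: round(usd, 4) for name, usd in costs.items()})
```

At the top of the range (1.35), the intro rate still works out to 90% of the 4.6 cost for this workload, while the standard rate works out to 135% of it. Where your traffic lands inside the range is the part only measurement can tell you.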

If you call Claude through GPT Proto rather than direct, Sonnet 5 is live on the platform at $1.6 / 1M input and $8 / 1M output — 20% under Anthropic's $2/$10 introductory rate, which makes it the cheaper path while the launch pricing lasts. For context on the same platform, Sonnet 4.6 runs $2.4/$12 and Opus 4.8 $4/$20. The runnable calls in the next section point straight at Sonnet 5.

How to Use Claude Sonnet 5 via API

Sonnet 5 is now live on GPT Proto, so the runnable examples below call it there directly. If you're migrating from 4.6, this is the drop-in swap Anthropic advertises: change claude-sonnet-4-6 to claude-sonnet-5 and leave the rest of the call untouched.

GPT Proto exposes Claude in Anthropic's native Messages format at https://gptproto.com/v1/messages, with a Bearer token. cURL first:

curl --request POST "https://gptproto.com/v1/messages" \
  --header "Authorization: Bearer $GPTPROTO_API_KEY" \
  --header "Content-Type: application/json" \
  --header "anthropic-version: 2023-06-01" \
  --data '{
    "model": "claude-sonnet-5",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Explain what an agentic coding model does in two sentences." }
    ]
  }'

The same call in Python, using nothing but requests:

import os
import requests
 
resp = requests.post(
    "https://gptproto.com/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ['GPTPROTO_API_KEY']}",
        "Content-Type": "application/json",
        "anthropic-version": "2023-06-01",
    },
    json={
        "model": "claude-sonnet-5",
        "max_tokens": 1024,
        "messages": [
            {"role": "user", "content": "Explain what an agentic coding model does in two sentences."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
data = resp.json()
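# Join the text blocks; the content list may also hold non-text blocks
# (for example thinking output), so don't assume index 0 is text.
print("".join(block["text"] for block in data["content"] if block.get("type") == "text"))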

Two things to carry over when you migrate to Sonnet 5. First, the tokenizer change means your token counts — and therefore your bill and your max_tokens headroom — shift even if the text doesn't; re-measure before you assume your existing limits still fit. Second, if any of your current code sets temperature, top_p, or top_k to custom values, or turns on manual extended thinking, strip those out: on Sonnet 5 they return a 400.
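
The second point can be sketched as a small payload-cleaning helper. This assumes your 4.6 requests are plain dictionaries in the Messages format shown above, and that manual extended thinking was requested with a top-level thinking object of type "enabled"; the helper name is hypothetical.

```python
import copy

# Sampling fields that, per the migration notes above, return a 400 when customized.
SAMPLING_KEYS = ("temperature", "top_p", "top_k")


def migrate_payload(payload: dict) -> dict:
    """Return a copy of a Sonnet 4.6 request body adjusted for Sonnet 5."""
    migrated = copy.deepcopy(payload)  # leave the caller's dict untouched
    migrated["model"] = "claude-sonnet-5"

    # Drop sampling parameters outright rather than guessing what counts as default.
    for key in SAMPLING_KEYS:
        migrated.pop(key, None)

    # Assumption: manual extended thinking looks like {"type": "enabled", ...}.
    # Remove it and rely on adaptive thinking, which is on by default.
    thinking = migrated.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        del migrated["thinking"]

    return migrated


old_request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "temperature": 0.2,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [{"role": "user", "content": "Hello"}],
}
print(migrate_payload(old_request))
```

It deliberately leaves max_tokens alone: whether your existing limit still fits is something to re-measure under the new tokenizer, not something a helper can decide.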

One key on GPT Proto reaches Claude plus 200+ other models on a single balance, so you can benchmark Sonnet 5 against, say, a cheaper model on the same request path without juggling providers. The full list is on the model catalog.

Sonnet 5 vs Sonnet 4.6 (and if you're still on 4.5 or 3.5)

Here's the head-to-head I'd actually use to decide, with Opus 4.8 kept in as the reference ceiling:

| | Sonnet 5 | Sonnet 4.6 | Opus 4.8 |
| --- | --- | --- | --- |
| Positioning | Near-Opus agentic work at Sonnet price | Prior mid-tier standard | Accuracy ceiling |
| Coding (SWE-bench Pro) | 63.2% | 58.1% | 69.2% |
| Best value at | Low / medium effort | | Hardest tasks |
| Standard price (in/out per 1M) | $3 / $15 | $3 / $15 | $5 / $25 |
| Context | 1M | up to 1M | 200k |
| Cost caveat | New tokenizer bills more tokens; xhigh can exceed Opus cost | Familiar tokenizer | Priced for the top end |

My read: if you're on 4.6 and doing sustained agentic or coding work, Sonnet 5 is the upgrade to run — the coding and tool-use gains are real, and the intro pricing makes the trial cheap. If your workload is accuracy-critical at the very top — the kind of task where a wrong answer is expensive — keep Opus 4.8 in the loop and reserve Sonnet 5 for the high-volume middle. And if you find yourself reaching for xhigh effort constantly, that's a signal to price Opus 4.8, not a reason to assume Sonnet is cheaper.
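
One way to make "price Opus 4.8" concrete is a break-even multiple: how many times more tokens Sonnet 5 can spend on a task before it costs as much as Opus 4.8 doing the same task. A rough sketch using the list prices in this guide; the function name is mine, and it assumes both models' token counts are measured on a comparable basis.

```python
# Output prices per 1M tokens in USD, from the pricing tables in this guide.
SONNET_5_STANDARD_OUTPUT = 15.00
SONNET_5_INTRO_OUTPUT = 10.00
OPUS_48_OUTPUT = 25.00


def breakeven_multiple(sonnet_price: float, opus_price: float) -> float:
    """Token multiple at which Sonnet 5 spend equals Opus 4.8 spend.

    If Sonnet 5 needs more than this many times the tokens Opus 4.8 needs
    for the same task, Opus is the cheaper call at these prices.
    """
    return opus_price / sonnet_price


standard = breakeven_multiple(SONNET_5_STANDARD_OUTPUT, OPUS_48_OUTPUT)
intro = breakeven_multiple(SONNET_5_INTRO_OUTPUT, OPUS_48_OUTPUT)
print(round(standard, 2), round(intro, 2))
```

At standard pricing the multiple is about 1.67, and at the intro rate it is 2.5. The input prices ($3 vs $5, and $2 vs $5) give the same two ratios, so the multiple holds for input and output alike. If xhigh runs on your workload routinely burn more than that multiple of what Opus 4.8 uses, the smaller model is no longer the cheaper one.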

Still on Sonnet 4.5 or 3.5? Then the upgrade case is even stronger, but I'd stop short of quoting you a clean "5 vs 3.5" benchmark, because Anthropic didn't publish one — every head-to-head figure above is against 4.6. What I can say is directional: two generations of agentic gains sit between 3.5 and 5, most of them in exactly the tool-use and long-task reliability that older Sonnets stopped short on. The practical move is the same regardless of where you're starting: point your code at Sonnet 5 on GPT Proto and let the effort dial do the rest.

Is It Worth It? An Honest Read

Day-one reception was split, and the split is informative. The praise clustered on price-to-value at the intro rate. The doubt clustered on the same question this guide keeps circling: whether it holds up once standard $3/$15 pricing returns and the tokenizer's token inflation is doing its quiet work.

The most useful signal I saw wasn't a benchmark — it was a hands-on review from a code-review tooling team that ran it on real work. Their finding cuts both ways, which is why I trust it: for writing and building code, Sonnet 5 was the strongest model they'd used at this tier. For reviewing code, it was a trade-off — cleaner, sharper comments, but it caught fewer bugs than the models they already run in production, at a slightly higher cost per review. They also noticed two habits that show up as productivity in agent loops: it tends to write tests before the feature, and it rewrites its own plan partway through a long task instead of marching a stale plan off a cliff. The catch they flagged matches Anthropic's own note — the new cyber safeguards will occasionally trip on legitimate security work.

Worth remembering: the glowing quotes in the launch post come from Anthropic's early-access partners. They're real, but they're curated, and a vendor picking its own testimonials isn't neutral evidence. The independent hands-on take — strong at generation, weaker and pricier at review — is the more honest baseline.

So who should switch? Teams doing high-volume coding, tool use, browser and terminal automation, or knowledge work where you were overpaying for Opus — switch, and do it during the intro window while the trial is cheap. Teams whose core loop is careful review or accuracy-critical decisions — test it, but don't retire your current model on launch day, and re-run your own eval suite before you commit.


Bottom line: Sonnet 5 is a real step up for agentic and coding work, and the intro pricing makes it cheap to try — just don't read "same list price as 4.6" as "same bill," and price Opus 4.8 before you crank the effort dial. You can call Claude Sonnet 5 on GPT Proto today at $1.6 / $8 — 20% under Anthropic's launch rate — on one balance and one API key, across 200+ models if you want to route cost-sensitive calls elsewhere.

FAQ

Is Claude Sonnet 5 available?

Yes. Anthropic released it on June 30, 2026, as the default model for Free and Pro plans, selectable for Max, Team, and Enterprise, and live in Claude Code and via the API. It's also live on GPT Proto, where you can call it from the Claude Sonnet 5 model page.

How much does Claude Sonnet 5 cost?

$2 per 1M input tokens and $10 per 1M output tokens through August 31, 2026, then $3 / $15. One caveat: it uses a new tokenizer that maps the same text to roughly 1.0 to 1.35× as many tokens, so measure your own prompts rather than reading cost straight off the per-token line.

Is Sonnet 5 cheaper than Sonnet 4.6 in practice?

Through August, effectively yes. After the intro window it's the same per-token price as 4.6, but the new tokenizer bills more tokens for the same text, so an equivalent request can cost modestly more, not less. It depends on your volume and the date.

Can I swap my Sonnet 4.6 code straight to Sonnet 5?

Mostly. It's a drop-in replacement, so the model string is the main change. But re-measure token counts and max_tokens under the new tokenizer, and remove any custom sampling parameters (temperature, top_p, top_k) or manual extended thinking calls — those return a 400 on Sonnet 5.