URL: https://thenewstack.io/anthropics-claude-sonnet-4-model-gets-a-1m-token-context-window/

Anthropic's Claude Sonnet 4 Model Gets a 1M Token Context Window - The New Stack


2025-08-12 09:00:40

Anthropic’s Claude Sonnet 4 Model Gets a 1M Token Context Window

Anthropic today updated its Claude Sonnet 4 model to support a context window of up to 1 million tokens.
Aug 12th, 2025 9:00am by Frederic Lardinois

Anthropic today announced that its Claude Sonnet 4 model, the company’s mainstream model that sits below its flagship Claude Opus 4 model, will now support a 1 million token context window. This long context support is now in public beta and available through the Anthropic API and on Amazon Bedrock, with support on Google’s Vertex AI coming soon.

A million tokens is the rough equivalent of 750,000 words, allowing the model to reason over a large amount of data without the developers having to resort to more complex techniques like retrieval-augmented generation (RAG).
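The tokens-to-words ratio above can be turned into a quick back-of-the-envelope check. The following is a minimal sketch, assuming the article's rough ratio of 0.75 words per token; the function names are hypothetical, and a real tokenizer will give different counts depending on the language and content of the text.

```python
import math

# Rough ratio taken from the article: 1,000,000 tokens ~ 750,000 words.
# This is an approximation, not the output of a real tokenizer.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Estimate how many tokens a text of `word_count` words would use."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_in_context(word_count: int, context_tokens: int = 1_000_000) -> bool:
    """Return True if the estimated token count fits in the context window."""
    return estimate_tokens(word_count) <= context_tokens


if __name__ == "__main__":
    for words in (150_000, 750_000, 900_000):
        print(words, estimate_tokens(words), fits_in_context(words))
```

By the same ratio, a 200,000 token window holds roughly 150,000 words.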

When Anthropic launched its latest generation of models in May, both Sonnet 4 and Opus 4 were restricted to a context window of 200,000 tokens. That’s enough context for many use cases, but Google, for example, has offered a 1 million token context window for its Gemini models since early 2024, with the promise to make a 2 million token context window widely available soon. OpenAI followed suit earlier this year with the launch of GPT-4.1, which also supported a 1 million token context window (though GPT-5 then brought that back down to 400,000 tokens).

There’s been no word on when (or if) Opus 4 will get the same upgrade.

As Anthropic notes in today’s announcement, long context will allow the model to evaluate more of a given code base (coding is where Claude has long excelled) and synthesize larger document sets, and it will let developers build AI agents that can maintain context even after hundreds of tool calls.

All of this comes at a price, though, with prompts that exceed the old 200,000 token limit costing twice as much per 1 million input tokens ($6 vs. $3) and 50% more per 1 million output tokens. Anthropic notes that prompt caching can help reduce cost (and latency) and stresses that its batch processing mode can also help bring the cost down by 50%.
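To make the pricing concrete, here is a rough cost estimator. It is a sketch under stated assumptions, not Anthropic's billing logic: the $3 and $6 input rates and the 50% output premium come from the paragraph above, while the $15 base output rate, the rule that the whole request is billed at the higher rate once the prompt exceeds 200,000 tokens, and applying the batch discount to the full total are assumptions. Prompt caching is not modeled, and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

LONG_CONTEXT_THRESHOLD = 200_000          # old context limit, in tokens

STANDARD_INPUT_RATE = Decimal("3")        # $ per 1M input tokens (from the article)
LONG_INPUT_RATE = Decimal("6")            # $ per 1M input tokens (from the article)
STANDARD_OUTPUT_RATE = Decimal("15")      # $ per 1M output tokens (assumption)
LONG_OUTPUT_RATE = STANDARD_OUTPUT_RATE * Decimal("1.5")  # "50% more"
BATCH_DISCOUNT = Decimal("0.5")           # batch mode cuts cost by 50%


def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> Decimal:
    """Estimate the dollar cost of one request.

    Assumes the long-context rates apply to the whole request as soon as
    the prompt exceeds the 200,000 token threshold.
    """
    is_long = input_tokens > LONG_CONTEXT_THRESHOLD
    input_rate = LONG_INPUT_RATE if is_long else STANDARD_INPUT_RATE
    output_rate = LONG_OUTPUT_RATE if is_long else STANDARD_OUTPUT_RATE
    cost = (input_rate * input_tokens + output_rate * output_tokens) / Decimal(1_000_000)
    if batch:
        cost *= BATCH_DISCOUNT
    return cost


if __name__ == "__main__":
    print(estimate_cost(100_000, 10_000))              # standard rates
    print(estimate_cost(500_000, 10_000))              # long-context rates
    print(estimate_cost(500_000, 10_000, batch=True))  # long context, batch mode
```

Under these assumptions, a 500,000 token prompt with 10,000 output tokens comes to about $3.23, or about $1.61 in batch mode.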

It’s worth noting that there has been some discussion around how well large language models work with these extremely large context windows. Often, the benchmark for this is the needle-in-a-haystack test, which asks the model to find a specific piece of data in the context window. There, most models now perform quite well.

As some researchers have pointed out, though, that’s not necessarily how developers use these context windows in practice. Models often struggle to stay coherent as the session length, and with it the context size, grows.

Because of this, context engineering likely won’t be going away anytime soon, even as context windows increase in size.

Before joining The New Stack as its senior editor for AI, Frederic was the enterprise editor at TechCrunch, where he covered everything from the rise of the cloud and the earliest days of Kubernetes to the advent of quantum computing....
AWS is a sponsor of The New Stack.
TNS owner Insight Partners is an investor in: Anthropic, OpenAI.