URL: https://platform.claude.com/docs/en/api/service-tiers


Service tiers

Anthropic offers three service tiers:

  • Priority Tier: Best for workflows deployed in production where time, availability, and predictable pricing are important
  • Standard: Default tier for both piloting and scaling everyday use cases
  • Batch: Best for asynchronous workflows that can wait or benefit from being outside your normal capacity

Standard Tier

The standard tier is the default service tier for all API requests. The API prioritizes these requests alongside all other requests with best-effort availability.

Priority Tier

The API prioritizes requests in this tier over all other requests. This prioritization helps minimize "server overloaded" errors, even during peak times.

For more information, see Get started with Priority Tier.

How requests get assigned tiers

When handling a request, Anthropic assigns it to Priority Tier in the following scenarios:

  • Your organization has sufficient Priority Tier capacity in input tokens per minute
  • Your organization has sufficient Priority Tier capacity in output tokens per minute

Anthropic counts usage against Priority Tier capacity as follows:

Input Tokens

Output Tokens

Otherwise, requests proceed at standard tier.

These burndown rates reflect the relative pricing of each token type. For example, US-only inference is priced at 1.1x on Opus 4.6, Sonnet 4.6, and later models, so each token consumed with inference_geo: "us" draws down 1.1 tokens from your Priority Tier capacity.
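As a worked example of the arithmetic above, the following sketch computes how much Priority Tier capacity a request consumes at a given burndown rate. The function name priority_drawdown and the token counts are hypothetical and chosen for illustration; only the 1.1x US-only multiplier comes from the text.

```python
from decimal import Decimal

# 1.1x multiplier for inference_geo: "us", as stated in the text above.
US_ONLY_MULTIPLIER = Decimal("1.1")


def priority_drawdown(tokens: int, multiplier: Decimal = Decimal("1")) -> Decimal:
    """Return the Priority Tier capacity consumed by `tokens` at `multiplier`.

    Hypothetical helper: it only illustrates the multiplication and is not
    part of any SDK.
    """
    return tokens * multiplier


# 1,000 tokens with US-only inference draw down 1,100 tokens of capacity.
print(priority_drawdown(1000, US_ONLY_MULTIPLIER))
```

Decimal is used here instead of float so the 1.1x multiplier does not pick up binary rounding error.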

Requests assigned to Priority Tier draw from both your Priority Tier capacity and your regular rate limits. If servicing the request would exceed the rate limits, the request is declined.
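The assignment rules in this section can be read as a small decision procedure. The sketch below is an illustrative model only, not Anthropic's implementation: the names Capacity and assign_tier are hypothetical, it assumes both the input and output capacity conditions must hold, and it does not model rate limiting for standard-tier requests.

```python
from dataclasses import dataclass


@dataclass
class Capacity:
    """Remaining Priority Tier capacity for the current minute (hypothetical type)."""
    input_remaining: int
    output_remaining: int


def assign_tier(input_tokens: int, output_tokens: int,
                priority: Capacity, within_rate_limits: bool) -> str:
    """Return "priority", "standard", or "declined" for a request.

    Assumption: a request needs sufficient input AND output capacity
    to be assigned to Priority Tier.
    """
    has_capacity = (input_tokens <= priority.input_remaining
                    and output_tokens <= priority.output_remaining)
    if not has_capacity:
        # Without sufficient Priority Tier capacity, the request
        # proceeds at standard tier.
        return "standard"
    if not within_rate_limits:
        # Priority Tier requests also count against regular rate limits.
        return "declined"
    return "priority"


capacity = Capacity(input_remaining=9618, output_remaining=6000)
print(assign_tier(410, 585, capacity, within_rate_limits=True))
```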

Using service tiers

You can control which service tiers can be used for a request by setting the service_tier parameter:

Python
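import anthropic

# The client reads the ANTHROPIC_API_KEY environment variable by default
client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude!"}],
    service_tier="auto",  # Automatically use Priority Tier when available, fall back to standard
)
print(message.usage.service_tier)  # The tier actually assigned to this request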

The service_tier parameter accepts the following values:

The response usage object also includes the service tier assigned to the request:

{
 "usage": {
 "input_tokens": 410,
 "cache_creation_input_tokens": 0,
 "cache_read_input_tokens": 0,
 "output_tokens": 585,
 "service_tier": "priority"
 }
}

This allows you to determine which service tier was assigned to the request.
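For instance, a client that works with the raw JSON response could read the assigned tier as sketched below. The helper name assigned_tier is hypothetical, and the payload is the example shown above.

```python
import json
from typing import Optional

# Example response fragment copied from the text above.
response_body = """
{
  "usage": {
    "input_tokens": 410,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "output_tokens": 585,
    "service_tier": "priority"
  }
}
"""


def assigned_tier(usage: dict) -> Optional[str]:
    """Return the service tier recorded in a usage object, or None if absent."""
    return usage.get("service_tier")


usage = json.loads(response_body)["usage"]
if assigned_tier(usage) == "priority":
    print("Request was served at Priority Tier")
else:
    print("Request was not served at Priority Tier")
```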

When you request service_tier="auto" on a model with a Priority Tier commitment, these response headers report your Priority Tier limits, remaining capacity, and reset times:

anthropic-priority-input-tokens-limit: 10000
anthropic-priority-input-tokens-remaining: 9618
anthropic-priority-input-tokens-reset: 2025-01-12T23:11:59Z
anthropic-priority-output-tokens-limit: 10000
anthropic-priority-output-tokens-remaining: 6000
anthropic-priority-output-tokens-reset: 2025-01-12T23:12:21Z

You can use the presence of these headers to detect if your request was eligible for Priority Tier, even if it was over the limit.
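A minimal sketch of that check follows, assuming the response headers are available as a plain string-to-string mapping. The helper name parse_priority_headers and the shape of the returned dictionary are hypothetical; the header names are the ones listed above.

```python
from datetime import datetime
from typing import Mapping, Optional

PREFIX = "anthropic-priority-"


def _to_int(value: Optional[str]) -> Optional[int]:
    return int(value) if value is not None else None


def _to_time(value: Optional[str]) -> Optional[datetime]:
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None


def parse_priority_headers(headers: Mapping[str, str]) -> Optional[dict]:
    """Return Priority Tier limits per token type, or None if no headers are present.

    A None result suggests the request was not eligible for Priority Tier.
    """
    found = {k.lower(): v for k, v in headers.items() if k.lower().startswith(PREFIX)}
    if not found:
        return None
    summary = {}
    for kind in ("input", "output"):
        base = f"{PREFIX}{kind}-tokens-"
        summary[kind] = {
            "limit": _to_int(found.get(base + "limit")),
            "remaining": _to_int(found.get(base + "remaining")),
            "reset": _to_time(found.get(base + "reset")),
        }
    return summary


example_headers = {
    "anthropic-priority-input-tokens-limit": "10000",
    "anthropic-priority-input-tokens-remaining": "9618",
    "anthropic-priority-input-tokens-reset": "2025-01-12T23:11:59Z",
    "anthropic-priority-output-tokens-limit": "10000",
    "anthropic-priority-output-tokens-remaining": "6000",
    "anthropic-priority-output-tokens-reset": "2025-01-12T23:12:21Z",
}
print(parse_priority_headers(example_headers))
```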

Get started with Priority Tier

You may want to commit to Priority Tier capacity if you are interested in:

Committing to Priority Tier involves deciding:

The ratio of input to output tokens you purchase matters. Sizing your Priority Tier capacity to align with your actual traffic patterns helps you maximize utilization of your purchased tokens.
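To illustrate why the ratio matters, the sketch below estimates what fraction of purchased capacity a traffic pattern actually uses. The function name utilization and all figures are hypothetical. As a simplifying assumption, input and output tokens are weighted equally and burndown multipliers are ignored.

```python
def utilization(purchased_input: int, purchased_output: int,
                used_input: int, used_output: int) -> float:
    """Fraction of purchased per-minute capacity that the traffic consumes.

    Usage beyond the purchased amount is capped, because it cannot be
    served from Priority Tier capacity.
    """
    used = min(used_input, purchased_input) + min(used_output, purchased_output)
    return used / (purchased_input + purchased_output)


# Traffic of 10,000 input and 2,000 output tokens per minute:
print(utilization(10_000, 10_000, 10_000, 2_000))  # 1:1 purchase leaves output capacity idle
print(utilization(10_000, 2_000, 10_000, 2_000))   # purchase matched to the traffic ratio
```

In this example, a purchase matched to the 5:1 traffic ratio is fully used, while the 1:1 purchase leaves most of its output capacity idle.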

Supported models

Priority Tier is supported on all available Claude models (including Claude Fable 5 and Claude Opus 4.8) except Claude Mythos Preview and Claude Mythos 5.

Check the Models overview for more details on available models.

How to access Priority Tier

To begin using Priority Tier:

  1. Contact sales to complete provisioning.
  2. (Optional) Update your API requests to set the service_tier parameter to auto.
  3. Monitor your usage through response headers and the Claude Console.