If a third-party cloud isn’t a requirement, Claude for Enterprise is usually the better fit: richer admin capabilities, such as SCIM provisioning, and nothing to host. Subscriptions are available through AWS Marketplace, where purchases can count toward an AWS spend commitment. The gateway is designed for organizations that must route inference through their own cloud provider, for example to meet data residency requirements.
claude binary, so the same executable that runs Claude Code on a laptop runs the gateway server with claude gateway --config gateway.yaml.
This page covers:
- Why Claude apps gateway, what it adds over running your own, and when something else fits better
- A quickstart with prerequisites that takes a gateway from zero to a signed-in developer
- Connecting developers, including setting the gateway URL through managed settings
- Availability and limitations covering which Claude Code features work through the gateway and what the server supports
Why Claude apps gateway
The gateway overview covers what a gateway does and why you’d run one. Claude apps gateway is Anthropic’s own gateway, built into theclaude binary and tested alongside each Claude Code release, so it forwards the headers and request fields Claude Code sends without operators maintaining a separate allowlist. Once deployed it gives you:
- Credentials: the upstream API key or cloud credential lives only in your infrastructure. Developers authenticate with corporate SSO and receive short-lived bearer tokens, so offboarding happens in your IdP. Deprovision a user and their gateway access expires within the session lifetime, one hour by default.
- Access control: your IdP groups map to model allowlists and managed settings policies. The gateway enforces model access server-side, rejecting requests for non-granted models, and selects each group’s managed settings policy, which the CLI applies at the managed settings tier. Different teams get different models, tools, and permissions, and a developer can’t override what their policy locks.
- Settings delivery: the gateway delivers managed settings to signed-in clients itself, taking the place of server-managed settings from the claude.ai admin console.
- Telemetry: each configured destination, such as Datadog, Splunk, or ClickHouse, receives OpenTelemetry Protocol (OTLP) metrics with token counts, model, user identity, and latency by default, with logs and traces as per-destination opt-ins.
- Upstream routing: clients speak the Anthropic Messages API to the gateway, and the gateway translates for each upstream, whether Bedrock, Google Cloud’s Agent Platform, Foundry, or the Anthropic API, with failover between them. You can change regions, providers, or failover order without developers noticing or reconfiguring.
The gateway’s own data plane sends nothing to Anthropic infrastructure unless the Anthropic API is a configured upstream. You control where telemetry, audit logs, managed settings, and your developers’ IdP identity go, and the gateway sends none of them to Anthropic. For the remaining traffic the CLI process can send and how to close it, see Compliance posture.
Other gateway implementations
If you already run an LLM gateway or API gateway that meets your needs, keep using it; Other LLM gateways covers configuring Claude Code against it. The gateway protocol reference documents the contract Claude Code expects from any gateway: the endpoints it calls, the headers and body fields to forward, and what stops working when they’re stripped. A running Claude apps gateway serves a superset of that contract atGET /protocol, adding the Claude apps gateway-specific endpoints for SSO sign-in, managed settings delivery, and telemetry. Fetch it with curl https://claude-gateway.internal.example.com/protocol from any deployed gateway, such as the one the quickstart below produces. Breaking changes to the protocol are announced in advance, but indefinite backwards compatibility isn’t guaranteed.
Quickstart
This quickstart walks the minimal path: register an OAuth client in your IdP, write agateway.yaml, run the gateway alongside Postgres with Docker Compose, and verify sign-in end to end. It uses an Amazon Bedrock upstream; Google Cloud’s Agent Platform, Foundry, and the Anthropic API are equally supported by swapping the upstreams block as shown in the configuration reference. At the end you have a gateway a developer can /login to.
Deploy on your private network. Claude Code only connects to a gateway whose address is private. This is a security guard, because a trusted gateway can push settings that run commands on developer machines. Put the gateway behind an internal load balancer or VPN and give it a hostname that resolves to private IPs only.
Prerequisites
Have these in place before you start:| You need | Details |
|---|---|
| Claude Code v2.1.195 or later | The claude gateway subcommand and the gateway sign-in flow ship in v2.1.195. Earlier public builds don’t include them. Both the machine running the gateway server and each developer’s machine must be on v2.1.195 or later; run claude update to get the latest release. |
| OpenID Connect (OIDC) identity provider | Okta, Microsoft Entra ID, Google Workspace, Keycloak, or Dex, or any other OIDC-compliant IdP such as PingFederate. The gateway runs standard OIDC discovery and the authorization-code flow against it. SAML and LDAP are not supported. |
| PostgreSQL 14 or later | Backs the device sign-in flow, where the browser callback writes and the polling CLI reads, plus rate-limit counters. Any managed Postgres works, including the smallest tier. Without spend limits configured, the gateway stores a few KB of short-lived auth state; with spend limits, it also holds durable spend, audit, and identity tables that should be backed up. TLS via ?sslmode=require is recommended. |
| Model upstream | Amazon Bedrock credentials, Google Cloud credentials, a Microsoft Foundry resource, or an Anthropic API key. Multiple upstreams are supported with failover. |
| HTTPS | The gateway must be reachable over https:// from developer laptops and from any browser used for sign-in; the gateway serves the device-verification page on the same listener. Either provide a TLS cert via listen.tls, or run behind a TLS-terminating ingress and set listen.public_url. A plain http:// origin is accepted only on loopback, for local development. |
| Private-network address | At /login, Claude Code requires the gateway’s hostname or IP address to resolve only to private addresses: RFC 1918, CGNAT 100.64.0.0/10, IPv6 ULA fc00::/7, or loopback for local development. The check runs on each resolved IP, so if any address the name resolves to is public, /login rejects the URL. If developer machines route HTTPS through a corporate proxy, sign-in also requires the proxy host to resolve to private addresses; if it doesn’t, add the gateway host to NO_PROXY so the CLI connects directly. |
| Linux runtime | The gateway server runs only on the native Linux binary. macOS works for local development. Windows is not supported as a server platform. |
claude binary; download a pinned release as described in Install Claude Code. The server uses runtime features that aren’t available when Claude Code runs under Node. If you see requires the native binary at boot, switch to one of the standalone install methods.
Steps
1
Register an OAuth client in your IdP
Decide the gateway’s hostname first, because the redirect URI must match it. Create a new OIDC web application and set the redirect URI to
https://claude-gateway.<your-domain>/oauth/callback, where the host is the same value you set as listen.public_url in step 3. Note the client_id and client_secret. Per-IdP instructions are in Identity provider setup.2
Provision a PostgreSQL database
Any Postgres 14 or later works, including the smallest managed tier. The gateway runs its own schema migrations at boot, so the database user needs
CREATE TABLE permission. If your security policy prohibits DDL from application roles, pre-create the schema instead; see store.3
Write gateway.yaml
Secrets are read via This config is enough for a working sign-in loop with the default Bedrock model catalog. Once it’s running, add per-group RBAC and managed settings via
${ENV_VAR} expansion so the file itself can live in version control. Use a public_url hostname that resolves to a private IP on your network, because /login rejects public addresses. The minimal config has five sections, and every other field has a default:gateway.yaml
listen:
host: 0.0.0.0
port: 8080
# Required behind any TLS-terminating proxy. Used for the IdP
# redirect_uri and the discovery document.
public_url: https://claude-gateway.internal.example.com
oidc:
issuer: https://login.example.com # must serve /.well-known/openid-configuration
client_id: 0oa1example2
client_secret: ${OIDC_CLIENT_SECRET}
allowed_email_domains: [example.com] # reject id_tokens outside your org
userinfo_fallback: true # for IdPs whose id_token omits email/groups; harmless otherwise
session:
jwt_secret: ${GATEWAY_JWT_SECRET} # openssl rand -base64 32
ttl_hours: 1 # also bounds revocation latency on IdP deprovision
store:
postgres_url: ${GATEWAY_POSTGRES_URL} # add ?sslmode=require for managed Postgres
upstreams:
- provider: bedrock
region: us-east-1
auth: {} # empty: AWS default credential chain
# (IRSA, EC2/ECS task role, env vars, ~/.aws)
# Models are translated per upstream automatically. The built-in catalog
# maps claude-opus-4-8 to us.anthropic.claude-opus-4-8 and so on for every
# Bedrock-supported Claude model. Set false and add a `models:` list to
# expose only specific models.
auto_include_builtin_models: true
managed.policies, telemetry fan-out via telemetry, and multi-upstream failover, provisioned-throughput ARNs, or non-US regions via models.The Bedrock upstream needs an AWS principal with
bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream on both the inference-profile/us.anthropic.* ARNs and the underlying foundation-model/anthropic.* ARNs, and model access enabled in the Bedrock console for the Claude models you want. Supply the credential with IRSA on EKS, an ECS task role, or an EC2 instance profile rather than static keys. The upstreams reference has the full IAM details, the cross-cloud credential matrix, and the auth blocks for the other providers.4
Run it
Build a container image around the The gateway is a single Linux binary that reads the config, runs OIDC discovery against your IdP, applies its Postgres schema migrations, builds upstream clients, and starts listening. Boot is fail-closed for the config, the Postgres connection with a 5-second timeout, OIDC discovery, and upstream client construction. If any of those is unreachable or misconfigured, the gateway exits with an error rather than serving traffic in a degraded state.A successful boot doesn’t validate the inference path, because Bedrock and Agent Platform instance credentials resolve on the first request, not at boot.Watch stderr for the boot sequence. Log lines use the format If boot exits before the
claude binary that meets the image requirements, then run it alongside Postgres:docker-compose.yaml
services:
gateway:
image: <your-registry>/claude-gateway:<version>
ports: ["8080:8080"]
volumes: ["./gateway.yaml:/etc/claude/gateway.yaml:ro"]
environment:
OIDC_CLIENT_SECRET: ${OIDC_CLIENT_SECRET}
GATEWAY_JWT_SECRET: ${GATEWAY_JWT_SECRET}
GATEWAY_POSTGRES_URL: postgres://gw:pw@postgres/gateway
# AWS credentials: in production, omit these and use an instance
# role. For local Compose testing, pass through your own:
AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
AWS_SESSION_TOKEN: ${AWS_SESSION_TOKEN}
depends_on:
postgres:
condition: service_healthy
postgres:
image: postgres:16-alpine
environment: { POSTGRES_USER: gw, POSTGRES_PASSWORD: pw, POSTGRES_DB: gateway }
healthcheck:
test: ["CMD-SHELL", "pg_isready -U gw"]
interval: 5s
volumes: ["pgdata:/var/lib/postgresql/data"]
volumes: { pgdata: }
[gateway] <timestamp> <level> <message>, audit events are single-line JSON with an evt field, and a startup banner, omitted below, prints between the migration and listening lines. You should see, in order:{"ts":"2026-06-10T17:03:21.114Z","evt":"config.load","path":"/etc/claude/gateway.yaml","sha256":"…"}
[gateway] 2026-06-10T17:03:21.408Z info migration 1 applied
[gateway] 2026-06-10T17:03:21.512Z info claude gateway listening on http://0.0.0.0:8080
claude gateway listening on line, the last line of stderr names the problem:- an unreachable Postgres
- a Postgres role without DDL permission
- an unreachable or invalid OIDC discovery document
- a config schema violation with the offending field path
claude gateway --config gateway.yaml. Set public_url to the ingress origin and bind listen to a loopback or cluster-internal address.5
Verify the auth surface
Three checks confirm the gateway can authenticate a real user before you hand it to a developer.The examples use the gateway’s public URL; for the local Compose setup without an ingress, substitute The response includes additional fields, such as Third, test the browser leg by opening
http://localhost:8080 in the first two checks. The third check opens verification_uri_complete, which is built from public_url, so for local Compose set public_url: http://localhost:8080 in gateway.yaml, and add http://localhost:8080/oauth/callback as a second redirect URI on the OAuth client from step 1, because the gateway builds the IdP redirect_uri from public_url. The verification link then opens in your local browser.In Windows PowerShell, run curl.exe; the bare curl is an alias for Invoke-WebRequest and rejects these flags.First, fetch the discovery document, which confirms the gateway is up, the config is valid, and all boot checks passed:curl -s https://claude-gateway.internal.example.com/.well-known/oauth-authorization-server | jq
{
"issuer": "https://claude-gateway.internal.example.com",
"device_authorization_endpoint": "…/oauth/device_authorization",
"token_endpoint": "…/oauth/token",
"grant_types_supported": ["urn:ietf:params:oauth:grant-type:device_code", "refresh_token"]
}
response_types_supported and scopes_supported.Second, request a device authorization, which confirms the device sign-in flow works and Postgres is reachable and writable:curl -s -X POST https://claude-gateway.internal.example.com/oauth/device_authorization | jq
{
"device_code": "…",
"user_code": "WDJB-MJHT",
"verification_uri": "https://claude-gateway.internal.example.com/device",
"verification_uri_complete": "https://claude-gateway.internal.example.com/device?user_code=WDJB-MJHT",
"expires_in": 600,
"interval": 5
}
verification_uri_complete in a browser and confirming the code. You should be redirected to your IdP’s sign-in page, and after signing in, land back on the gateway with a signed-in confirmation.Use the first failing check to locate the problem:- First check fails: boot didn’t complete; check stderr
- Second check fails: Postgres isn’t reachable from the gateway or the role can’t write; check the connection string and grants
- Third check doesn’t reach the IdP: check that the IdP’s redirect URI matches
https://<gateway>/oauth/callbackexactly - Third check reaches the IdP but bounces back with an error: read the gateway’s audit log, which records every auth rejection with the reason, such as
email domain not allowed
6
Log a developer in
This last step happens on a developer machine, not the server. Push the two managed settings keys shown in Set the gateway URL below to that machine, then run
/login, press Enter on the Cloud gateway screen, and complete the browser sign-in.Connect developers
Developers connect from their own laptops with one browser sign-in, using their corporate work account. They don’t need a claude.ai account, an API key, or a subscription, because requests to the model go through the gateway using the organization’s upstream credential. Connection is driven by the client-side managed settings you push via MDM, so there is no manual setup on the developer side; this section covers what the admin configures. The CLI fingerprints the gateway’s TLS leaf certificate on first connect and pins it per hostname. Publish the expected SHA-256 fingerprint alongside the gateway URL so developers have something to compare against. Get the fingerprint from the certificate file withopenssl x509 -noout -fingerprint -sha256 -in cert.pem; the /login prompt shows the first 16 characters of the digest as lowercase hexadecimal with no separators. When the certificate rotates, every developer sees the trust prompt again, so treat rotations as a planned event and republish the fingerprint.
Once signed in, the model picker shows the models in the developer’s availableModels allowlist, managed settings apply at startup and refresh hourly, and telemetry routes to your collector. Sessions refresh silently before ttl_hours expiry, and a failed refresh after IdP deprovisioning prompts a re-login.
Set the gateway URL
Set both keys in the per-OS managed settings file you deploy via MDM or directly on disk, and/login opens directly on the Cloud gateway screen with the URL filled in:
{
"forceLoginMethod": "gateway",
"forceLoginGatewayUrl": "https://claude-gateway.internal.example.com"
}
forceLoginGatewayUrl is ignored in a developer’s own settings files. forceLoginMethod alone, without a URL, leaves the developer at a “Contact your IT administrator” message. Both keys belong in the file you push to machines, not in the gateway’s managed.policies[].cli block, which only reaches clients that are already connected.
CI pipelines and remote machines
There is no service-token flow for unattended pipelines. Gateway sign-in always runs the browser device flow, so a CI job with no developer to approve the sign-in can’t authenticate; configure those against your provider directly. Once a developer has signed in, every Claude Code invocation on that machine uses the gateway session, including non-interactiveclaude -p runs and sessions started by the Agent SDK, and the gateway policy applies to all of them.
The device flow separates the polling CLI from the approving browser, so a remote development box with no display still works: the developer runs /login over SSH on the remote machine and opens the verification link in the browser on their laptop.
What’s enforced on developers
These guarantees apply to every signed-in gateway session.- Model access: requests for models the policy doesn’t grant return 400, and the
/modelpicker is filtered to the policy’savailableModelsallowlist. SetenforceAvailableModels: truein the policy so the Default option resolves to a model insideavailableModelsinstead of to Claude Code’s built-in default; without it, Default stays selectable and is rejected at request time if that model isn’t granted. - Telemetry destination: when telemetry forwarding is configured, the OTLP export endpoint is pinned to the gateway, and the gateway-pushed configuration overrides locally set
OTEL_*variables. - Credentials: the gateway token is the session’s only credential.
ANTHROPIC_AUTH_TOKEN,ANTHROPIC_API_KEY,apiKeyHelper, and any earlier claude.ai login are ignored while signed in, so developers don’t need to log out of claude.ai first. - Managed settings: locked keys can’t be overridden locally. The CLI applies the policy at startup and on each hourly poll.
- Startup: signed-in sessions exit at startup with an error after about 10 seconds when the gateway is unreachable, rather than starting without their settings.
- Deprovisioning: a session whose user is disabled in the IdP expires within
ttl_hourswhen the next refresh fails.
What the organization can see
Usage telemetry carries the developer’s identity, token counts, model, and latency to the organization’s collector. The gateway doesn’t log or store prompt or completion content. Whether richer telemetry such as logs and traces is collected, which can include commands and file paths, is the organization’s per-destination choice.Availability and limitations
The table covers which Claude Code features work when developers connect through the gateway, and what the gateway server itself supports. Where something isn’t supported, the Notes column gives the alternative. The gateway delivers theanthropic-beta values the CLI sends to every upstream, so operators don’t maintain a beta allowlist. For Bedrock, which ignores the header, the gateway moves the values into the request body’s anthropic_beta field; the other upstreams receive the header as sent. The CLI’s gateway-session beta set omits first-party-only betas and the extended-cache-ttl beta, which is why those rows below show as not available.
| Feature | Status | Notes |
|---|---|---|
| Inference forwarding (Bedrock, Agent Platform, Foundry, Anthropic) | Available | With per-upstream model translation and failover. The Bedrock upstream uses the bedrock-runtime endpoint and the AWS default credential chain; the Bedrock Mantle endpoint is not a supported upstream. |
| Model access and managed settings by IdP group | Available | Model access is enforced server-side; managed settings are delivered per IdP group and applied by the CLI at the managed settings tier |
| Telemetry fan-out (OTLP/HTTP) | Available | Identity-stamped per export; both protobuf and JSON encodings |
| OIDC: Okta, Entra, Google, Keycloak, Dex | Available | Tested. Other OIDC-compliant IdPs should work |
| Per-user and per-group spend limits | Available | See Spend limits |
| Server-side web search | Not available | The CLI can’t see which upstream provider the gateway routes to, so it can’t verify web search support and disables WebSearch on gateway sessions |
| Standard prompt caching | Available | cache_control breakpoints are forwarded to every upstream |
| 1-hour cache TTL | Not available | The CLI omits the extended-cache-ttl beta on gateway sessions, because not every upstream the gateway can route to supports the 1-hour TTL, so prompt caching through the gateway uses the 5-minute TTL; see the beta-header note above |
| Auto mode | Available with opt-in | Follows the third-party provider rules: set CLAUDE_CODE_ENABLE_AUTO_MODE=1, deliverable through the managed policy env block, and only the Opus models eligible on third-party providers can use it |
| First-party-only optimizations such as global cache scope and token-efficient tools | Not available | The CLI doesn’t enable them on gateway sessions; see the beta-header note above |
| OTLP/gRPC | Not supported | OTLP over HTTP only |
| SAML, LDAP, and other non-OIDC auth | Not supported | OIDC only. Front with an OIDC bridge if needed |
| Multi-tenant (multiple OIDC issuers) | Not supported | One issuer per gateway. Run separate instances |
| Windows server | Not supported | Deploy on Linux. macOS for local development only |
| Helm chart | Not available | The gateway runs as a standard stateless Deployment; see the deployment guide |
| Admin UI | Not available | Configuration is the YAML file; redeploy to change it |
Next steps
The quickstart leaves you with a minimal config running under Docker Compose. To take it further:- Expand
gateway.yamlbeyond the minimal config, for example to add per-group RBAC, multi-upstream failover, or telemetry destinations. The configuration reference covers every option. - Move from Compose to a production deployment on Kubernetes or Cloud Run, set up your IdP properly, and review the security model. The deployment and operations guide covers per-IdP setup, container image requirements, health probes, and troubleshooting.
- Put spend caps on individual developers or groups so a runaway workload can’t consume your whole commitment. Spend limits covers the admin API and how enforcement works.
- For a complete worked example on Google Cloud, with Cloud Run, Cloud SQL, and Secret Manager, see Deploy on Google Cloud.
