This page walks through one way to run Claude apps gateway on Google Cloud. The configuration is a working example for customer-managed infrastructure rather than a supported production deployment; use it to see how the pieces fit together before adapting it to your own environment. For the platform-agnostic requirements, see the deployment guide.
For a different identity provider, only the `oidc` block changes. See Identity provider setup for per-IdP details.
What you’ll build
The reference configuration provisions:

- Cloud Run service or GKE Deployment running the gateway container
- Artifact Registry repository for the gateway image
- Cloud SQL for PostgreSQL instance, private IP only, for the gateway’s store
- Secret Manager secrets for `gateway.yaml`, the JWT signing key, the OIDC client secret, and the Postgres URL
- Service account with `roles/aiplatform.user`, attached directly on Cloud Run or bound via Workload Identity on GKE
- Internal Application Load Balancer on Cloud Run, or an internal GKE Ingress of class `gce-internal` on GKE, for HTTPS
Prerequisites
- A GCP project with billing enabled, and permission to create the resources above
- The `gcloud` CLI, authenticated with `gcloud auth login`, and Docker installed locally
- For the GKE track: `kubectl`, and a GKE cluster on the VPC created in the walkthrough below
- Access to the Claude models you need in Model Garden, in a region that publishes them
- A Google Workspace OAuth 2.0 web-application client with redirect URI `https://<gateway-host>/oauth/callback`; see Identity provider setup
- A TLS hostname for the gateway, typically an internal DNS name pointing at the load balancer
export PROJECT_ID=<your-project>
export REGION=us-east5 # a region where the Claude models you need are published in Model Garden
gcloud config set project "$PROJECT_ID"
Deploy the gateway
The steps below provision the full deployment with `gcloud` commands.
1
Enable APIs
Enable the service APIs the walkthrough uses:

gcloud services enable \
  aiplatform.googleapis.com \
  artifactregistry.googleapis.com \
  sqladmin.googleapis.com \
  secretmanager.googleapis.com \
  iamcredentials.googleapis.com \
  iam.googleapis.com \
  compute.googleapis.com \
  servicenetworking.googleapis.com \
  run.googleapis.com \
  container.googleapis.com

The APIs you need depend on the deployment path:

- `compute` and `servicenetworking`: needed for the private-IP Cloud SQL path
- `run`: Cloud Run only
- `container`: GKE only
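Before moving on, it can help to confirm that the APIs for your track are actually enabled. The sketch below is not part of the reference assets: `missing_apis` is a hypothetical helper name, and it assumes `gcloud services list --enabled --format='value(config.name)'` prints one service name per line for the active project.

```shell
# Hypothetical helper: print each required API that is not yet enabled.
# Pass only the APIs for your track (drop run or container as appropriate).
missing_apis() {
  enabled="$(gcloud services list --enabled --format='value(config.name)')" || return 1
  for api in "$@"; do
    # -x matches the whole line; -F treats the dots in the name literally
    printf '%s\n' "$enabled" | grep -qxF "$api" || printf '%s\n' "$api"
  done
}
```

For example, `missing_apis aiplatform.googleapis.com sqladmin.googleapis.com run.googleapis.com` prints nothing when all three are enabled, and otherwise lists the ones still missing.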
2
Create the service account and grant IAM
The gateway runs as a dedicated service account with permission to call Agent Platform. It reaches Cloud SQL over the VPC with a password user, so no Cloud SQL IAM role is required:

gcloud iam service-accounts create claude-gateway --display-name="Claude apps gateway"
SA="claude-gateway@${PROJECT_ID}.iam.gserviceaccount.com"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA}" --role="roles/aiplatform.user" --condition=None

Then enable the Claude models for the project in Model Garden; models publish to specific regions, so check each model card.
3
Build and push the image to Artifact Registry
Build the image per the container image requirements, using the `linux-x64` glibc binary, and push it:

gcloud artifacts repositories create claude-gateway \
--repository-format=docker --location="$REGION"
gcloud auth configure-docker "${REGION}-docker.pkg.dev" --quiet
# Cloud Run requires linux/amd64. --provenance=false avoids a buildx OCI
# image index that Cloud Run rejects.
docker build --platform=linux/amd64 --provenance=false \
-t "${REGION}-docker.pkg.dev/${PROJECT_ID}/claude-gateway/gateway:<version>" .
docker push "${REGION}-docker.pkg.dev/${PROJECT_ID}/claude-gateway/gateway:<version>"
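A quick local check before pushing can catch an image built for the wrong architecture. This is a minimal sketch with a hypothetical helper name, `image_is_linux_amd64`; it relies on `docker image inspect` with a Go-template `--format`. It only confirms the OS and architecture of the local image and does not detect the OCI image index case, which `--provenance=false` addresses.

```shell
# Hypothetical pre-push check: succeed only if the local image is linux/amd64.
image_is_linux_amd64() {
  platform="$(docker image inspect --format '{{.Os}}/{{.Architecture}}' "$1")" || return 1
  [ "$platform" = "linux/amd64" ]
}
```

Call it with the same tag you pass to `docker push`, for example `image_is_linux_amd64 "${REGION}-docker.pkg.dev/${PROJECT_ID}/claude-gateway/gateway:<version>" || echo "rebuild with --platform=linux/amd64"`.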
4
Provision Cloud SQL for PostgreSQL
Create the instance on a VPC via Private Services Access so it has no public IP; this also satisfies projects where `constraints/sql.restrictPublicIp` is enforced. The Cloud Run or GKE runtime must be on, or routed into, this VPC.

VPC=cc-gateway-vpc
gcloud compute networks create "$VPC" --subnet-mode=custom
gcloud compute networks subnets create cc-gateway-subnet \
--network="$VPC" --region="$REGION" --range=10.0.0.0/24
# Private Services Access: one-time per VPC
gcloud compute addresses create "google-managed-services-${VPC}" \
--global --purpose=VPC_PEERING --prefix-length=16 --network="$VPC"
gcloud services vpc-peerings connect \
--service=servicenetworking.googleapis.com \
--ranges="google-managed-services-${VPC}" --network="$VPC"
gcloud sql instances create claude-gateway-db \
--database-version=POSTGRES_16 --tier=db-g1-small --region="$REGION" \
--network="projects/${PROJECT_ID}/global/networks/${VPC}" --no-assign-ip
gcloud sql databases create claude_gateway --instance=claude-gateway-db
PGPASS="$(openssl rand -hex 24)"
gcloud sql users create gateway --instance=claude-gateway-db --password="$PGPASS"
PRIVATE_IP="$(gcloud sql instances describe claude-gateway-db \
--format='value(ipAddresses[0].ipAddress)')"
GATEWAY_POSTGRES_URL="postgres://gateway:${PGPASS}@${PRIVATE_IP}:5432/claude_gateway?sslmode=require"
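If the instance is still provisioning or the describe call returns nothing, `PRIVATE_IP` is empty and the URL above is silently malformed. The sketch below guards against that; `build_postgres_url` is a hypothetical helper, not part of the reference assets. It assumes the `gateway` user and `claude_gateway` database created above, and a password with no URL-reserved characters, which holds for the hex output of `openssl rand -hex 24`.

```shell
# Hypothetical helper: assemble the store URL, refusing empty inputs.
build_postgres_url() {
  # $1 = password, $2 = Cloud SQL private IP
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "build_postgres_url: password and private IP are required" >&2
    return 1
  fi
  printf 'postgres://gateway:%s@%s:5432/claude_gateway?sslmode=require\n' "$1" "$2"
}
```

Usage: `GATEWAY_POSTGRES_URL="$(build_postgres_url "$PGPASS" "$PRIVATE_IP")" || exit 1`. If you generate the password another way, percent-encode any reserved characters before building the URL.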
5
Write gateway.yaml
The `upstreams` block points at Agent Platform with `auth: {}`, so the gateway authenticates via Application Default Credentials from the runtime service account. See the configuration reference for every field.

Two `listen` fields depend on what fronts the gateway:

- `public_url`: required behind Cloud Run or a GKE Ingress. The gateway builds the IdP `redirect_uri` and its discovery document only from this value, never from `X-Forwarded-*` headers.
- `trusted_proxies`: the front end’s source ranges. The gateway honors `X-Forwarded-For` only when the TCP peer is in this list, then walks the chain past trusted hops, so per-IP sign-in rate limits and audit events record developer IPs instead of the load balancer’s.

The example below uses the internal-load-balancer-in-front-of-Cloud-Run values. Set `trusted_proxies` to match your front end. An external GKE Ingress of class `gce` isn’t listed: it provisions a public forwarding-rule address, which the `/login` private-network check rejects.

| Front end | `trusted_proxies` |
|---|---|
| Cloud Run reached directly, no load balancer | `[169.254.0.0/16]` |
| Internal Application Load Balancer in front of Cloud Run | `169.254.0.0/16` plus your proxy-only subnet’s CIDR |
| GKE internal Ingress, class `gce-internal` | Your proxy-only subnet’s CIDR |
gateway.yaml
listen:
  host: 0.0.0.0
  port: 8080
  public_url: https://claude-gateway.internal.example.com
  trusted_proxies: [169.254.0.0/16, <your-proxy-only-subnet-cidr>]
oidc:
  issuer: https://accounts.google.com
  client_id: <your-oauth-client-id>
  client_secret: ${OIDC_CLIENT_SECRET} # GKE: ${file:/secrets/oidc-client-secret}
  allowed_email_domains: [example.com]
  # Google ignores offline_access; these yield refresh tokens:
  scopes: [openid, profile, email]
  extra_auth_params: { access_type: offline, prompt: consent }
session:
  jwt_secret: ${GATEWAY_JWT_SECRET} # GKE: ${file:/secrets/jwt-secret}
store:
  postgres_url: ${GATEWAY_POSTGRES_URL} # GKE: ${file:/secrets/postgres-url}
upstreams:
  - provider: vertex
    region: <your-region> # must match $REGION
    project_id: <your-project>
    auth: {} # ADC via the runtime service account
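The example contains several `<...>` placeholders that are easy to miss before the file is stored as a secret. A minimal pre-flight sketch, using a hypothetical helper name `has_unfilled_placeholders`, lists any that remain. It assumes placeholders are lowercase, hyphenated tokens in angle brackets, as in the example; `${VAR}` references are deliberately not matched.

```shell
# Hypothetical pre-flight check: print (with line numbers) any <placeholder>
# still present in the file. Exit status 0 means at least one was found.
has_unfilled_placeholders() {
  grep -nE '<[a-z][a-z0-9-]*>' "$1"
}
```

For example, `if has_unfilled_placeholders gateway.yaml; then echo "fill these in first" >&2; fi` stops a half-edited file from being uploaded.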
Google id_tokens carry no `groups` claim. To use group-based policies in `managed.policies` with Google Workspace as the IdP, configure `oidc.google_groups`, which looks up each user’s groups through the Admin SDK Directory API using a service account with domain-wide delegation. Without it, match on `email_domain` instead.
6
Store secrets in Secret Manager
Create four secrets and grant `roles/secretmanager.secretAccessor` to the `claude-gateway` service account:

| Secret | Source |
|---|---|
| `gateway-jwt-secret` | `openssl rand -base64 32` |
| `gateway-oidc-client-secret` | Google Cloud Console → OAuth client |
| `gateway-postgres-url` | `$GATEWAY_POSTGRES_URL` from the Cloud SQL step |
| `gateway-config` | the full `gateway.yaml` from the previous step |

How the secrets reach the container differs by track:

- On GKE they mount as files via the Secret Manager CSI driver, and `gateway.yaml` references `${file:/secrets/...}`.
- On Cloud Run, which can’t mount multiple secrets into one directory, `gateway.yaml` mounts as a file and the other three inject as environment variables, so `gateway.yaml` references `${GATEWAY_JWT_SECRET}`, `${OIDC_CLIENT_SECRET}`, and `${GATEWAY_POSTGRES_URL}` instead.
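One way to create each secret and grant access in a single step is sketched below. `create_gateway_secret` is a hypothetical helper, not part of the reference assets; it assumes `PROJECT_ID` is exported as in the prerequisites and reads the secret value from standard input so the value never appears in the argument list.

```shell
# Hypothetical helper: create one secret from stdin, then let the gateway
# service account read it. Assumes PROJECT_ID is already exported.
create_gateway_secret() {
  name="$1"
  sa="claude-gateway@${PROJECT_ID}.iam.gserviceaccount.com"
  gcloud secrets create "$name" --data-file=- || return 1
  gcloud secrets add-iam-policy-binding "$name" \
    --member="serviceAccount:${sa}" \
    --role="roles/secretmanager.secretAccessor"
}
```

For example, `openssl rand -base64 32 | tr -d '\n' | create_gateway_secret gateway-jwt-secret`, `printf '%s' "$GATEWAY_POSTGRES_URL" | create_gateway_secret gateway-postgres-url`, and `create_gateway_secret gateway-config < gateway.yaml`. Stripping the trailing newline is a precaution so that it does not end up inside an injected environment variable.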
7
Deploy
- Cloud Run
- GKE
The command below deploys for production behind an internal load balancer.

gcloud run deploy claude-gateway \
  --image="${REGION}-docker.pkg.dev/${PROJECT_ID}/claude-gateway/gateway:<version>" \
  --region="$REGION" \
  --service-account="claude-gateway@${PROJECT_ID}.iam.gserviceaccount.com" \
  --min-instances=1 \
  --timeout=3600 \
  --ingress=internal-and-cloud-load-balancing \
  --network="$VPC" --subnet=cc-gateway-subnet --vpc-egress=private-ranges-only \
  --set-secrets=/etc/claude/gateway.yaml=gateway-config:latest,GATEWAY_JWT_SECRET=gateway-jwt-secret:latest,OIDC_CLIENT_SECRET=gateway-oidc-client-secret:latest,GATEWAY_POSTGRES_URL=gateway-postgres-url:latest \
  --no-invoker-iam-check

Direct VPC egress, via `--network`, `--subnet`, and `--vpc-egress=private-ranges-only`, lets the service reach the Cloud SQL private IP directly. Public egress to the Agent Platform endpoints and `accounts.google.com` goes directly to the internet rather than through the VPC, so no Cloud NAT is needed.

The invoker IAM check must be open or disabled. The gateway runs its own OIDC and its clients carry no GCP token, so Cloud Run’s invoker check has to admit unauthenticated requests. The gateway’s OIDC sign-in authenticates the request once it reaches the container, with `allowed_email_domains` gating which domains may sign in.

Two flags admit unauthenticated requests:

- `--no-invoker-iam-check`: disables the check with no `allUsers` binding to manage, and works under Domain Restricted Sharing
- `--allow-unauthenticated`: grants `allUsers` the `run.invoker` role; use it if your organization doesn’t allow `--no-invoker-iam-check`

`--ingress` is a separate, independent layer from the invoker check; keep it set to limit the service to your corporate network.

By default the Cloud Run `*.run.app` URL resolves to a public address, which the `/login` private-network check rejects. Two topologies give developers a privately resolvable hostname, and Cloud Run provisions neither for you:

- Internal Application Load Balancer, the topology the deploy command above assumes: deploy with `--ingress=internal-and-cloud-load-balancing`, provision an internal Application Load Balancer in front of the service with an internal DNS name and certificate, and set `listen.public_url` to that hostname.
- Internal-only ingress with no load balancer: deploy with `--ingress=internal` and leave `listen.public_url` as the `*.run.app` URL, the default in the reference assets below. For `*.run.app` to resolve privately, your network team must already operate a Private Service Connect endpoint for Google APIs, a Cloud DNS private zone resolving `*.run.app` to it, and on-premises routing to that endpoint.

Register `<public_url>/oauth/callback` as a redirect URI on the OAuth client before the first sign-in. Redeploy after changing `public_url`, because the gateway builds its public origin only from that setting and ignores `X-Forwarded-Host` and `X-Forwarded-Proto`. `X-Forwarded-For` is honored for client IPs only when `listen.trusted_proxies` is set.
8
Push the gateway URL to developer machines
The gateway is now running, but developers can’t reach it from `/login` until the gateway URL is on their machines. Set `forceLoginMethod` and `forceLoginGatewayUrl` in the managed settings file you deploy to each device via MDM. There is no gateway option in the login picker for a developer to select manually.
Terraform reference
The reference deployment assets automate the Cloud Run track on this page; the config and image assets apply to both tracks:

- `setup.sh`: an idempotent `gcloud` provisioner that walks the full Cloud Run path, from enabling APIs through the first deploy
- `terraform/`: the same deployment as infrastructure-as-code. For a greenfield deploy, run a targeted apply to create the Artifact Registry repo, then build and push the image, then run a full apply
- `gateway.yaml.example` and a `Dockerfile` for the distroless runtime image

The artifacts default ingress to `internal`, so no load balancer is required. To match this page’s production-behind-an-ALB deployment, run `setup.sh` with `INGRESS=internal-and-cloud-load-balancing`, or set the Terraform variable `ingress` to `INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER`. The artifacts also default the invoker layer to an `allUsers` `run.invoker` grant rather than `--no-invoker-iam-check`, the inverse of this page’s walkthrough; either works, and the choice depends on your organization’s policy constraints.
The assets are provided as working examples, not as a supported production artifact; review and adapt them to your environment.
Troubleshooting
For gateway boot and login errors, see the platform-agnostic troubleshooting table. The entries below are specific to Google Cloud.

| Symptom | Cause | Fix |
|---|---|---|
| Cloud Run returns `403 Forbidden` before reaching the container | The invoker IAM check is still enabled | Deploy with `--no-invoker-iam-check`, or grant `allUsers` the `run.invoker` role with `--allow-unauthenticated` |
| `--no-invoker-iam-check` rejected with `invoker_iam_disabled is not currently available` | Blocked by `constraints/run.managed.requireInvokerIam` | Use `--allow-unauthenticated`. If Domain Restricted Sharing via `constraints/iam.allowedPolicyMemberDomains` blocks that too, use the GKE track, which exposes the gateway at the network layer with no `allUsers` binding. |
| `Container manifest type … must support amd64/linux` at deploy | Image was built on a non-amd64 host, or buildx emitted an OCI image index | Build with `--platform=linux/amd64 --provenance=false` |
| Gateway boot exits with a Postgres connection-timeout error on Cloud Run | Service isn’t attached to the VPC, or Cloud SQL has no private IP on that VPC; the store stops waiting after 5 seconds | Deploy with `--network` and `--subnet` for Direct VPC egress, and create the Cloud SQL instance with `--no-assign-ip` and `--network` pointing at the same VPC |
| Agent Platform requests return `403 PERMISSION_DENIED` | Runtime isn’t using the `claude-gateway` service account, or the model isn’t enabled in Model Garden for the project | Set `--service-account` on Cloud Run or bind Workload Identity on GKE, and enable each Claude model in Model Garden for the target region |
| Streaming responses cut off after a fixed duration | Front-end request timeout: the load balancer backend service behind GKE Ingress defaults to 30 seconds and Cloud Run to 300 seconds | Attach a `BackendConfig` with a raised `timeoutSec` on GKE, or deploy with `--timeout=3600` on Cloud Run |
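For the streaming-timeout row on GKE, the fix can be sketched as a small manifest. The resource name `claude-gateway-timeout` is hypothetical, and the 3600-second value is an assumption chosen to mirror the Cloud Run `--timeout=3600` above; pick a value that suits your longest expected stream.

```shell
# Write a BackendConfig manifest that raises the backend service timeout.
# The metadata.name and the 3600-second value are illustrative assumptions.
cat > backendconfig.yaml <<'EOF'
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: claude-gateway-timeout
spec:
  timeoutSec: 3600
EOF
```

Apply it with `kubectl apply -f backendconfig.yaml`, then reference it from the gateway’s Service with the annotation `cloud.google.com/backend-config: '{"default": "claude-gateway-timeout"}'` so the Ingress picks it up.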
Next steps
- Configuration reference: every `gateway.yaml` option, including `managed.policies` and `telemetry`
- Deployment and operations: IdP setup, health checks, JWT secret rotation, upgrades, and the security model
- Claude apps gateway overview: quickstart and connecting developers
