URL: https://www.morphllm.com/codex-provider-configuration

Codex config.toml (2026): Add Any Custom Provider in 6 Lines (OpenRouter, Azure, Ollama, DeepSeek)

Complete Codex config.toml reference: the 6-line model_providers block, why wire_api = "responses" is now the only option, file precedence, profiles, and copy-paste configs for OpenRouter, Azure, Ollama, DeepSeek, and Morph.

June 9, 2026 · 1 min read

Last updated June 2026

  • 6 lines for a custom provider
  • "responses": the only supported wire_api value
  • $0.28/M: DeepSeek V4 Flash output via API key

Drop-in ~/.codex/config.toml

model = "your-model-id"
model_provider = "custom_provider"

[model_providers.custom_provider]
name = "Custom Provider"
base_url = "https://api.example.com/v1"
env_key = "CUSTOM_API_KEY"
wire_api = "responses"
query_params = { version = "2024" }
http_headers = { "User-Agent" = "Codex" }

Where config.toml Lives and What Wins

Codex CLI loads configuration from up to four layers, listed here from highest to lowest precedence. When two layers set the same key, the one listed first wins:

  1. CLI flags: -c key=value and --model override everything for one invocation
  2. Profile file: $CODEX_HOME/<profile-name>.config.toml, activated with --profile <name>
  3. Project config: .codex/config.toml in the repo root, loaded only when the project is trusted
  4. User config: ~/.codex/config.toml (or $CODEX_HOME/config.toml if CODEX_HOME is set)
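
As a mental model only (this is not Codex's actual loader), the precedence list above behaves like "the first layer that sets a key wins". The hypothetical resolve_key helper below takes one candidate value per layer, highest precedence first, and prints the first non-empty one:

```shell
# Hypothetical helper, for illustration: prints the first non-empty argument.
# Pass one value per layer, highest precedence first:
#   CLI flag, profile file, project config, user config.
resolve_key() {
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1  # no layer set the key
}

# No CLI flag, the profile sets a model, the user config sets another:
resolve_key "" "deepseek-chat" "" "gpt-5.5"   # prints deepseek-chat
```

For provider-related keys the project slot is always empty in this model, because a repo-level config is not allowed to set them.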

Project config cannot change your provider

A repo-level .codex/config.toml cannot override provider, auth, notification, telemetry, or profile-selection keys. This is a security boundary: cloning a repo cannot silently redirect your prompts and API key to someone else's endpoint. Provider definitions belong in the user-level file.

Check which config Codex is reading

# Default location
cat ~/.codex/config.toml

# If you set CODEX_HOME, that directory wins
echo $CODEX_HOME

# Per-invocation override beats every file
codex -c model='"gpt-5.4"' "explain this module"
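
If you script around Codex, the lookup rule above fits in one shell expansion. This is a small sketch with a hypothetical function name; it assumes an empty CODEX_HOME should be treated the same as an unset one, which the source does not specify:

```shell
# Prints the user-level config path: $CODEX_HOME/config.toml when CODEX_HOME
# is set and non-empty, otherwise ~/.codex/config.toml.
# Assumption: an empty CODEX_HOME falls back to the default location.
codex_config_path() {
  printf '%s/config.toml\n' "${CODEX_HOME:-$HOME/.codex}"
}

codex_config_path
```

Feeding the result to cat or an editor avoids guessing which file is active when CODEX_HOME is set in only some of your shells.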

The 6-Line Custom Provider Block

Two keys at the top of config.toml select what runs: model picks the model ID, model_provider picks which provider definition to use. Definitions live under [model_providers.<id>]:

~/.codex/config.toml: complete custom provider

model = "your-model-id"
model_provider = "custom_provider"

[model_providers.custom_provider]
name = "Custom Provider"
base_url = "https://api.example.com/v1"
env_key = "CUSTOM_API_KEY"
wire_api = "responses"
query_params = { version = "2024" }
http_headers = { "User-Agent" = "Codex" }

Only base_url is required for most setups. The full key schema, with defaults from the official configuration reference:

| Key | Type | Default | What it does |
| --- | --- | --- | --- |
| name | string | (none) | Display name shown in the CLI |
| base_url | string | (required) | API endpoint URL |
| env_key | string | (none) | Env var holding the API key, sent as Bearer token |
| experimental_bearer_token | string | (none) | Inline API key in the TOML; discouraged, prefer env_key |
| requires_openai_auth | bool | false | Use ChatGPT sign-in; only valid for OpenAI's own provider |
| wire_api | string | "responses" | Protocol; "responses" is the only supported value |
| query_params | map | (none) | Query params appended to requests (Azure api-version) |
| http_headers | map | (none) | Static headers added to every request |
| env_http_headers | map | (none) | Headers populated from env vars at runtime |
| request_max_retries | number | 4 | HTTP retry count |
| stream_max_retries | number | 5 | SSE stream retry count |
| stream_idle_timeout_ms | number | 300000 | SSE idle timeout (5 minutes) |

wire_api: Only "responses" Works Now

This is the single biggest source of broken Codex provider configs in 2026. Guides written in 2025 show wire_api = "chat" for OpenRouter, Ollama, and every other third-party endpoint. The current configuration reference is explicit: responses is the only supported value, and it is the default when omitted.

Practical consequences:

  • You can delete wire_api from your provider blocks entirely; the default is correct.
  • Any endpoint you point Codex at must speak the OpenAI Responses API, not just Chat Completions.
  • For providers that only expose Chat Completions, put a translating gateway (LiteLLM, or a router like OpenRouter that exposes a Responses-compatible surface) between Codex and the provider.

Symptom of a wire_api mismatch

If the endpoint behind base_url only implements Chat Completions, Codex's Responses-API requests hit a route that does not exist. You see 404s, "unknown endpoint" errors, or empty streams immediately after the session starts, even though a curl to /v1/chat/completions on the same host works fine. The host is up; the protocol is wrong.
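
One way to check for this mismatch before starting a session is to probe the responses route directly. The sketch below is a heuristic, not an official diagnostic: probe_responses and diagnose_status are hypothetical helper names, the empty JSON body is deliberately invalid, and the status-code interpretations are assumptions that a particular gateway may not follow (some servers check auth before routing, so a 401 says nothing about the route until the key is fixed).

```shell
# Hypothetical helper: maps the HTTP status of POST <base_url>/responses
# to a likely diagnosis. Heuristic only; gateways may behave differently.
diagnose_status() {
  case "$1" in
    2??)     echo "responses route answered" ;;
    400|422) echo "route exists, request body rejected" ;;
    401|403) echo "auth rejected; fix the key, then probe again" ;;
    404|405) echo "no responses route (likely Chat Completions only)" ;;
    *)       echo "inconclusive (HTTP $1)" ;;
  esac
}

# Sends a deliberately empty JSON body and prints only the status code.
# Args: base_url api_key
probe_responses() {
  base_url="$1"
  api_key="$2"
  curl -s -o /dev/null -w '%{http_code}' \
    -X POST "${base_url%/}/responses" \
    -H "Authorization: Bearer $api_key" \
    -H "Content-Type: application/json" \
    -d '{}'
}

# Usage (hits the network):
#   diagnose_status "$(probe_responses "https://api.example.com/v1" "$CUSTOM_API_KEY")"
```

A 400 or 422 here is good news: the route exists and only the placeholder body was refused.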

Copy-Paste Provider Configs

Each block is a complete ~/.codex/config.toml. wire_api is omitted because "responses" is the default. The Responses-compatibility requirement from the previous section applies to every endpoint here.

OpenRouter

One API key, hundreds of models, unified billing.

OpenRouter provider

model = "anthropic/claude-sonnet-4.6"
model_provider = "openrouter"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"

Azure OpenAI

Azure is the canonical use case for query_params: it versions its API through an api-version query parameter instead of the URL path.

Azure OpenAI provider

model = "gpt-5.3-codex"
model_provider = "azure"

[model_providers.azure]
name = "Azure OpenAI"
base_url = "https://YOUR_RESOURCE.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY"
# Match the api-version your Azure resource is deployed with
query_params = { api-version = "preview" }

Ollama (local, http://localhost:11434/v1)

Zero API cost. Pull the model first (ollama pull qwen3.5-coder), then point base_url at the local server. No env_key needed for localhost.

Ollama provider

model = "qwen3.5-coder"
model_provider = "ollama"

[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"

If your Ollama version serves only Chat Completions, Codex's Responses-API calls will 404 against it; upgrade Ollama or front it with a translating proxy. Codex also needs tool calling, so small local models that lack tool support will connect but never edit a file.

LM Studio (local)

LM Studio provider

model = "qwen3.5-27b"
model_provider = "lmstudio"

[model_providers.lmstudio]
name = "LM Studio"
base_url = "http://localhost:1234/v1"

DeepSeek

The price outlier: DeepSeek V4 Flash costs $0.14/M input (cache miss), $0.0028/M on cache hits, and $0.28/M output, with a 1M-token context window. That is 50x cheaper output than gpt-5.3-codex's $14/M. The legacy deepseek-chat ID currently maps to V4 Flash non-thinking mode and retires after July 24, 2026; use the explicit V4 IDs going forward.

DeepSeek provider

model = "deepseek-chat" # maps to deepseek-v4-flash; legacy ID retires after 2026-07-24
model_provider = "deepseek"

[model_providers.deepseek]
name = "DeepSeek"
base_url = "https://api.deepseek.com"
env_key = "DEEPSEEK_API_KEY"

DeepSeek's API documents OpenAI Chat Completions and Anthropic formats. If its endpoint rejects Responses-API calls from your Codex build, route through a gateway that translates.

Morph

Two ways to use Morph in Codex: as a cheap chat provider behind a gateway, or as a zero-latency edit_file MCP tool.

As a chat provider. Morph serves an OpenAI-compatible Chat Completions API at https://api.morphllm.com/v1 (env MORPH_API_KEY), with morph-dsv4flash at a 1M-token context window (alternates: morph-qwen35-397b at 262k, morph-minimax27-230b, morph-qwen36-27b). Morph exposes Chat Completions only and has no /v1/responses endpoint, so the raw block below does not work natively because Codex speaks the Responses API. Like DeepSeek and Ollama above, point base_url at a Responses-translating gateway (LiteLLM or an OpenRouter-style router) that fronts Morph:

Morph as a chat provider (needs a Responses gateway)

model = "morph-dsv4flash" # 1M context; via a Responses-translating gateway
model_provider = "morph"

[model_providers.morph]
name = "Morph"
base_url = "https://api.morphllm.com/v1" # Chat Completions only; front with a gateway
env_key = "MORPH_API_KEY"

As an MCP tool. Morph's apply models are specialized code-edit appliers. The @morphllm/morphmcp server gives Codex an edit_file tool backed by morph-v3-fast at 10,500+ tok/s, which offloads bulk edits from your main model's output tokens. This path needs no gateway. MCP servers are configured in the same config.toml:

Morph MCP server in config.toml

[mcp_servers.morph]
command = "npx"
args = ["-y", "@morphllm/morphmcp"]
env = { "MORPH_API_KEY" = "your-morph-api-key" }

See Morph pricing for apply-model rates.

Authentication: env_key and the auth Table

Three main auth paths exist (plus a discouraged inline-token fallback), and they do not stack:

  • ChatGPT sign-in: the default for OpenAI's own provider. Usage draws from your plan's 5-hour-window message limits (Plus: 15-80 GPT-5.5 messages, 20-100 GPT-5.4). Custom providers cannot use this.
  • env_key: the standard path for custom providers. Codex reads the named environment variable at runtime and sends its value as a Bearer token. With env_key, the key itself stays out of the TOML file (a separate experimental_bearer_token field accepts an inline key but is discouraged).
  • [model_providers.<id>.auth]: command-backed token generation for short-lived credentials. Codex runs command with args and uses the output as the token, refreshing on refresh_interval_ms. The reference forbids combining auth with env_key, experimental_bearer_token, or requires_openai_auth.

| Auth method | TOML key(s) | Use when | Cannot combine with |
| --- | --- | --- | --- |
| ChatGPT sign-in | requires_openai_auth = true | OpenAI's own provider, drawing on plan credits | [auth] table |
| Env var (recommended) | env_key | Custom providers; key stays out of the file | [auth] table |
| Inline token (discouraged) | experimental_bearer_token | You cannot set an env var | [auth] table |
| Command-backed | [auth] table (command, args) | Short-lived or rotating tokens | env_key, experimental_bearer_token, requires_openai_auth |

experimental_bearer_token puts the API key directly in config.toml. Use it only when you cannot set an environment variable; env_key is the recommended path because the key stays out of the file:

Inline token with experimental_bearer_token

model = "your-model-id"
model_provider = "custom_provider"

[model_providers.custom_provider]
name = "Custom Provider"
base_url = "https://api.example.com/v1"
experimental_bearer_token = "sk-..."

Command-backed auth for rotating tokens

[model_providers.internal]
name = "Internal Gateway"
base_url = "https://llm-gateway.internal.example.com/v1"

[model_providers.internal.auth]
command = "get-llm-token"
args = ["--scope", "codex"]
timeout_ms = 5000
refresh_interval_ms = 300000

Wire up the API key for env_key providers

# Set and persist the key the provider block names in env_key
export OPENROUTER_API_KEY="sk-or-v1-..."
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc

# Verify Codex can see it: printenv only shows exported variables
# (must print the key; no output means it is unset or not exported)
printenv OPENROUTER_API_KEY

Profiles: One File per Provider

Profiles are per-session config overlays. Each profile is its own file at $CODEX_HOME/<profile-name>.config.toml, next to the main config.toml, and is activated with --profile <name>. Whatever keys the profile sets override the main config for that session; everything else falls through.

~/.codex/deepseek.config.toml

model = "deepseek-chat"
model_provider = "deepseek"

[model_providers.deepseek]
name = "DeepSeek"
base_url = "https://api.deepseek.com"
env_key = "DEEPSEEK_API_KEY"

~/.codex/local.config.toml

model = "qwen3.5-coder"
model_provider = "ollama"

[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"

Switch per session

# Default config.toml (OpenAI)
codex "fix the failing tests"

# Cheap bulk work via DeepSeek
codex --profile deepseek "rename userId to user_id across the repo"

# Offline / airgapped
codex --profile local "write unit tests for utils.ts"

# One-off override without any profile
codex -c model_provider='"openrouter"' -c model='"anthropic/claude-sonnet-4.6"' "review this PR"
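
Because each profile is a file named <profile-name>.config.toml in the Codex home directory, the available profile names can be listed with a glob. This is a sketch with a hypothetical function name; it assumes an empty CODEX_HOME falls back to ~/.codex:

```shell
# Hypothetical helper: prints one profile name per line by stripping the
# .config.toml suffix from files in the Codex home directory.
# The main config.toml does not match the glob, so it is not listed.
list_profiles() {
  dir="${CODEX_HOME:-$HOME/.codex}"
  for file in "$dir"/*.config.toml; do
    [ -e "$file" ] || continue   # glob matched nothing
    basename "$file" .config.toml
  done
}

# Usage:
#   list_profiles
#   codex --profile "$(list_profiles | head -n 1)" "fix the failing tests"
```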

Reasoning, Approval, and Sandbox Keys

The same config.toml controls model behavior and execution safety. The three most-searched key groups:

| Key | Allowed values | Notes |
| --- | --- | --- |
| model | any model ID string | e.g. model = "gpt-5.5" |
| model_reasoning_effort | minimal, low, medium, high, xhigh | Reasoning depth for reasoning-capable models |
| model_reasoning_summary | auto, concise, detailed, none | How reasoning is summarized in output |
| approval_policy | untrusted, on-request, never | When Codex asks before running commands |
| sandbox_mode | read-only, workspace-write, danger-full-access | Filesystem/network blast radius |

A complete daily-driver config.toml

model = "gpt-5.5"
model_provider = "openai"
model_reasoning_effort = "high"
approval_policy = "on-request"
sandbox_mode = "workspace-write"

approval_policy also accepts a granular table with boolean flags (sandbox_approval, rules, mcp_elicitations, request_permissions, skill_approval) when the three presets are too coarse.

Custom Provider vs ChatGPT Plan: The Cost Math

Codex has two billing modes. ChatGPT sign-in draws from plan credits with 5-hour-window limits. An API key (OpenAI's own or any custom provider) bills per token with no message caps. Which is cheaper depends entirely on volume:

| Route | Cost | Limits |
| --- | --- | --- |
| ChatGPT Free / Go | $0 / $8 per month | Lowest 5-hour-window message limits |
| ChatGPT Plus | $20/mo | 15-80 GPT-5.5 msgs or 20-100 GPT-5.4 msgs per 5h window |
| ChatGPT Pro | from $100/mo | 5x or 20x the Plus limits |
| API key: gpt-5.3-codex | $1.75/M in, $14/M out | No message caps; pay per token |
| API key: gpt-5.4 | $2.50/M in, $15/M out | No message caps |
| Custom provider: DeepSeek V4 Flash | $0.14/M in, $0.28/M out | 1M context; 50x cheaper output than gpt-5.3-codex |
| Custom provider: MiniMax M3 | $0.30/M in, $1.20/M out | 1M context, 80.5% SWE-bench Verified |
| Custom provider: Ollama local | $0 | Your hardware; model must support tool calling |

On the plan side, OpenAI publishes exact credit burn rates per million tokens: GPT-5.5 costs 125 credits input / 750 output, GPT-5.4 costs 62.5/375, and GPT-5.3-Codex costs 43.75/350. A single GPT-5.5 message averages 5-45 credits. Heavy agentic sessions chew through a Plus window fast, which is why the custom-provider block exists: route bulk refactors to a $0.28/M-output model and save the plan credits for planning. Full plan-by-plan analysis in Codex pricing.
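
To make the per-token side concrete, the arithmetic can be run for an assumed workload. The token_cost helper is hypothetical, and the 2M-input / 500k-output session is an illustrative assumption, not a measured figure; the rates are the ones quoted in this section.

```shell
# Hypothetical helper: dollars for one workload at per-million-token rates.
# Args: input_tokens output_tokens usd_per_M_input usd_per_M_output
token_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_rate="$3" -v out_rate="$4" \
    'BEGIN { printf "%.2f\n", in_tok / 1000000 * in_rate + out_tok / 1000000 * out_rate }'
}

# Assumed workload: 2M input tokens, 500k output tokens.
token_cost 2000000 500000 1.75 14     # gpt-5.3-codex: prints 10.50
token_cost 2000000 500000 0.14 0.28   # DeepSeek V4 Flash: prints 0.42
```

On this input-heavy mix the gap is about 25x rather than the 50x seen on output price alone, because the input rates differ by only 12.5x.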

Troubleshooting Exact Errors

401 Unauthorized from a custom provider

Codex sends the value of the env_key variable as a Bearer token. A 401 means that variable is empty in the shell that launched Codex, or the key is invalid for that endpoint.

Debug a 401

# 1. Is the variable exported? (no output = your problem)
# printenv only sees exported variables; echo would also show unexported ones
printenv DEEPSEEK_API_KEY

# 2. Does the key work outside Codex?
curl -s https://api.deepseek.com/models -H "Authorization: Bearer $DEEPSEEK_API_KEY"

# 3. Did you accidentally define BOTH env_key and an [auth] table?
# The config reference forbids combining them. Pick one.
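
The first check can be automated across every provider block. The sketch below uses a hypothetical check_env_keys helper and assumes env_key lines are written in the simple one-line form used throughout this guide (env_key = "NAME"); it is a text match, not a TOML parser.

```shell
# Hypothetical helper: reports env_key variables named in a config file that
# are unset, empty, or not exported. Returns non-zero if any are missing.
check_env_keys() {
  config="$1"
  missing=0
  # Pull NAME out of lines shaped like: env_key = "NAME"
  for var in $(sed -n 's/^[[:space:]]*env_key[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$config"); do
    # printenv prints nothing for variables that are not exported
    if [ -z "$(printenv "$var")" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_env_keys ~/.codex/config.toml && codex "fix the failing tests"
```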

404 / unknown endpoint right after session start

Protocol mismatch. Codex speaks the Responses API; the endpoint behind base_url only implements Chat Completions. A curl to /v1/chat/completions succeeding while Codex fails is the giveaway. Fix: use an endpoint with Responses support or a translating gateway. Also check for a doubled or missing /v1 in base_url.

Model not found

The model string must match the provider's exact ID format: OpenRouter uses vendor/model-name (e.g. anthropic/claude-sonnet-4.6), Ollama uses the local tag you pulled, OpenAI uses bare IDs like gpt-5.3-codex. A model ID valid on one provider 404s on another.

Stream stalls, then dies

stream_idle_timeout_ms defaults to 300000 (5 minutes). Large local models can exceed that before the first token. Raise it per provider:

Longer timeout for slow local models

[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
stream_idle_timeout_ms = 600000 # 10 minutes
request_max_retries = 2

Connects, chats, but never edits a file

Codex drives everything through tool calls: reading files, running commands, applying edits. Models without tool-calling support will answer in prose and touch nothing. Use models that implement the OpenAI tool spec; among open weights, DeepSeek V4, Qwen3.5 coder variants, and MiniMax M3 all support tool calls.

FAQ

Where is the Codex config.toml file located?

User-level config lives at ~/.codex/config.toml (or $CODEX_HOME/config.toml if CODEX_HOME is set). A project-scoped .codex/config.toml in the repo root is loaded only when you have marked the project as trusted, and it cannot override provider, auth, notification, telemetry, or profile-selection keys. CLI -c flags override both for a single invocation.

Does Codex CLI have a --provider flag?

No. Provider selection uses the model_provider key in config.toml, a per-invocation override like codex -c model_provider='"openrouter"', or a profile selected with codex --profile <name>. The provider itself is defined under [model_providers.<id>] with base_url and env_key.

Does Codex still support wire_api = "chat"?

No. The current Codex configuration reference states that "responses" is the only supported value for wire_api, and it is the default when omitted. Configs copied from 2025 tutorials that set wire_api = "chat" target removed behavior. Providers that only expose a Chat Completions endpoint need a translating gateway (LiteLLM or similar) in front, or a native Responses-compatible endpoint.

Can Codex use Ollama at http://localhost:11434/v1?

Yes, with one condition. Define [model_providers.ollama] with base_url = "http://localhost:11434/v1" and set model to a model you have pulled locally. No env_key is needed for a local server. Because current Codex builds speak only the Responses API, your Ollama version must serve a Responses-compatible endpoint; if it only serves Chat Completions, put a translating proxy in front.

What is the Codex equivalent of cursor.general.customEndpoint in settings.json?

Cursor points at a custom endpoint (for example http://localhost:11434/v1) via the cursor.general.customEndpoint key in settings.json. The Codex equivalent is the base_url key inside a [model_providers.<id>] table in ~/.codex/config.toml, plus model_provider = "<id>" to activate it.

How do I switch Codex between providers per session?

Two ways. Profiles: create $CODEX_HOME/<name>.config.toml next to config.toml and run codex --profile <name>. One-off overrides: codex -c model_provider='"openrouter"' -c model='"<model-id>"' applies for a single invocation and beats every config file.

Why does my Codex custom provider return 401 Unauthorized?

Codex reads the API key from the environment variable named in env_key and sends it as a Bearer token. A 401 almost always means that variable is unset or not exported in the shell that launched Codex. Run printenv on the variable to check; no output means Codex cannot see it. Also: do not combine the [model_providers.<id>.auth] command table with env_key or experimental_bearer_token; the reference forbids mixing them.

Is custom-provider usage billed against my ChatGPT plan?

No. ChatGPT-plan credits (Plus: 15-80 GPT-5.5 messages per 5-hour window) apply only when you sign in with ChatGPT and use OpenAI's own provider. A custom provider with an API key bills at that provider's token rates: $1.75/M input and $14/M output for gpt-5.3-codex, $0.14/M input and $0.28/M output for DeepSeek V4 Flash.

What is experimental_bearer_token in Codex config.toml?

experimental_bearer_token is a key inside a [model_providers.<id>] table that holds the API key inline in the TOML file, sent as a Bearer token. It is discouraged: prefer env_key, which names an environment variable so the key never lives in the file. Use experimental_bearer_token only when you cannot set an environment variable. It cannot be combined with the [auth] command table.

Can I use Morph with Codex CLI?

Two ways. As a chat provider: Morph serves an OpenAI-compatible Chat Completions API at https://api.morphllm.com/v1 (env MORPH_API_KEY, model morph-dsv4flash with 1M context). Because Morph exposes Chat Completions only and has no /v1/responses endpoint, while Codex speaks the Responses API, you must front Morph with a Responses-translating gateway (LiteLLM or an OpenRouter-style router); a raw [model_providers.morph] block does not work natively. As an MCP tool: add @morphllm/morphmcp under [mcp_servers.morph] for a zero-latency edit_file tool backed by morph-v3-fast at 10,500+ tok/s, which needs no gateway.

Related Guides

Speed Up Code Edits in Codex

Morph applies code edits at 10,500+ tok/s. Add the Morph MCP server to config.toml and stop burning frontier-model output tokens on mechanical edits.