URL: https://developers.openai.com/api/docs/pricing

Pricing | OpenAI API


Flagship models

Our latest models
Prices per 1M tokens.
Standard
Batch
Flex
Priority
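
Because prices are quoted per 1M tokens, the cost of a request scales linearly with its token counts. The sketch below shows that arithmetic. The function name `token_cost` and the two example rates are hypothetical placeholders, not actual prices from this page; substitute the input and output rates of the model and tier you use.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def token_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_1m: Decimal,
    output_rate_per_1m: Decimal,
) -> Decimal:
    """Return the dollar cost of one request given per-1M-token rates."""
    input_cost = Decimal(input_tokens) * input_rate_per_1m / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * output_rate_per_1m / TOKENS_PER_MILLION
    return input_cost + output_cost


# Placeholder rates for illustration only (assumed, NOT real prices).
EXAMPLE_INPUT_RATE = Decimal("2.00")   # dollars per 1M input tokens
EXAMPLE_OUTPUT_RATE = Decimal("8.00")  # dollars per 1M output tokens

cost = token_cost(12_000, 3_000, EXAMPLE_INPUT_RATE, EXAMPLE_OUTPUT_RATE)
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding drift.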

Multimodal models

Realtime and audio generation models

Prices per 1M tokens unless noted.

Image generation models

Prices per 1M tokens.
Standard
For image generation cost estimates, use the calculator in the image generation guide.
Batch

Video generation models

Prices per second.
Standard
Batch

Transcription models

Prices per 1M tokens unless noted.

Tools

Tokens used for built-in tools are billed at the chosen model's per-token rates.

GB refers to binary gigabytes (also known as gibibytes), where 1 GB is 2^30 bytes.

Web search content tokens are tokens retrieved from the search index and fed to the model alongside your prompt to generate an answer. For gpt-4o-mini and gpt-4.1-mini with the non-preview web search tool, search content tokens are billed as a fixed block of 8,000 input tokens per call.

File search tool call pricing applies to the Responses API only.

Container pricing includes Hosted Shell and Code Interpreter. Eligible container sessions will be billed by the minute, with a 5-minute minimum per session.

Responses API, Chat Completions API, Realtime API, Batch API, and Assistants API are not priced separately. Tokens are billed at the chosen model's input and output rates.
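
The tool billing rules above can be sketched as three small helpers. This is an illustration under stated assumptions, not an official calculator: the function names are hypothetical, partial container minutes are assumed to round up (the page only says sessions are billed by the minute), and the fixed 8,000-token block is applied only to the two models named above when used with the non-preview web search tool.

```python
import math

BYTES_PER_GB = 2 ** 30                 # binary gigabyte (gibibyte), per the note above
FIXED_SEARCH_BLOCK_TOKENS = 8_000      # fixed input-token block per web search call
FIXED_BLOCK_MODELS = {"gpt-4o-mini", "gpt-4.1-mini"}
CONTAINER_MINIMUM_MINUTES = 5          # minimum billed minutes per container session


def bytes_to_gb(num_bytes: int) -> float:
    """Convert a byte count to binary gigabytes (1 GB = 2^30 bytes)."""
    return num_bytes / BYTES_PER_GB


def billable_search_content_tokens(model: str, calls: int, retrieved_tokens: int) -> int:
    """Return the input tokens billed for web search content.

    Assumes the non-preview web search tool. For the two listed models the
    bill is a fixed block per call; for any other model this sketch bills the
    tokens actually retrieved, at the model's input rate.
    """
    if model in FIXED_BLOCK_MODELS:
        return FIXED_SEARCH_BLOCK_TOKENS * calls
    return retrieved_tokens


def billable_container_minutes(session_seconds: float) -> int:
    """Return billed minutes for one container session.

    Assumption: partial minutes round up. The 5-minute minimum comes from
    the pricing note above.
    """
    minutes = math.ceil(session_seconds / 60)
    return max(minutes, CONTAINER_MINIMUM_MINUTES)
```

For example, a 90-second container session would be billed as 5 minutes under these assumptions, and three web search calls on gpt-4o-mini would be billed as 24,000 input tokens regardless of how much content was retrieved.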

Specialized models

Prices per 1M tokens.
Standard
Batch
Priority

Fine-tuning

Prices per 1M tokens.

OpenAI is winding down the fine-tuning platform. The platform is no longer accessible to new users, but existing users of the fine-tuning platform will be able to create training jobs for the coming months.


All fine-tuned models will remain available for inference until their base models are deprecated. The full timeline is here.

Standard
Batch
Tokens used for model grading in reinforcement fine-tuning are billed at that model's per-token rate. Inference discounts are available if you enable data sharing when creating the fine-tune job.
