Flagship models
Our latest models
Prices per 1M tokens.
Standard
Batch
Flex
Priority
Multimodal models
Realtime and audio generation models
Prices per 1M tokens unless noted.
Image generation models
Standard
For image generation cost estimates, use the calculator in the image generation guide.
Batch
Video generation models
Standard
Batch
Transcription models
Prices per 1M tokens unless noted.
Tools
Tokens used for built-in tools are billed at the chosen model's per-token rates.
GB refers to binary gigabytes (also known as gibibytes), where 1 GB is 2^30 bytes.
Web search content tokens are tokens retrieved from the search index and fed to the model alongside your prompt to generate an answer. For gpt-4o-mini and gpt-4.1-mini with the non-preview web search tool, search content tokens are billed as a fixed block of 8,000 input tokens per call.
File search tool call pricing applies to the Responses API only.
Container pricing includes Hosted Shell and Code Interpreter. Eligible container sessions will be billed by the minute, with a 5-minute minimum per session.
Responses API, Chat Completions API, Realtime API, Batch API, and Assistants API are not priced separately. Tokens are billed at the chosen model's input and output rates.
Specialized models
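The billing rules stated in the Tools notes (per-1M-token pricing, the fixed 8,000-token web search block, the 5-minute container minimum, and the 2^30-byte GB) can be sketched as simple arithmetic. This is only an illustration: the function names are hypothetical, any price passed in is a placeholder rather than a real rate, and rounding partial minutes up is an assumption the text does not confirm.

```python
import math

TOKENS_PER_PRICE_UNIT = 1_000_000  # prices are quoted per 1M tokens
BYTES_PER_GB = 2 ** 30             # "GB" here is a binary gigabyte (gibibyte)
WEB_SEARCH_BLOCK_TOKENS = 8_000    # fixed block per call (gpt-4o-mini, gpt-4.1-mini, non-preview tool)
CONTAINER_MIN_MINUTES = 5          # minimum billed minutes per container session


def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` tokens at a price quoted per 1M tokens."""
    return tokens / TOKENS_PER_PRICE_UNIT * price_per_million


def web_search_content_cost(calls: int, input_price_per_million: float) -> float:
    """Search content cost when each call is billed as a fixed 8,000 input tokens."""
    return token_cost(calls * WEB_SEARCH_BLOCK_TOKENS, input_price_per_million)


def billable_container_minutes(session_seconds: float) -> int:
    """Billed minutes for one session, applying the 5-minute minimum.

    Assumption: partial minutes are rounded up; the source only says
    sessions are billed "by the minute".
    """
    return max(CONTAINER_MIN_MINUTES, math.ceil(session_seconds / 60))


def bytes_to_gb(num_bytes: int) -> float:
    """Convert bytes to the binary GB used for storage pricing."""
    return num_bytes / BYTES_PER_GB


if __name__ == "__main__":
    placeholder_input_price = 1.0  # hypothetical price per 1M input tokens, not a real rate
    print(web_search_content_cost(calls=10, input_price_per_million=placeholder_input_price))
    print(billable_container_minutes(session_seconds=30))
    print(bytes_to_gb(3 * 2 ** 30))
```

Substitute the actual per-token rate of the chosen model from the pricing tables for the placeholder price before using figures like these for estimates.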
Standard
Batch
Priority
Fine-tuning
OpenAI is winding down the fine-tuning platform. The platform is no longer accessible to new users, but existing users of the fine-tuning platform will be able to create training jobs for the coming months.
All fine-tuned models will remain available for inference until their base models are deprecated. The full timeline is here.
Standard
Batch
Tokens used for model grading in reinforcement fine-tuning are billed at that model's per-token rate. Inference discounts are available if you enable data sharing when creating the fine-tune job. Learn more.
