VOOZH about

URL: https://developers.openai.com/api/docs/guides/tools-image-generation

⇱ Image generation | OpenAI API


Search the API docs

Primary navigation

Evaluation

Legacy APIs

The image generation tool allows you to generate images using a text prompt, and optionally image inputs. It uses GPT Image models, including gpt-image-2, gpt-image-1.5, gpt-image-1, and gpt-image-1-mini, and automatically optimizes text inputs for improved performance.

To learn more about image generation, refer to our dedicated image generation guide.

Usage

When you include the image_generation tool in your request, the model can decide when and how to generate images as part of the conversation, using your prompt and any provided image inputs.

The image_generation_call tool call result will include a base64-encoded image.

You can provide input images using file IDs or base64 data.

To force the image generation tool call, you can set the parameter tool_choice to {"type": "image_generation"}.

Tool options

You can configure the following output options as parameters for the image generation tool:

  • Size: Image dimensions, for example, 1024 × 1024 or 1024 × 1536
  • Quality: Rendering quality, for example, low, medium, or high
  • Format: File output format
  • Compression: Compression level (0-100%) for JPEG and WebP formats
  • Background: Transparent or opaque
  • Action: Whether the request should automatically choose, generate, or edit an image

size, quality, and background support the auto option, where the model will automatically select the best option based on the prompt.

gpt-image-2 supports flexible size values that meet its resolution constraints. It doesn’t currently support transparent backgrounds, so requests with background: "transparent" fail.

For more details on available options, refer to the image generation guide.

When using the Responses API image generation tool, supported GPT Image models can choose whether to generate a new image or edit one already in the conversation. The optional action parameter controls this behavior: keep action set to auto so the model chooses whether to generate or edit, or set it to generate or edit to force that behavior. If not specified, the default is auto.

Revised prompt

When using the image generation tool, the mainline model, for example, gpt-5.5, will automatically revise your prompt for improved performance.

You can access the revised prompt in the revised_prompt field of the image generation call:

Prompting tips

Image generation works best when you use terms like draw or edit in your prompt.

For example, if you want to combine images, instead of saying combine or merge, you can say something like “edit the first image by adding this element from the second image.”

Multi-turn editing

You can iteratively edit images by referencing previous response or image IDs. This allows you to refine images across conversation turns.

Using previous response ID
Using image ID

Streaming

The image generation tool supports streaming partial images while it generates the final result. This provides faster visual feedback for users and improves perceived latency.

You can set the number of partial images (1-3) with the partial_images parameter.

Supported models

The following models support the image generation tool:

  • gpt-5.5
  • gpt-5.4-mini
  • gpt-5.4-nano
  • gpt-5.2
  • gpt-5
  • gpt-5-nano
  • o3
  • gpt-4.1
  • gpt-4.1-mini
  • gpt-4.1-nano
  • gpt-4o
  • gpt-4o-mini

The model used for the image generation process is always a GPT Image model, including gpt-image-2, gpt-image-1.5, gpt-image-1, and gpt-image-1-mini, but these models aren’t valid values for the model field in the Responses API. Use a text-capable mainline model (for example, gpt-5.5 or gpt-5) with the hosted image_generation tool.

Loading docs agent...