URL: https://apify.com/seemuapps/image-captioner

AI Image Captioner · Apify


Pricing

from $12.00 / 1,000 results

Generate accurate text descriptions for any image using AI: bulk caption product photos, screenshots, or any image URL for SEO, accessibility, and content tagging.

Rating

0.0

(0)

Developer

Andrew

Maintained by Community

Actor stats

0

Bookmarked

5

Total users

1

Monthly active users

11 days ago

Last modified

Generate accurate, detailed text descriptions for any image using AI: bulk caption product photos, screenshots, or any image URL for SEO alt text, accessibility compliance, and content tagging.

What you get

  • Natural-language captions generated by Molmo 2, trained on 712,000+ human-described images
  • Three detail levels: brief one-liner, balanced description, or full detailed paragraph
  • Optional focus directive to target specific aspects (text, background, faces, objects, etc.)
  • One output record per image with the caption and source URL
  • Supports bulk processing: pass up to 50 image URLs and get all captions in a single run
  • Export to JSON or CSV directly from the Apify console

Use cases

  • E-commerce SEO: generate alt text for thousands of product images automatically
  • Accessibility compliance: add descriptive alt text to images on websites and apps
  • Content moderation: understand what's in user-uploaded images before publishing
  • Dataset labeling: annotate image datasets for machine learning pipelines
  • Digital asset management: auto-tag and describe photos in large media libraries
  • Social media monitoring: caption scraped images to make them searchable by content

Examples

| Image | Detail Level | Caption |
| --- | --- | --- |
| [Image: Nike sneaker on red background] | High | A vibrant red Nike sneaker takes center stage in this striking advertisement, set against a bold red background that creates a visually cohesive and eye-catching composition. The shoe is positioned at an angle, giving the impression of motion and energy. The sneaker features a white Nike swoosh, darker red laces, and "Nike Free" branding on the white sole. The lighting is bright and even, highlighting the shoe's textures and details. |
| [Image: Nike sneaker on red background] | Medium | A vibrant red Nike sneaker is displayed against a matching red background. The shoe features a white Nike swoosh and "Nike Free" branding on the sole. The laces are a darker shade of red, complementing the overall design. |
| [Image: Nike sneaker on red background] | Low | A red Nike sneaker with white accents. |
| [Image: Food flatlay with three dishes] | High | A top-down view of a rustic wooden table with three round bowls arranged in a triangular formation. The central bowl features slices of medium-rare steak garnished with fresh green leaves and a red chili pepper. The left bowl holds crispy fried fish topped with a creamy sauce and herbs. The right bowl contains a meat dish garnished with thinly sliced red onions and nuts. Scattered around the bowls are whole chili peppers, cashews, and a small bowl of brown dipping sauce. |
| [Image: Food flatlay with three dishes] | Medium | Three bowls of food are arranged on a gray wooden table, creating a rustic dining scene. The central bowl contains sliced steak, while the left bowl holds fried fish topped with sauce and herbs. The right bowl features a meat dish garnished with onions and nuts. |
| [Image: Food flatlay with three dishes] | Low | Three bowls of food on a wooden table with garnishes. |

How to use

  1. Paste one or more image URLs into the Images field (or upload files directly)
  2. Choose a Detail Level: High gives the most descriptive output (recommended for SEO and accessibility)
  3. Optionally add a Focus hint to direct the model's attention (e.g. "describe only the text visible")
  4. Click Run: captions appear in the Dataset tab when complete
  5. Export results as JSON or CSV, or connect to downstream actors via the Apify API
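
The same steps can also be driven from code rather than the console. The sketch below assumes the official `apify-client` Python package and a valid API token. The input keys `images`, `detailLevel`, and `focus` are hypothetical guesses at the actor's input schema, since this page only gives the display names of the fields; check the actor's Input tab before relying on them.

```python
from typing import Any, Iterable, Optional

ACTOR_ID = "seemuapps/image-captioner"  # taken from the listing URL


def build_run_input(
    image_urls: Iterable[str],
    detail_level: str = "high",
    focus: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the actor input. Key names are ASSUMED, not confirmed."""
    if detail_level not in {"low", "medium", "high"}:
        raise ValueError(f"unknown detail level: {detail_level!r}")
    run_input: dict[str, Any] = {
        "images": list(image_urls),   # hypothetical key for the Images field
        "detailLevel": detail_level,  # hypothetical key for Detail Level
    }
    if focus:
        run_input["focus"] = focus    # hypothetical key for the Focus hint
    return run_input


def caption_images(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset records."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

`build_run_input` has no network dependency, so the payload can be inspected before any paid run is started; `caption_images` is the only part that needs a token.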

Output format

Each dataset record:

{
  "inputImageUrl": "https://example.com/product.jpg",
  "caption": "A white ceramic coffee mug sitting on a wooden table next to an open laptop. The mug has a minimalist logo on the front and steam rising from the top, suggesting the coffee is hot.",
  "detailLevel": "high",
  "status": "success",
  "error": null
}
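
A minimal sketch of consuming records in this shape: it separates usable captions from failures and HTML-escapes each caption so it can be placed in an `alt` attribute. The field names come from the record above. Treating any `status` other than `"success"` as a failure is an assumption, since only the success case is shown here, and the second sample record (including its `"failed"` status and error text) is invented for illustration.

```python
import html
import json
from typing import Any


def split_records(records: list[dict[str, Any]]):
    """Return (url -> escaped alt text, list of failed records)."""
    alt_text: dict[str, str] = {}
    failed: list[dict[str, Any]] = []
    for rec in records:
        caption = rec.get("caption")
        # Assumption: only status == "success" with a non-empty caption is usable.
        if rec.get("status") == "success" and caption:
            alt_text[rec["inputImageUrl"]] = html.escape(caption, quote=True)
        else:
            failed.append(rec)
    return alt_text, failed


sample = json.loads("""[
  {"inputImageUrl": "https://example.com/product.jpg",
   "caption": "A white mug with a \\"minimalist\\" logo.",
   "detailLevel": "high", "status": "success", "error": null},
  {"inputImageUrl": "https://example.com/missing.jpg",
   "caption": null, "detailLevel": "high",
   "status": "failed", "error": "hypothetical error message"}
]""")

alt_text, failed = split_records(sample)
```

Escaping matters because captions can contain double quotes (the Nike example quotes the shoe's branding), which would otherwise terminate the attribute early.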

Input options

| Field | Type | Description |
| --- | --- | --- |
| Images | URL list | One or more http/https image URLs or base64 data URIs |
| Upload Images | File upload | Upload images directly from your computer |
| Detail Level | Select | Low (one-liner), Medium (balanced), High (detailed paragraph); default: High |
| Focus | Text | Optional directive to focus the caption on a specific aspect of the image |
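
Entries for the Images field can be checked locally before a run is started. The following is a sketch of one possible check, not the actor's own validation: it accepts http/https URLs and base64 data URIs as described above, and additionally assumes that a data URI must declare an `image/*` media type.

```python
import base64
import binascii
from urllib.parse import urlparse


def is_valid_image_ref(ref: str) -> bool:
    """True if ref is an http(s) URL or a base64-encoded image data URI."""
    if ref.startswith("data:"):
        header, sep, payload = ref.partition(",")
        # Assumption: only data:image/...;base64 URIs are meaningful here.
        if not sep or not header.startswith("data:image/") or not header.endswith(";base64"):
            return False
        try:
            base64.b64decode(payload, validate=True)
        except (binascii.Error, ValueError):
            return False
        return bool(payload)
    parsed = urlparse(ref)
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)


refs = [
    "https://example.com/a.jpg",
    "ftp://example.com/a.jpg",
    "data:image/png;base64,iVBORw0KGgo=",
    "data:image/png;base64,!!!",
]
valid = [r for r in refs if is_valid_image_ref(r)]
```

This only checks the form of each entry; it cannot tell whether a URL is actually publicly reachable or points to an image.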

Limits

  • Maximum 50 images per run
  • Each image must be a publicly accessible URL or a base64 data URI
  • Processing time is typically 5–15 seconds per image
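
For lists longer than 50 images, the work has to be split across runs. The sketch below chunks a URL list to the 50-image limit and gives rough time and cost bounds from the figures on this page. Two assumptions are baked in: images are processed one after another (the listing does not say), and the listed "from $12.00 / 1,000 results" is treated as a lower bound with one result per image.

```python
from typing import Iterator, Sequence

MAX_IMAGES_PER_RUN = 50        # from the Limits section
SECONDS_PER_IMAGE = (5, 15)    # "typically" 5 to 15 s per image
MIN_USD_PER_1000 = 12.00       # listed starting price per 1,000 results


def batches(urls: Sequence[str], size: int = MAX_IMAGES_PER_RUN) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def estimate_seconds(n_images: int) -> tuple[int, int]:
    """(low, high) total seconds, ASSUMING sequential processing."""
    low, high = SECONDS_PER_IMAGE
    return n_images * low, n_images * high


def estimate_min_cost_usd(n_images: int) -> float:
    """Lower bound on cost, ASSUMING one billed result per image."""
    return n_images * MIN_USD_PER_1000 / 1000


urls = [f"https://example.com/img-{i}.jpg" for i in range(120)]
run_sizes = [len(b) for b in batches(urls)]
time_range = estimate_seconds(len(urls))
min_cost = estimate_min_cost_usd(len(urls))
```

With 120 URLs this gives three runs of 50, 50, and 20 images, a typical total of 600 to 1,800 seconds under the sequential assumption, and a cost of at least $1.44.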

Related AI image actors

Part of a complete AI image toolkit. Explore the rest of the suite:

You might also like

image to image

evoort-solutions-llc/image-to-image

Evoort Solutions LLC

8

Image to Prompt Generator 🎨 ✨

easyapi/image-to-prompt-generator

Transform any image into detailed text descriptions using AI. Perfect for content creators, SEO specialists, and developers who need automated image-to-text conversion. Powered by Phot.ai's advanced image recognition technology.

AI Image Intelligence

marielise.dev/ai-image-intelligence

Make every image work harder for your business. Auto-generate SEO-optimized metadata, accessibility-compliant alt text, and rich descriptions using AI. Perfect for e-commerce, content sites, and stock agencies processing hundreds of images daily. $0.01/image.

Image To Text

calm_necessity/image-to-text

Image to Text Actor analyzes images and generates detailed text descriptions of scenes, objects, and visual context. Upload an image and receive a human-readable explanation of what the image contains. Ideal for accessibility, content understanding, and automation workflows.

Taher Ali Badnawarwala

2

Image Scraper

rapidtech1898/image-scraper

Extract image links from any website quickly and easily. Enter a URL and the scraper collects all available image URLs in seconds. Perfect for designers, marketers, and developers who need fast access to image sources without manual searching.

103

1.0

FLUX.2 Klein Image Generator (Text-to-Image & Image-to-Image)

danitn11/flux2-klein-image-generator

Fast, cheap AI image generator & editor powered by FLUX.2 Klein. Text-to-image and image-to-image in seconds, just $4/1000 images. No GPU or subscription: a pay-as-you-go Midjourney, DALL-E & Flux alternative.

AI Image Generator - Text to Image

akash9078/ai-image-generator

Generate stunning AI images from text prompts. Create high-quality images using advanced AI models with support for multiple aspect ratios and customizable settings.

Akash Kumar Naik

178

Alt Text Generator - Batch Image Descriptions

ntriqpro/alt-text-batch

Automatically generate detailed descriptions for multiple images at once. Perfect for accessibility, SEO, and social media captions.
