URL: https://apify.com/accelerationengg/vision-ocr-mcp

Vision OCR MCP

Extract text from images instantly. Turn receipts, invoices, documents, and handwritten notes into structured data.

Pricing: from $0.99 / 1,000 results

Rating: 5.0 (1)

Developer: Acceleration

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 13
  • Monthly active users: 1
  • Last modified: 5 months ago

Vision OCR MCP Server

A Model Context Protocol server for extracting text from images. This server enables LLMs to read invoices, receipts, and documents in 100+ languages while preserving the original script.

About this MCP Server: To understand how to connect to and utilize this MCP server, please refer to the official Model Context Protocol documentation at mcp.apify.com.


Connection URL

MCP clients can connect to this server at:

https://accelerationengg--vision-ocr-mcp.apify.actor/mcp

Client Configuration

To connect to this MCP server, use the following configuration in your MCP client:

{
  "mcpServers": {
    "vision-ocr": {
      "url": "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Note: Replace YOUR_APIFY_TOKEN with your actual Apify API token. You can find your token in the Apify Console.
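
If you would rather generate this configuration than paste a token by hand, a minimal sketch follows. It reproduces the structure shown above; the function name build_client_config and the APIFY_TOKEN environment variable are assumptions made for illustration, not part of this server.

```python
import json
import os

MCP_URL = "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp"


def build_client_config(token, server_name="vision-ocr"):
    """Return the mcpServers entry shown above with the token filled in."""
    if not token:
        raise ValueError("an Apify API token is required")
    return {
        "mcpServers": {
            server_name: {
                "url": MCP_URL,
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }


if __name__ == "__main__":
    # APIFY_TOKEN is an assumed variable name; fall back to the placeholder.
    token = os.environ.get("APIFY_TOKEN", "YOUR_APIFY_TOKEN")
    print(json.dumps(build_client_config(token), indent=2))
```

Reading the token from the environment keeps it out of files that might be committed or shared.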


Claude Desktop Configuration

To use this MCP server with Claude Desktop, add the following configuration to your Claude Desktop settings:

Location: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)

{
  "mcpServers": {
    "apifyVisionOCR": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp",
        "--header",
        "Authorization: Bearer YOUR_APIFY_TOKEN"
      ]
    }
  }
}

Steps:

  1. Open the Claude Desktop configuration file at the location above
  2. Add the configuration, replacing YOUR_APIFY_TOKEN with your Apify API token
  3. Save the file
  4. Restart Claude Desktop
  5. The vision_ocr tool will now be available in your conversations
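
Steps 1 to 3 can also be scripted. The sketch below merges the apifyVisionOCR entry into an existing configuration without disturbing other servers already listed there. The helper names are hypothetical, and it assumes the file, if present, contains valid JSON.

```python
import json
from pathlib import Path

MCP_URL = "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp"


def add_vision_ocr_server(config, token):
    """Return a copy of a Claude Desktop config with the apifyVisionOCR entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["apifyVisionOCR"] = {
        "command": "npx",
        "args": [
            "-y",
            "mcp-remote",
            MCP_URL,
            "--header",
            f"Authorization: Bearer {token}",
        ],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path, token):
    """Read the JSON file at `path` (if it exists), add the entry, and write it back."""
    path = Path(path)
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_vision_ocr_server(existing, token)
    path.write_text(json.dumps(updated, indent=2), encoding="utf-8")
```

Call update_config_file with the macOS or Windows path given above, then restart Claude Desktop as in step 4.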

Available Tools

vision_ocr - Extracts structured data from images with language detection and price extraction.

Parameters:

  • images (array, required) - List of image URLs, file paths, or base64 strings (max 15)
  • output_format (string, optional) - "json" (default) or "toon" for compact output
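
The two parameters can be checked on the client side before a call is made. This is a minimal sketch based only on the limits listed above (at most 15 images, and "json" or "toon" as the format); build_run_input is a hypothetical helper and the server may apply further checks of its own.

```python
MAX_IMAGES = 15
OUTPUT_FORMATS = ("json", "toon")


def build_run_input(images, output_format="json"):
    """Validate the documented limits and return the input dictionary."""
    images = list(images)
    if not images:
        raise ValueError("images is required and must not be empty")
    if len(images) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} images per call, got {len(images)}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    return {"images": images, "output_format": output_format}
```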

Returns:

{
  "language_detected": "ur",
  "description_text": "رسید | تاریخ: ۲۰۲۶-۰۱-۰۴ | چائے",
  "price_1": "₨۵۰",
  "price_2": "₨۱۲۵"
}
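
Because the number of price_N fields varies from image to image, it can help to gather them into a list. The sketch below does that and also splits description_text on the "|" separators. Both helper names are hypothetical, and the prices are returned as the raw strings from the response rather than parsed into numbers.

```python
import re

_PRICE_KEY = re.compile(r"^price_(\d+)$")


def collect_prices(data):
    """Return the price_N values ordered by N, as raw strings."""
    numbered = []
    for key, value in data.items():
        match = _PRICE_KEY.match(key)
        if match:
            numbered.append((int(match.group(1)), value))
    numbered.sort(key=lambda pair: pair[0])
    return [value for _, value in numbered]


def split_columns(description_text):
    """Split a description on the "|" separators and trim each cell."""
    return [cell.strip() for cell in description_text.split("|")]
```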

Features

  • Multilingual OCR - Urdu (اردو), Arabic (العربية), English, Chinese (中文), and 100+ languages
  • Price Detection - Automatically extracts prices from invoices/receipts
  • Layout Preservation - Maintains tables and columns with "|" separators
  • Batch Processing - Process up to 15 images in parallel
  • Fast - 12-15 seconds per image
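
With more than 15 images, the list has to be split across several calls. A minimal sketch of that chunking follows; chunk_images is a hypothetical helper, not something the server provides.

```python
def chunk_images(images, size=15):
    """Split a list of image references into batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    images = list(images)
    return [images[start:start + size] for start in range(0, len(images), size)]
```

Each batch can then be passed as the images parameter of a separate call.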


Supported Formats

  • Images: PNG, JPG, JPEG, WEBP (GIF not supported)
  • Languages: 100+ including Urdu, Arabic, English, Chinese, Hindi, Spanish, French, German
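
Unsupported files such as GIFs can be filtered out before submission. The sketch below checks only the file extension of a URL or path. It is an approximation: it does not inspect file contents and does not apply to base64 strings, and has_supported_extension is a hypothetical helper.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def has_supported_extension(source):
    """Return True if a URL or file path ends in one of the listed image extensions."""
    path = urlparse(source).path  # drops any query string or fragment
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```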

Output Formats

The server supports two output formats optimized for different use cases:

JSON Format (Default)

Standard structured output - easiest to parse and integrate with applications.

Example Output:

{
  "model": "Qwen/Qwen3-VL-30B",
  "image_count": 1,
  "total_time_seconds": 3.91,
  "results": [
    {
      "index": 0,
      "data": {
        "language_detected": "ar",
        "description_text": "TURKISH CORNER Date:6/10/2019 Time:6:56 PM Table:B12 Ticket No:243 -1Homus حمص 1-Mutabel متبل 1-Baba Ghanouj بابا غنوج 1-Fatoush فتوش 1-Olive Salad سلطة زيتون 1-Green Salad سلطة خضراء 1-Grapes Leaves ورق عنب 1-Tabouleh تبولة 1-Vegetable with Youghurt Salad سلطة خضار باللبن 1-Hot Salad سلطة حارة Total: 8.00 Cash 8.00 THANK YOU",
        "price_1": "8.00",
        "price_2": "8.00"
      },
      "processing_time": 3.91
    }
  ]
}
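
A response of this shape can be reduced to one summary row per image. The sketch below assumes only the keys visible in the example output above; summarize is a hypothetical helper, and missing keys fall back to None or an empty string instead of raising.

```python
def summarize(payload):
    """Return one small dictionary per entry in payload["results"]."""
    rows = []
    for result in payload.get("results", []):
        data = result.get("data", {})
        rows.append({
            "index": result.get("index"),
            "language": data.get("language_detected"),
            "chars": len(data.get("description_text", "")),
            "seconds": result.get("processing_time"),
        })
    return rows


if __name__ == "__main__":
    # Shortened stand-in for the example output above.
    sample = {
        "image_count": 1,
        "results": [{
            "index": 0,
            "data": {"language_detected": "ar", "description_text": "Total: 8.00"},
            "processing_time": 3.91,
        }],
    }
    for row in summarize(sample):
        print(row)
```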

TOON Format (Token-Efficient)

Compact notation that saves ~30% tokens - ideal for LLM processing and cost optimization.

Example Output:

model: Qwen/Qwen3-VL-30B-A3B-Instruct
image_count:1
total_time_seconds:3.76
results:
[1]{index,data,processing_time}:
0,{'language_detected':'ar','description_text':'TURKISH CORNER Date:6/10/2019 Time:6:56 PM Table:B12 Ticket No:243 -1Homus حمص 1-Mutabel متبل 1-Baba Ghanouj بابا غنوج 1-Fatoush فتوش 1-Olive Salad سلطة زيتون 1-Green Salad سلطة خضراء 1-Grapes Leaves ورق عنب 1-Tabouleh تبولة 1-Vegetable with Youghurt Salad سلطة خضار باللبن 1-Hot Salad سلطة حارة Total: 8.00 Cash 8.00 THANK YOU','price_1':'8.00','price_2':'8.00'},3.76
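
If a TOON result row needs to be turned back into a dictionary, the sketch below handles rows shaped like the one above: an index, a Python-style dictionary literal, and a processing time, separated by commas. That layout is inferred from this single example and is an assumption, not a TOON parser; parse_toon_row is a hypothetical helper.

```python
import ast


def parse_toon_row(row):
    """Parse one 'index,{...},seconds' row as shown in the example output."""
    index_part, rest = row.split(",", 1)        # text before the first comma
    data_part, time_part = rest.rsplit(",", 1)  # text after the last comma
    return {
        "index": int(index_part),
        # literal_eval only evaluates literals, so it is safe on untrusted text.
        "data": ast.literal_eval(data_part.strip()),
        "processing_time": float(time_part),
    }
```

Splitting on the first and last comma leaves commas inside description_text untouched.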

When to use each format:

  • JSON: Standard API integration, automated parsing, strict schema validation
  • TOON: Sending to LLMs for analysis, reducing token costs, human-readable logs

Use Cases

  • Financial documents: Invoices, receipts, bills
  • Multi-column tables: Spreadsheets, reports
  • Multilingual documents: Documents with Arabic, Urdu, Chinese, and other scripts
  • Form extraction: Structured data from forms

Python API Usage

Installation

$ pip install apify-client

Basic Example

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Extract text from image
run_input = {
    "images": ["https://example.com/receipt.jpg"],
    "output_format": "json",  # or "toon"
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)

# Get results
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    text = item['results'][0]['data']['description_text']
    language = item['results'][0]['data']['language_detected']
    print(f"Language: {language}\nText: {text}")

Batch Processing

# Process multiple images (reuses the client created in the Basic Example)
run_input = {
    "images": [
        "https://example.com/invoice1.jpg",
        "https://example.com/invoice2.jpg",
        "https://example.com/invoice3.jpg",
    ],
    "output_format": "json",
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)

# Print the first 100 characters of the text extracted from each image
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    for result in item['results']:
        print(f"Image {result['index']}: {result['data']['description_text'][:100]}...")

TOON Format for LLM Processing

# Use TOON format to save ~30% tokens (reuses the client created in the Basic Example)
run_input = {
    "images": ["https://example.com/receipt.jpg"],
    "output_format": "toon",
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    toon_output = item['content']
    # Send directly to Claude for analysis
    # Uses ~30% fewer tokens than JSON
    # response = claude.messages.create(
    #     model="claude-3-5-sonnet-20241022",
    #     messages=[{
    #         "role": "user",
    #         "content": f"Analyze this receipt:\n{toon_output}"
    #     }]
    # )

Example Usage

Single Image

Extract text from this receipt:
https://example.com/receipt.jpg

Multiple Images

Process these invoices:
- https://example.com/invoice1.jpg
- https://example.com/invoice2.jpg
- https://example.com/invoice3.jpg

Built with Qwen-VL, FastMCP, Apify
