Vision OCR MCP

Pricing

from $0.99 / 1,000 results

Extract text from images instantly. Turn receipts, invoices, documents, and handwritten notes into structured data.

Rating

5.0

(1)

Developer

Acceleration

Maintained by Community

Last modified: 5 months ago

Vision OCR MCP Server

A Model Context Protocol server for extracting text from images. This server enables LLMs to read invoices, receipts, and documents in 100+ languages while preserving the original script.

About this MCP Server: For how to connect to and use this MCP server, see Apify's MCP documentation at mcp.apify.com.

Connection URL

MCP clients can connect to this server at:

https://accelerationengg--vision-ocr-mcp.apify.actor/mcp

Client Configuration

To connect to this MCP server, use the following configuration in your MCP client:

{
  "mcpServers": {
    "vision-ocr": {
      "url": "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_APIFY_TOKEN"
      }
    }
  }
}

Note: Replace YOUR_APIFY_TOKEN with your actual Apify API token. You can find your token in the Apify Console.
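As a minimal sketch of producing the configuration above without hard-coding the token, the following Python reads the token from an environment variable and prints the JSON. The variable name APIFY_TOKEN and the function name build_client_config are assumptions made for illustration; only the URL and the JSON shape come from this page.

```python
import json
import os

SERVER_URL = "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp"


def build_client_config(token: str) -> dict:
    """Return the MCP client configuration shown above for the given token."""
    if not token:
        raise ValueError("an Apify API token is required")
    return {
        "mcpServers": {
            "vision-ocr": {
                "url": SERVER_URL,
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }


if __name__ == "__main__":
    # APIFY_TOKEN is an assumed variable name; use whatever your setup provides.
    token = os.environ.get("APIFY_TOKEN", "YOUR_APIFY_TOKEN")
    print(json.dumps(build_client_config(token), indent=2))
```

Keeping the token in the environment rather than in a committed file avoids leaking it through version control.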

Claude Desktop Configuration

To use this MCP server with Claude Desktop, add the following configuration to your Claude Desktop settings:

Location: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)

{
  "mcpServers": {
    "apifyVisionOCR": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp",
        "--header",
        "Authorization: Bearer YOUR_APIFY_TOKEN"
      ]
    }
  }
}

Steps:

1. Open the Claude Desktop configuration file at the location above.
2. Add the configuration with your Apify API token (replace YOUR_APIFY_TOKEN).
3. Save the file.
4. Restart Claude Desktop.

The vision_ocr tool will now be available in your conversations.
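If the configuration file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that in Python; the function names are hypothetical, and the entry itself is copied from the configuration above.

```python
import json
from pathlib import Path

SERVER_URL = "https://accelerationengg--vision-ocr-mcp.apify.actor/mcp"


def add_vision_ocr_server(config: dict, token: str) -> dict:
    """Return a copy of `config` with the apifyVisionOCR entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is not mutated
    servers["apifyVisionOCR"] = {
        "command": "npx",
        "args": [
            "-y",
            "mcp-remote",
            SERVER_URL,
            "--header",
            f"Authorization: Bearer {token}",
        ],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, token: str) -> None:
    """Read the config file if it exists, merge the entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_vision_ocr_server(existing, token)
    path.write_text(json.dumps(updated, indent=2), encoding="utf-8")
```

On macOS this could be called as update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", token), followed by a restart of Claude Desktop.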

Available Tools

vision_ocr - Extracts structured data from images with language detection and price extraction.

Parameters:

images (array, required) - List of image URLs, file paths, or base64 strings (max 15)
output_format (string, optional) - "json" (default) or "toon" for compact output
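The parameter limits above can be checked on the client side before a call is made. This is a sketch only: build_run_input is a hypothetical helper, and rejecting an empty list is an assumption drawn from the parameter being marked required, not documented server behavior.

```python
MAX_IMAGES = 15
OUTPUT_FORMATS = ("json", "toon")


def build_run_input(images: list[str], output_format: str = "json") -> dict:
    """Validate arguments against the documented limits and return the input dict."""
    if not images:
        # Assumption: a required array is treated as "must contain at least one item".
        raise ValueError("images is required and must not be empty")
    if len(images) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} images per call, got {len(images)}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    return {"images": list(images), "output_format": output_format}
```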

Returns:

{
  "language_detected": "ur",
  "description_text": "رسید | تاریخ: ۲۰۲۶-۰۱-۰۴ | چائے",
  "price_1": "₨۵۰",
  "price_2": "₨۱۲۵"
}
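Prices come back as numbered keys rather than a list. The sketch below gathers them in order; it assumes additional prices continue the price_N pattern, since only price_1 and price_2 appear on this page. The helper name extract_prices is hypothetical.

```python
import re

_PRICE_KEY = re.compile(r"^price_(\d+)$")


def extract_prices(data: dict) -> list[str]:
    """Return the values of price_1, price_2, ... ordered by their numeric suffix."""
    numbered = []
    for key, value in data.items():
        match = _PRICE_KEY.match(key)
        if match:
            numbered.append((int(match.group(1)), value))
    numbered.sort(key=lambda pair: pair[0])
    return [value for _, value in numbered]
```

The values are left as strings because, as in the example above, they can carry currency symbols and non-Latin digits that the caller may want to preserve.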

Features

✅ Multilingual OCR - Urdu (اردو), Arabic (العربية), English, Chinese (中文), and 100+ languages
✅ Price Detection - Automatically extracts prices from invoices/receipts
✅ Layout Preservation - Maintains tables and columns with "|" separators
✅ Batch Processing - Process up to 15 images in parallel
✅ Fast - 12-15 seconds per image
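Because a single call accepts at most 15 images, longer lists have to be split on the client side. A minimal sketch of that chunking follows; the helper name batches is hypothetical.

```python
from typing import Iterator

MAX_IMAGES_PER_CALL = 15


def batches(images: list[str], size: int = MAX_IMAGES_PER_CALL) -> Iterator[list[str]]:
    """Yield consecutive slices of `images`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(images), size):
        yield images[start:start + size]
```

Each yielded slice can then be passed as the images parameter of one vision_ocr call.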

Supported Formats

Images: PNG, JPG, JPEG, WEBP (GIF not supported)
Languages: 100+ including Urdu, Arabic, English, Chinese, Hindi, Spanish, French, German
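Unsupported files such as GIFs can be filtered out before they are sent. The sketch below judges only by file extension, which is an assumption: this page does not say how the server identifies formats, and base64 input has no extension to check. The helper name is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def is_supported_image(source: str) -> bool:
    """Guess from the file extension whether a URL or path is in a supported format."""
    # For URLs, look at the path component only so query strings are ignored.
    path = urlparse(source).path or source
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```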

Output Formats

The server supports two output formats optimized for different use cases:

JSON Format (Default)

Standard structured output - easiest to parse and integrate with applications.

Example Output:

{
  "model": "Qwen/Qwen3-VL-30B",
  "image_count": 1,
  "total_time_seconds": 3.91,
  "results": [
    {
      "index": 0,
      "data": {
        "language_detected": "ar",
        "description_text": "TURKISH CORNER Date:6/10/2019 Time:6:56 PM Table:B12 Ticket No:243 -1Homus حمص 1-Mutabel متبل 1-Baba Ghanouj بابا غنوج 1-Fatoush فتوش 1-Olive Salad سلطة زيتون 1-Green Salad سلطة خضراء 1-Grapes Leaves ورق عنب 1-Tabouleh تبولة 1-Vegetable with Youghurt Salad سلطة خضار باللبن 1-Hot Salad سلطة حارة Total: 8.00 Cash 8.00 THANK YOU",
        "price_1": "8.00",
        "price_2": "8.00"
      },
      "processing_time": 3.91
    }
  ]
}
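A response in this shape can be loaded into typed records for downstream code. The sketch below is one possible approach; the OcrResult class and parse_response function are hypothetical names, and the fallback defaults for missing fields are assumptions rather than documented behavior.

```python
import json
from dataclasses import dataclass


@dataclass
class OcrResult:
    index: int
    language: str
    text: str
    processing_time: float


def parse_response(raw: str) -> list[OcrResult]:
    """Convert the JSON output shown above into a list of OcrResult records."""
    payload = json.loads(raw)
    results = []
    for entry in payload.get("results", []):
        data = entry.get("data", {})
        results.append(
            OcrResult(
                index=entry["index"],
                language=data.get("language_detected", ""),
                text=data.get("description_text", ""),
                processing_time=entry.get("processing_time", 0.0),
            )
        )
    return results
```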

TOON Format (Token-Efficient)

Compact notation that saves ~30% tokens - ideal for LLM processing and cost optimization.

Example Output:

model: Qwen/Qwen3-VL-30B-A3B-Instruct
image_count:1
total_time_seconds:3.76
results:
[1]{index,data,processing_time}:
0,{'language_detected':'ar','description_text':'TURKISH CORNER Date:6/10/2019 Time:6:56 PM Table:B12 Ticket No:243 -1Homus حمص 1-Mutabel متبل 1-Baba Ghanouj بابا غنوج 1-Fatoush فتوش 1-Olive Salad سلطة زيتون 1-Green Salad سلطة خضراء 1-Grapes Leaves ورق عنب 1-Tabouleh تبولة 1-Vegetable with Youghurt Salad سلطة خضار باللبن 1-Hot Salad سلطة حارة Total: 8.00 Cash 8.00 THANK YOU','price_1':'8.00','price_2':'8.00'},3.76

When to use each format:

JSON: Standard API integration, automated parsing, strict schema validation
TOON: Sending to LLMs for analysis, reducing token costs, human-readable logs
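If TOON output ever needs to be read back by a program instead of an LLM, the data row in the example above happens to be a valid Python literal tuple. The sketch below relies on that observation. This is an assumption based on one sample, not a documented guarantee, and it would fail on text containing unescaped single quotes; JSON remains the safer choice for automated parsing.

```python
import ast


def parse_toon_row(row: str) -> tuple[int, dict, float]:
    """Parse one `index,data,processing_time` row as it appears in the sample.

    Assumption: the row is a valid Python literal tuple. That holds for the
    sample output on this page but is not confirmed for every response.
    """
    index, data, processing_time = ast.literal_eval(f"({row})")
    return index, data, processing_time
```

ast.literal_eval only evaluates literals, so it does not execute code embedded in the OCR text.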

Use Cases

Financial documents: Invoices, receipts, bills
Multi-column tables: Spreadsheets, reports
Multilingual documents: Documents with Arabic, Urdu, Chinese, and other scripts
Form extraction: Structured data from forms

Python API Usage

Installation

$ pip install apify-client

Basic Example

from apify_client import ApifyClient
client = ApifyClient("YOUR_APIFY_TOKEN")
# Extract text from an image
run_input = {
    "images": ["https://example.com/receipt.jpg"],
    "output_format": "json",  # or "toon"
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)
# Read the results from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    data = item["results"][0]["data"]
    text = data["description_text"]
    language = data["language_detected"]
    print(f"Language: {language}\nText: {text}")

Batch Processing

# Process multiple images (reuses `client` from the Basic Example)
run_input = {
    "images": [
        "https://example.com/invoice1.jpg",
        "https://example.com/invoice2.jpg",
        "https://example.com/invoice3.jpg",
    ],
    "output_format": "json",
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)
# Print the first 100 characters of text for each image
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    for result in item["results"]:
        print(f"Image {result['index']}: {result['data']['description_text'][:100]}...")

TOON Format for LLM Processing

# Use TOON format to save ~30% tokens (reuses `client` from the Basic Example)
run_input = {
    "images": ["https://example.com/receipt.jpg"],
    "output_format": "toon",
}
run = client.actor("accelerationengg/vision-ocr-mcp").call(run_input=run_input)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    toon_output = item["content"]
    # Send directly to Claude for analysis; uses ~30% fewer tokens than JSON.
    # `claude` stands for an Anthropic client, e.g. anthropic.Anthropic().
    # response = claude.messages.create(
    #     model="claude-3-5-sonnet-20241022",
    #     max_tokens=1024,
    #     messages=[{
    #         "role": "user",
    #         "content": f"Analyze this receipt:\n{toon_output}"
    #     }]
    # )

Example Usage

Single Image

Extract text from this receipt:
https://example.com/receipt.jpg

Multiple Images

Process these invoices:
- https://example.com/invoice1.jpg
- https://example.com/invoice2.jpg
- https://example.com/invoice3.jpg
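Behind prompts like these, the MCP client issues a tools/call request to the server. The sketch below builds that JSON-RPC 2.0 message as defined by the Model Context Protocol; the helper name build_tool_call is hypothetical, and a real client also performs the initialization handshake and transport handling, which are not shown here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def build_tool_call(images: list[str], output_format: str = "json") -> dict:
    """Build the tools/call request an MCP client would send for vision_ocr."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": "vision_ocr",
            "arguments": {"images": images, "output_format": output_format},
        },
    }


if __name__ == "__main__":
    request = build_tool_call(["https://example.com/receipt.jpg"])
    print(json.dumps(request, indent=2))
```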

Built with Qwen-VL, FastMCP, Apify

URL: https://apify.com/accelerationengg/vision-ocr-mcp
