Extract OCR text and structured fields from an image URL or PDF URL using Gemini 3 Pro (through the same proxy Worker used by the other AI actors in this repo).

Keywords (SEO)

ocr api, pdf ocr, image ocr, pdf to json, image to json, invoice ocr, receipt ocr, form extraction, document understanding, gemini ocr, structured extraction, data extraction, ai document parser, id card ocr, table extraction

How it works

Downloads your file from fileUrl
Sends the bytes as inlineData to models/{model}:generateContent (JSON mode)
Parses the model response and outputs:
- text: full OCR transcription
- data: structured fields (either a default structure or your extractionSchema)

Best for

Invoices, receipts, and utility bills (key-value extraction)
Forms and screenshots (clean OCR + structured fields)
PDFs that mix text, tables, and images (document understanding)
Identity documents (IDs, passports) and card-style layouts

Input

fileUrl (string, required): Public URL to an image (png/jpg/webp) or a PDF
instructions (string, optional): Extraction instructions for the model
extractionSchema (object, optional): JSON object describing the structure you want in data
model (string, default gemini-3-pro-preview)
maxBytes (int, default 52428800): Max size to download (PDF inline is commonly limited to 50MB)

Supported file types

Images: image/png, image/jpeg, image/webp (and other image/* types if the server reports a correct MIME type)
Documents: application/pdf

Output

The Actor stores:

Dataset: one item with fileUrl, mimeType, text, and data
Key-value store exports:
- ocr.json (full JSON output)
- ocr.txt (OCR text only, if available)

Dataset item example:

{
"fileUrl":"https://example.com/invoice.pdf",
"mimeType":"application/pdf",
"model":"gemini-3-pro-preview",
"text":"Invoice #INV-1002 ...",
"data":{
"summary":"Invoice from ACME Corp for January services.",
"key_value_pairs":[
{"key":"Invoice Number","value":"INV-1002"},
{"key":"Total","value":"$1,249.00"}
]
}
}

Example input (custom schema)

{
"fileUrl":"https://example.com/receipt.jpg",
"instructions":"Extract receipt line items and totals. Return ONLY JSON.",
"extractionSchema":{
"merchant":"string",
"date":"string",
"currency":"string",
"total":"string",
"items":[
{"name":"string","qty":"string","price":"string"}
]
}
}

Prompt tips

For invoices/receipts: ask for merchant, invoice_number, date, currency, subtotal, tax, total, items[]
For IDs: ask for full_name, document_number, dob, expiry_date
If the document has tables, ask for rows with normalized columns

👁 Restaurant Menu Scraper avatar

Restaurant Menu Scraper

wedo_software/wedo-scrape-menu

AI Restaurant Menu Scraper: Extract prices, descriptions, and allergens from images, PDFs, or web pages using OCR. Turn any restaurant URL into a structured Menu API.

👁 User avatar

Benjamin

161

👁 Image OCR Scraper avatar

Image OCR Scraper

seemuapps/image-ocr-scraper

Extract text from any image. Bulk OCR for screenshots, scanned documents, receipts, signs, and photos. Supports 109 languages and outputs clean Markdown or structured JSON with bounding boxes.

👁 User avatar

Andrew

Invoice & Receipt Extractor — Automated Document Data Extrac...

apricot_blackberry/invoice-receipt-extractor

Invoices and receipts → structured data. Amounts, dates, vendors, line items, tax details. Clean JSON, zero manual entry.

👁 User avatar

Creator Fusion

PDF Text Extractor

automation-lab/pdf-text-extractor

Extract text, metadata, and page-by-page content from PDF files. Provide PDF URLs and get structured JSON with full text, per-page text, page count, author, title, creation date, and more. Export as JSON, CSV, or Excel. No browser or proxy needed.

👁 User avatar

Stas Persiianenko

👁 Extract text from PDF avatar

Extract text from PDF

akash9078/pdf-text-extractor

Efficiently extract text content from PDF files, ideal for data processing, content analysis, and automation workflows. Supports various PDF structures and outputs clean, readable text.

👁 User avatar

Akash Kumar Naik

107

👁 Agentic Document Extractor avatar

Agentic Document Extractor

solutionssmart/agentic-document-extractor-local

Extract RAG-ready chunks with provenance from PDFs, scans, images, DOCX, XLSX, PPTX, CSV, TXT, and Markdown using a local-first Apify Actor.

👁 User avatar

Solutions Smart

👁 PDF Text Extractor - Bulk PDF to Text & Metadata avatar

PDF Text Extractor - Bulk PDF to Text & Metadata

santamaria-automations/pdf-extractor

Extract text and metadata from any PDF URL in bulk. Get page content, author, title, creation date, and more. Detects scanned PDFs that need OCR. Perfect for document analysis, research, and compliance.

👁 User avatar

Ale

AI Invoice Parser - Extract Receipt & Bill Data

ntriqpro/invoice-extraction-mcp

Automatically read invoices and receipts. Extract amounts, dates, and line items into structured data.

👁 User avatar

daehwan kim

👁 eCourts Case Scraper avatar

eCourts Case Scraper

codingfrontend/ecourts-case-scraper

A robust, high-performance utility designed for developer automation, data integration, and AI training. Features built-in captcha bypass, headful/headless browser execution, and proxy support to scrape eCourts data seamlessly, reliably, and at scale.

👁 User avatar

codingfrontend

5.0

(2)

👁 Scout — Lead Enrichment + OSINT avatar

Scout — Lead Enrichment + OSINT

logical_vivacity/scout

Email finder + lead enrichment + OSINT from public sources. Pass any fragment — name, email, or domain — get a verified dossier: 700+ identity sites, SMTP-validated emails, document mining, sanctions screen, domain→team discovery. $0.05 person, $0.15 domain. No API keys

👁 User avatar

Logical Vivacity

120

👁 📋 USPTO Patent Search — Claims & Prior Art avatar

📋 USPTO Patent Search — Claims & Prior Art

nexgendata/uspto-patent-search

Search US Patent & Trademark Office database. Extract patent titles, abstracts, claims, inventors & filing dates. Build IP research tools, prior art searches & patent analytics. Pay per patent.

👁 User avatar

NexGenData

URL: https://apify.com/macheta/ocr-structured-extractor

⇱ OCR API: Image/PDF to OCR text + structured JSON (Gemini) [DEPRECATED] · Apify

OCR Structured Extractor (AI) — Image/PDF → OCR Text + JSON

OCR Structured Extractor (AI) — Image/PDF → OCR Text + Structured JSON

Keywords (SEO)

How it works

Best for

Input

Supported file types

Output

Example input (custom schema)

Prompt tips

You might also like

Restaurant Menu Scraper

Image OCR Scraper

Invoice & Receipt Extractor — Automated Document Data Extrac...

PDF Text Extractor

Extract text from PDF

Agentic Document Extractor

PDF Text Extractor - Bulk PDF to Text & Metadata

AI Invoice Parser - Extract Receipt & Bill Data

eCourts Case Scraper

Scout — Lead Enrichment + OSINT

📋 USPTO Patent Search — Claims & Prior Art