URL: https://apify.com/macheta/ocr-structured-extractor

OCR API: Image/PDF to OCR text + structured JSON (Gemini) [DEPRECATED] · Apify


OCR Structured Extractor (AI) - Image/PDF → OCR Text + JSON

Deprecated

Pricing

Pay per usage

Extract OCR text and structured JSON from an image or PDF URL. Great for invoices, receipts, forms, IDs, and tables. Powered by Gemini 3 Pro.

Rating

0.0 (0)

Developer

Anass

Maintained by Community

Actor stats

Bookmarked: 1
Total users: 42
Monthly active users: 5
Last modified: 4 months ago

OCR Structured Extractor (AI) - Image/PDF → OCR Text + Structured JSON

Extract OCR text and structured fields from an image URL or PDF URL using Gemini 3 Pro (through the same proxy Worker used by the other AI actors in this repo).

Keywords (SEO)

ocr api, pdf ocr, image ocr, pdf to json, image to json, invoice ocr, receipt ocr, form extraction, document understanding, gemini ocr, structured extraction, data extraction, ai document parser, id card ocr, table extraction

How it works

  1. Downloads your file from fileUrl
  2. Sends the bytes as inlineData to models/{model}:generateContent (JSON mode)
  3. Parses the model response and outputs:
    • text: full OCR transcription
    • data: structured fields (either a default structure or your extractionSchema)
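
The steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the Actor's source code: the helper names build_generate_content_request and parse_model_json are hypothetical, the request body follows the public Gemini generateContent REST shape (an inlineData part plus responseMimeType for JSON mode), and the proxy Worker the Actor goes through may expect a different envelope. The sketch also assumes the model replies with a JSON object that has text and data keys.

```python
import base64
import json


def build_generate_content_request(file_bytes, mime_type, instructions,
                                   model="gemini-3-pro-preview"):
    """Return (path, body) for a JSON-mode generateContent call.

    Hypothetical helper: field names follow the public Gemini REST API,
    not necessarily what the Actor's proxy Worker expects.
    """
    path = f"models/{model}:generateContent"
    body = {
        "contents": [{
            "parts": [
                # Step 2: the downloaded bytes travel base64-encoded as inlineData.
                {"inlineData": {
                    "mimeType": mime_type,
                    "data": base64.b64encode(file_bytes).decode("ascii"),
                }},
                {"text": instructions},
            ]
        }],
        # JSON mode: ask the model to answer with a JSON document.
        "generationConfig": {"responseMimeType": "application/json"},
    }
    return path, body


def parse_model_json(response_text):
    """Step 3: split the model's JSON answer into OCR text and structured data.

    Assumes the answer is an object with "text" and "data" keys.
    """
    parsed = json.loads(response_text)
    return parsed.get("text", ""), parsed.get("data", {})
```

Step 1 (the download) is left out here so the sketch stays free of network calls; any HTTP client that returns the raw bytes and the reported MIME type would fit in front of it.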

Best for

  • Invoices, receipts, and utility bills (key-value extraction)
  • Forms and screenshots (clean OCR + structured fields)
  • PDFs that mix text, tables, and images (document understanding)
  • Identity documents (IDs, passports) and card-style layouts

Input

  • fileUrl (string, required): Public URL to an image (png/jpg/webp) or a PDF
  • instructions (string, optional): Extraction instructions for the model
  • extractionSchema (object, optional): JSON object describing the structure you want in data
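  • model (string, default gemini-3-pro-preview): Gemini model name substituted into models/{model}:generateContent
  • maxBytes (int, default 52428800): Maximum number of bytes to download. The default is 50 MiB (50 × 1024 × 1024); inline PDF data is commonly limited to 50MB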
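
To illustrate what a maxBytes limit means in practice, here is a minimal sketch of a size-capped read. The function read_capped and its chunk_size parameter are hypothetical and are not taken from the Actor; the sketch only assumes a binary stream with a read(n) method.

```python
import io

DEFAULT_MAX_BYTES = 52428800  # 50 * 1024 * 1024, the documented default


def read_capped(stream, max_bytes=DEFAULT_MAX_BYTES, chunk_size=65536):
    """Read a binary stream fully, raising ValueError once it exceeds max_bytes."""
    buffer = io.BytesIO()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop early instead of buffering an oversized file.
            raise ValueError(f"file exceeds maxBytes ({max_bytes})")
        buffer.write(chunk)
    return buffer.getvalue()
```

The response object returned by urllib.request.urlopen exposes read(n), so it could be passed in as stream. Reading in chunks means an oversized file is rejected before it is held in memory in full.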

Supported file types

  • Images: image/png, image/jpeg, image/webp (and other image/* types if the server reports a correct MIME type)
  • Documents: application/pdf
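
The rule above (any image/* type, or application/pdf) can be expressed as a small check on the Content-Type header. This is a sketch only; is_supported_mime is a hypothetical name, and the Actor may apply stricter or looser rules.

```python
def is_supported_mime(content_type):
    """Return True for application/pdf and any image/* type."""
    # Drop parameters such as "; charset=binary" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime == "application/pdf":
        return True
    # Require a subtype after the slash, e.g. "image/png".
    return mime.startswith("image/") and len(mime) > len("image/")
```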

Output

The Actor stores:

  • Dataset: one item with fileUrl, mimeType, model, text, and data
  • Key-value store exports:
    • ocr.json (full JSON output)
    • ocr.txt (OCR text only, if available)

Dataset item example:

{
  "fileUrl": "https://example.com/invoice.pdf",
  "mimeType": "application/pdf",
  "model": "gemini-3-pro-preview",
  "text": "Invoice #INV-1002 ...",
  "data": {
    "summary": "Invoice from ACME Corp for January services.",
    "key_value_pairs": [
      {"key": "Invoice Number", "value": "INV-1002"},
      {"key": "Total", "value": "$1,249.00"}
    ]
  }
}
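
When no extractionSchema is given, the default structure in the example above stores fields as a list of key/value objects. A consumer will often want them as a plain mapping. The helper pairs_to_dict below is a hypothetical convenience, not part of the Actor, and it assumes the default data layout looks like the example.

```python
import json

item = json.loads("""
{
  "fileUrl": "https://example.com/invoice.pdf",
  "mimeType": "application/pdf",
  "model": "gemini-3-pro-preview",
  "text": "Invoice #INV-1002 ...",
  "data": {
    "summary": "Invoice from ACME Corp for January services.",
    "key_value_pairs": [
      {"key": "Invoice Number", "value": "INV-1002"},
      {"key": "Total", "value": "$1,249.00"}
    ]
  }
}
""")


def pairs_to_dict(dataset_item):
    """Flatten data.key_value_pairs into a plain dict (later keys win)."""
    pairs = dataset_item.get("data", {}).get("key_value_pairs", [])
    return {pair["key"]: pair["value"] for pair in pairs}


fields = pairs_to_dict(item)
print(fields["Invoice Number"])  # INV-1002
```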

Example input (custom schema)

{
  "fileUrl": "https://example.com/receipt.jpg",
  "instructions": "Extract receipt line items and totals. Return ONLY JSON.",
  "extractionSchema": {
    "merchant": "string",
    "date": "string",
    "currency": "string",
    "total": "string",
    "items": [
      {"name": "string", "qty": "string", "price": "string"}
    ]
  }
}
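
Because the model's output is not guaranteed to follow the requested structure, it can help to check the returned data against the schema before using it. The sketch below assumes one reading of the schema notation in the example above, which the Actor does not confirm: the literal "string" means a string value, an object lists required keys, and a one-element list describes every item of an array. The function matches_shape is hypothetical.

```python
def matches_shape(value, schema):
    """Check value against a schema written like the example input.

    Assumed semantics (not confirmed by the Actor): "string" means a str
    value, a dict means an object with at least those keys, and a
    one-element list means a list whose items all match that element.
    """
    if schema == "string":
        return isinstance(value, str)
    if isinstance(schema, dict):
        return isinstance(value, dict) and all(
            key in value and matches_shape(value[key], sub)
            for key, sub in schema.items()
        )
    if isinstance(schema, list) and len(schema) == 1:
        return isinstance(value, list) and all(
            matches_shape(item, schema[0]) for item in value
        )
    # Any other schema form is outside what this sketch understands.
    return False
```

A result that fails the check could be retried with more explicit instructions or routed to manual review.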

Prompt tips

  • For invoices/receipts: ask for merchant, invoice_number, date, currency, subtotal, tax, total, items[]
  • For IDs: ask for full_name, document_number, dob, expiry_date
  • If the document has tables, ask for rows with normalized columns
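
The field lists in the tips above can be turned into an instructions string programmatically. This is only a sketch: FIELD_HINTS and build_instructions are hypothetical names, and the wording of the generated prompt is an assumption modeled on the example input, not a format the Actor requires.

```python
FIELD_HINTS = {
    "invoice": ["merchant", "invoice_number", "date", "currency",
                "subtotal", "tax", "total", "items[]"],
    "id": ["full_name", "document_number", "dob", "expiry_date"],
}


def build_instructions(doc_type):
    """Compose an instructions string from the field list for doc_type."""
    fields = ", ".join(FIELD_HINTS[doc_type])
    return f"Extract the following fields: {fields}. Return ONLY JSON."


print(build_instructions("id"))
# Extract the following fields: full_name, document_number, dob, expiry_date. Return ONLY JSON.
```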
