PDF OCR API - Document Extraction

Pricing

from $200.00 / 1,000 pages processed

Extract text from PDFs including scanned documents. OCR processing, table extraction & structured data output. Process invoices, contracts & forms at scale.

Rating

0.0

(0)

Developer

The Howlers

Maintained by Community

Actor stats

Last modified

2 months ago

PDF OCR API

Extract text from PDF files using OCR. Supports scanned documents, images, and multi-page PDFs. Returns structured text with page numbers and confidence scores. Built by John Rippy (https://www.linkedin.com/in/johnrippy/ | https://johnrippy.link/).

Quick Start

Test with Demo Mode (free, no API key needed)

{
  "demoMode": true,
  "pdfUrl": ""
}

Run with real data

{
  "demoMode": false,
  "pdfUrl": "",
  "language": "eng",
  "outputFormat": "json",
  "detectTables": false
}
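
The two inputs above differ only in a few fields, so they can be assembled programmatically. The following is a minimal Python sketch, not the actor's own tooling. It assumes (the listing does not say so) that a real run needs exactly one of `pdfUrl` or `pdfBase64`, and that `pdfBase64` takes plain standard Base64 with no data-URI prefix. `build_run_input` is a hypothetical helper name.

```python
import base64
import json
from typing import Optional


def build_run_input(
    pdf_url: Optional[str] = None,
    pdf_bytes: Optional[bytes] = None,
    language: str = "eng",
    output_format: str = "json",
    detect_tables: bool = False,
    demo_mode: bool = False,
) -> dict:
    """Assemble an input object using the parameter names from the listing."""
    run_input = {
        "demoMode": demo_mode,
        "language": language,
        "outputFormat": output_format,
        "detectTables": detect_tables,
    }
    if demo_mode:
        # Demo mode returns sample output, so no document is needed.
        run_input["pdfUrl"] = ""
        return run_input
    # Assumption: a real run needs exactly one document source.
    if (pdf_url is None) == (pdf_bytes is None):
        raise ValueError("provide exactly one of pdf_url or pdf_bytes")
    if pdf_url is not None:
        run_input["pdfUrl"] = pdf_url
    else:
        # Assumption: plain standard Base64, no "data:" prefix.
        run_input["pdfBase64"] = base64.b64encode(pdf_bytes).decode("ascii")
    return run_input


if __name__ == "__main__":
    example = build_run_input(pdf_url="https://example.com/sample.pdf")
    print(json.dumps(example, indent=2))
```

The resulting dictionary can be pasted into the actor's input editor as JSON or passed to whichever client you use to start the run.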

Input Parameters

Parameter	Type	Default	Required	Description
`pdfUrl`	string	-	No	URL of the PDF file to process
`pdfBase64`	string	-	No	Base64-encoded PDF content (alternative to URL)
`language`	string	`"eng"`	No	Language hint for OCR (improves accuracy)
`pageRange`	string	-	No	Pages to process (e.g., '1-5' or '1,3,5'). Leave empty for all pages.
`outputFormat`	string	`"json"`	No	How to structure the output
`detectTables`	boolean	`false`	No	Attempt to preserve table structure
`demoMode`	boolean	`true`	No	Return sample output without processing (for testing)
`webhookUrl`	string	-	No	Optional URL to receive results via POST request when the actor completes
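
The `pageRange` examples ('1-5' and '1,3,5') suggest a comma-separated list of single pages and inclusive ranges. Below is a small Python sketch for checking such a string before submitting a run. The actor's exact grammar is not documented here, so accepting mixed forms such as '1-3,7', treating pages as 1-based, and rejecting reversed ranges are all assumptions. `parse_page_range` is a hypothetical helper name.

```python
from typing import List


def parse_page_range(spec: str) -> List[int]:
    """Expand a string like '1-5' or '1,3,5' into sorted, unique page numbers.

    An empty string returns an empty list, which stands for "all pages".
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Assumption: pages are 1-based and ranges are inclusive.
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_range("1-5"))
    print(parse_page_range("1,3,5"))
```

Counting the parsed pages also gives the number of "Page Processed" events to expect for a run, provided the document actually has that many pages.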

Pricing

This actor uses pay-per-event billing:

Event	Description	Price
Page Processed	Each PDF page processed with OCR	$0.02

Demo mode is free -- no charges for sample data.
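
To budget a batch, the per-page price can be multiplied out. This is a rough Python sketch that takes the $0.02 "Page Processed" figure from the table above as its default and assumes billing is strictly linear, with no minimum charge and no separate platform usage fees. Pass a different rate if your billing shows otherwise. `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# "Page Processed" price from the pricing table above.
PRICE_PER_PAGE = Decimal("0.02")


def estimate_cost(pages: int, price_per_page: Decimal = PRICE_PER_PAGE) -> Decimal:
    """Return the estimated charge in USD for a number of processed pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Decimal avoids binary floating-point rounding on currency values.
    return (price_per_page * pages).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimate_cost(37))    # 37 pages at $0.02 each
    print(estimate_cost(1000))  # 1,000 pages at $0.02 each
```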

Troubleshooting

"API error 429" or "Rate limit"

Too many requests. Wait a minute and try again, or reduce the number of items per run.
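
The "wait a minute and try again" advice can be automated with a retry loop. The sketch below is a generic Python pattern, not something the actor provides. `RateLimitError` is a hypothetical exception standing in for however your client reports a 429, and the one-minute starting delay, doubling factor, and jitter are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical exception standing in for an 'API error 429' failure."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying with exponential backoff when it is rate limited."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Double the wait each time and add up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

Wrap the function that starts a run (translating your client's 429 error into `RateLimitError`) and pass it as `call`. The `sleep` parameter exists so the loop can be exercised in tests without actually waiting.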

No results or empty dataset

Check the run log for error messages. Common causes:

- Invalid input format (check the examples above)
- The target data doesn't exist or is too small to track

How do I test without an API key?

Enable Demo Mode in the input. This returns realistic sample data so you can verify the output format works for your workflow.

Built by John Rippy | Actor Arsenal

URL: https://apify.com/alizarin_refrigerator-owner/pdf-ocr-api
