PDF Text Extractor API - URL to Text, Per-Page, Batch

Pricing

from $2.00 / 1,000 pages extracted

Turn any public PDF URL into clean text and metadata. Per-page output, batch processing, and a synchronous API mode for AI agents. Pay per page extracted, cheaper than the alternatives.

Rating

0.0

(0)

Developer

Jimmy A

Maintained by Community

Actor stats

Last modified: 6 days ago

What it does

Fetches each PDF URL (redirects followed, 60s timeout)
Extracts text page by page with line reconstruction (not one giant word soup)
Reads the document's own metadata (title, author, producer, dates) as published in the file
Outputs one structured record per document, with per-page text blocks if you want them

Use cases

RAG / AI pipelines: turn report URLs into chunks for embedding, page-aligned
Agents: call the standby endpoint as a tool - "read this PDF and answer"
Document monitoring: pair with a scheduler to extract recurring reports (filings, government publications, price lists)
Data entry automation: pull text from invoices, spec sheets, catalogs you have rights to process
Research: batch-extract paper PDFs into searchable text
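
As a rough illustration of the RAG use case, the sketch below splits per-page text into page-aligned chunks for embedding. It assumes records shaped like the Output example further down (a `url` plus a `pages` list of `page`/`text` entries). The function name `page_chunks` and the character-based `max_chars` limit are hypothetical choices, not part of the actor.

```python
from typing import Any, Iterator


def page_chunks(record: dict[str, Any], max_chars: int = 1000) -> Iterator[dict[str, Any]]:
    """Yield chunks that never span two pages, so each keeps its page number."""
    for page in record.get("pages", []):
        text = page["text"]
        # Fixed-size character windows; a real pipeline might split on sentences instead.
        for start in range(0, len(text), max_chars):
            yield {
                "url": record["url"],
                "page": page["page"],
                "text": text[start:start + max_chars],
            }


sample = {
    "url": "https://example.com/annual-report.pdf",
    "pages": [{"page": 1, "text": "a" * 2500}, {"page": 2, "text": "b" * 10}],
}
chunks = list(page_chunks(sample))
```

Keeping the page number on every chunk lets a retrieval step cite the page an answer came from.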

Input

{
  "pdfUrls": [
    "https://arxiv.org/pdf/1706.03762",
    "https://example.com/annual-report.pdf"
  ],
  "perPage": true,
  "maxPages": 500
}
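
A minimal sketch of assembling that input in Python. The field names `pdfUrls`, `perPage`, and `maxPages` come from the example above. The helper `build_run_input` and its validation rules are assumptions added for illustration, not documented behaviour of the actor.

```python
from typing import Any


def build_run_input(pdf_urls: list[str], per_page: bool = True, max_pages: int = 500) -> dict[str, Any]:
    """Build the actor input dict; the checks here are illustrative assumptions."""
    urls = [u.strip() for u in pdf_urls if u and u.strip()]
    if not urls:
        raise ValueError("at least one PDF URL is required")
    for u in urls:
        if not u.startswith(("http://", "https://")):
            raise ValueError(f"not an http(s) URL: {u}")
    if max_pages < 1:
        raise ValueError("max_pages must be positive")
    return {"pdfUrls": urls, "perPage": per_page, "maxPages": max_pages}


run_input = build_run_input(
    ["https://arxiv.org/pdf/1706.03762", "https://example.com/annual-report.pdf"]
)
```

With the official `apify-client` package, a dict like this would typically be passed as `run_input` when calling the actor.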

Output

{
  "url": "https://arxiv.org/pdf/1706.03762",
  "pageCount": 15,
  "pagesExtracted": 15,
  "truncated": false,
  "metadata": {"title": null, "author": null, "producer": "pdfTeX", "creationDate": "..."},
  "pages": [
    {"page": 1, "text": "Attention Is All You Need\n..."}
  ]
}

Set perPage: false for a single text field per document. Failed URLs produce a record with an error field instead of killing the run.
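
The sketch below shows one way a consumer might handle all three record shapes: per-page, single-field, and failed. The `error` and `pages` keys follow the description above. The key name `text` for the single-field case is an assumption, since the page does not spell it out, and `record_text` is a hypothetical helper.

```python
from typing import Any, Optional


def record_text(record: dict[str, Any]) -> Optional[str]:
    """Return the document text, or None when the URL failed."""
    if record.get("error"):
        return None  # failed URL: the record carries an error instead of text
    if "pages" in record:
        # perPage: true -> join the per-page blocks in order
        return "\n\n".join(page["text"] for page in record["pages"])
    # perPage: false -> single field; the key name "text" is assumed
    return record.get("text")


records = [
    {"url": "https://example.com/a.pdf", "pages": [{"page": 1, "text": "one"}, {"page": 2, "text": "two"}]},
    {"url": "https://example.com/b.pdf", "text": "whole document"},
    {"url": "https://example.com/missing.pdf", "error": "fetch failed"},
]
texts = [record_text(r) for r in records]
```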

API / Standby mode for AI agents

GET /?url=https://example.com/file.pdf&perPage=true&maxPages=50

Returns the full extraction JSON synchronously. Works as a tool for agent frameworks that support Apify actors.
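
A small sketch of building that request URL with proper query encoding. The query parameters `url`, `perPage`, and `maxPages` are taken from the example above. The base address is a hypothetical placeholder, because the page does not give the standby host, and any authentication the platform requires is left out.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: substitute the actor's real standby base URL.
STANDBY_BASE = "https://standby.example.invalid"


def standby_url(pdf_url: str, per_page: bool = True, max_pages: int = 50,
                base: str = STANDBY_BASE) -> str:
    """Build the GET URL, percent-encoding the PDF address."""
    query = urlencode({
        "url": pdf_url,
        "perPage": str(per_page).lower(),  # "true" / "false", as in the example
        "maxPages": max_pages,
    })
    return f"{base}/?{query}"


request_url = standby_url("https://example.com/file.pdf")
```

Encoding the `url` parameter matters when the PDF address itself contains `?` or `&`, which would otherwise be read as part of the outer query string.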

Pricing

Event	Price
Actor start	$0.0005
Per page extracted	$0.002
API call (standby)	$0.02

A 40-page report costs $0.08 in page charges (40 x $0.002), plus the $0.0005 actor start. Comparable actors charge $0.022-0.04 per page, roughly 11-20x more.
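
The arithmetic can be sketched as a small estimator using the three event prices from the table. It assumes the events simply add up. Whether a standby call is billed on top of, or instead of, an actor start is not stated here, so `estimate_cost` is an approximation under that assumption.

```python
from decimal import Decimal

# Event prices from the pricing table above, in USD.
ACTOR_START = Decimal("0.0005")
PER_PAGE = Decimal("0.002")
STANDBY_CALL = Decimal("0.02")


def estimate_cost(pages: int, runs: int = 1, standby_calls: int = 0) -> Decimal:
    """Sum the billed events; assumes each event is charged independently."""
    return ACTOR_START * runs + PER_PAGE * pages + STANDBY_CALL * standby_calls


report_cost = estimate_cost(pages=40)           # one run, 40 pages
page_charges_only = estimate_cost(40, runs=0)   # just the per-page part
```

`Decimal` is used so that sub-cent prices add up exactly instead of picking up binary floating-point error.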

FAQ

Does it do OCR on scanned PDFs? Not in this version. It extracts the text layer of digital PDFs (the overwhelming majority of reports, papers, and filings). Scanned-image PDFs return empty pages; an OCR tier is planned - ask in Issues if you need it.

How are lines handled? Text items are regrouped by their position on the page, so paragraphs read naturally instead of being one long line.
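
To make the idea concrete, here is a generic sketch of position-based line grouping. It is not the actor's actual implementation, which is not published on this page. `TextItem`, `group_into_lines`, and the 2-unit `tolerance` are hypothetical, and the y coordinate is assumed to grow upward as in PDF user space.

```python
from dataclasses import dataclass


@dataclass
class TextItem:
    text: str
    x: float  # left edge of the item
    y: float  # baseline; larger values are higher on the page (assumed)


def group_into_lines(items: list[TextItem], tolerance: float = 2.0) -> list[str]:
    """Group items with nearly equal baselines into lines, top to bottom."""
    lines: list[list[TextItem]] = []
    for item in sorted(items, key=lambda i: -i.y):
        # Compare against the first item of the current line to avoid drift.
        if lines and abs(lines[-1][0].y - item.y) <= tolerance:
            lines[-1].append(item)
        else:
            lines.append([item])
    # Within each line, order items left to right and join with spaces.
    return [" ".join(i.text for i in sorted(line, key=lambda i: i.x)) for line in lines]


items = [
    TextItem("Is", 60, 700.0),
    TextItem("Abstract", 10, 650.0),
    TextItem("Attention", 10, 700.5),
    TextItem("All", 80, 699.8),
]
lines = group_into_lines(items)
```

Multi-column layouts would need more than this, for example splitting on large horizontal gaps before joining.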

Maximum size? Default cap is 500 pages per document (configurable). Very large files are limited by fetch timeout (60s).

Password-protected PDFs? Not supported. Public, unencrypted documents only.

CSV/Excel export? Every Apify dataset exports as JSON, CSV, or Excel via the platform.

You might also like

PDF Text Extractor

automation-lab/pdf-text-extractor

Extract text, metadata, and page-by-page content from PDF files. Provide PDF URLs and get structured JSON with full text, per-page text, page count, author, title, creation date, and more. Export as JSON, CSV, or Excel. No browser or proxy needed.

Stas Persiianenko

PDF Parser API

george.the.developer/pdf-parser-api

Instant API that parses any PDF from a URL — extracts full text, page count, metadata (title, author, dates), and PDF version. Returns structured JSON. Perfect for document processing pipelines and AI agents.

George Kioko

PDF Text & Table Extractor (pdfplumber, batch URLs)

gochujang/pdf-text-extractor

Download any PDF by URL and extract clean per-page text + detected tables (as 2D arrays) + document metadata (title/author/created/modified). Powered by pdfplumber. Batch up to 50 PDFs. $0.01 per PDF + $0.0005 per page.

Hojun Lee

PDF Extractor: Structured Text + Metadata

aitoolbreakdown/atb-pdf-extractor

Point it at one or many PDF URLs. Get clean structured JSON back: full text, per-page text, title, author, page count, and word count. Ready for RAG, search, or doc automation.

AI Tool Breakdown

PDF Text Extractor - Bulk PDF to Text & Metadata

santamaria-automations/pdf-extractor

Extract text and metadata from any PDF URL in bulk. Get page content, author, title, creation date, and more. Detects scanned PDFs that need OCR. Perfect for document analysis, research, and compliance.

Ale

Pdf To Text Scraper

getdataforme/pdf-to-text-scraper

The Pdf To Text Scraper is an Apify Actor that efficiently extracts text from PDFs, preserving structure and supporting batch processing....

GetDataForMe

PDF Toolkit — Extract Text, Metadata & Page Count

accurate_pouch/pdf-toolkit

Extract text from PDFs, read metadata (title, author, dates), count pages. Bulk processing from URLs. $0.003 per PDF.

Manchitt Sanan

Pdf Text Extractor Pro

dainty_screw/pdf-text-extractor-pro

PDF Text Extractor lets you quickly extract text from PDF files with high accuracy. Supports text chunking for AI, chatbots, and large language models (LLMs), making PDF-to-text conversion fast, clean, and ready for NLP or machine learning.

codemaster devops

5.0

Fast Pdf Processor

contemporary_fruit/pdf-processor-actor

This API is a PDF Processing Service allowing users to upload a PDF to: Extract Text: Reads all text from the PDF and returns it as structured JSON data per page. Merge Pages: Creates a new PDF containing only the specific pages selected by the user.

Andric

PDF Scraper

onidivo/pdf-scraper

Scrape and extract text from PDF links.

Onidivo Technologies

512

URL: https://apify.com/gratifying_graph/pdf-extract-api
