PDF Scraper

Pricing

$20.00/month + usage

Scrape and extract text from PDF links.

Rating

0.0

(0)

Developer

Onidivo Technologies

Maintained by Community

Actor stats

Total users: 512

Last modified: a year ago

Features

Scrape multiple files
Save the file and extracted text to the key-value store
Want more? Let us know here

Cost of usage

When the actor runs with 2048 MB of memory and datacenter proxies, average consumption is $4-8 per 1,000 medium-sized files.
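As a rough illustration of the figure above, the sketch below scales the quoted $4-8 per 1,000 files to an arbitrary batch size. It assumes cost grows linearly with the number of files, which the text does not state; the function name and parameters are hypothetical, and the monthly $20.00 fee is not included.

```python
def estimate_cost_usd(num_files, low_per_1000=4.0, high_per_1000=8.0):
    """Return a (low, high) usage-cost range in USD for num_files PDFs.

    Assumption: cost scales linearly with file count, using the
    $4-8 per 1,000 medium-sized files quoted for 2048 MB of memory
    and datacenter proxies.
    """
    if num_files < 0:
        raise ValueError("num_files must be non-negative")
    scale = num_files / 1000
    return (low_per_1000 * scale, high_per_1000 * scale)


low, high = estimate_cost_usd(2500)
print(f"Estimated usage cost for 2,500 files: ${low:.2f}-${high:.2f}")
```

Actual consumption depends on file size and proxy type, so treat the result as an order-of-magnitude guide only.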

Bugs, issues, features, and feedback

You can report issues on the Actor's "Issues" tab or here, and discuss or leave feedback here.

Input

You can provide input either through the editor on the Apify platform or as a JSON object.

The only mandatory field is the list of PDF URLs (pdfUrls).

An example of minimal input:

{
    "pdfUrls": [
        {
            "url": "http://www.pdf995.com/samples/pdf.pdf"
        }
    ],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
}

We recommend using proxies to overcome blocking and detection where required.
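The same input can also be sent programmatically. The sketch below builds the input object with the pdfUrls field named above and starts the actor through the apify-client Python package. The actor ID is taken from the store URL; the helper names and the APIFY_TOKEN environment variable are assumptions made for illustration, and the run is skipped when no token is set.

```python
import os

ACTOR_ID = "onidivo/pdf-scraper"  # taken from the actor's store URL


def build_input(urls, use_apify_proxy=True):
    """Build the actor input: a list of {"url": ...} objects plus proxy settings."""
    return {
        "pdfUrls": [{"url": u} for u in urls],
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


def run_actor(urls, token):
    """Run the actor and return its dataset items (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # imported lazily so build_input works without it

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=build_input(urls))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")  # assumed variable name
    if token:
        for item in run_actor(["http://www.pdf995.com/samples/pdf.pdf"], token):
            print(item.get("pdfUrl"))
    else:
        print("Set APIFY_TOKEN to run the actor.")
```

Keeping input construction separate from the network call makes it easy to inspect or log the JSON before spending any usage credit.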

Output

The extracted text is saved to the dataset, and it looks like this:

[
    {
        "pdfUrl": "http://www.pdf995.com/samples/pdf.pdf",
        "extractedText": "\n\n\n\n\n\n\n\n\nThe pdf995 suite of products - Pdf995, PdfEdit995, and Signature995 - is a complete solution for your document publishing needs. It provides ease of use, flexibility in format, and industry-standard security- and all at no cost to you.\nPdf995 makes it easy and affordable to create professional-quality documents in the popular PDF file format. Its easy-to-use interface helps you to create PDF files by simply selecting the \"print\" command from any application, creating documents which can be viewed on any computer with a PDF viewer. Pdf995 supports network file saving, fast user switching on XP, Citrix/Terminal Server, custom page sizes and large format printing. Pdf995 is a printer...",
        "extractedTextFileUrl": ""
    }
]
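The extracted text often starts with a run of newlines, as in the example above. The sketch below shows one way to tidy dataset items after a run. It relies only on the three fields shown in the sample output; the helper names are hypothetical, and the sample record is an abbreviated copy of the example, not real run output.

```python
import re


def clean_extracted_text(text):
    """Normalise line endings, collapse runs of 3+ newlines, and trim the edges."""
    text = text.replace("\r\n", "\n")
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def summarize(items):
    """Return one tidy record per dataset item."""
    return [
        {
            "pdfUrl": item.get("pdfUrl", ""),
            "text": clean_extracted_text(item.get("extractedText", "")),
            # An empty extractedTextFileUrl means no separate text file is linked
            "hasTextFile": bool(item.get("extractedTextFileUrl")),
        }
        for item in items
    ]


# Abbreviated copy of the sample output, for illustration only
sample = [{
    "pdfUrl": "http://www.pdf995.com/samples/pdf.pdf",
    "extractedText": "\n\n\n\n\n\n\n\n\nThe pdf995 suite of products - Pdf995, "
                     "PdfEdit995, and Signature995 - is a complete solution "
                     "for your document publishing needs.",
    "extractedTextFileUrl": "",
}]

for row in summarize(sample):
    print(row["pdfUrl"], row["hasTextFile"])
    print(row["text"])
```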

URL: https://apify.com/onidivo/pdf-scraper
