PDF Extractor 2.0

Pricing

$30.00/month + usage

💫 Extract PDF Document Contents including Metadata, Images, Pages, Tables, Attachments, etc.

Rating

0.0

(0)

Developer

cat

Maintained by Community

Actor stats

Total users

173

Last modified

9 months ago

Welcome to PDF Extractor

🍂 About PDF Format

Portable Document Format (PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.[2][3] Based on the PostScript language, each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, vector graphics, raster images, and other information needed to display it. PDF has its roots in "The Camelot Project", initiated by Adobe co-founder John Warnock in 1991.[4] PDF was first standardized as ISO 32000 in 2008.[5] The latest edition, ISO 32000-2:2020, was published in December 2020.

🍂 About This Actor

💫 Extract contents from PDF documents

Features :

⭐ Extract PDF pages as text or images (SVG, PNG, JPEG).
⭐ Extract PDF metadata.
⭐ Extract the PDF table of contents.
⭐ Extract PDF tables.
⭐ Extract encrypted (password-protected) PDFs.
⭐ Extract embedded images.
⭐ Extract attachments.
⭐ Extract from multiple PDF URLs.

🍂 Tutorial

Input Parameters

Name	Type	Description
`url`	Array `[String]`	List of PDF document URLs
`content`	String	Output format for pages (`text`, `svg`, `png`, or `jpg`)
`images`	Boolean `(true/false)`	Extract embedded images
`attachments`	Boolean `(true/false)`	Extract embedded files
`tables`	Boolean `(true/false)`	Extract tables

Note : All extracted resources other than text are saved to the default key-value store.
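
As a minimal sketch, the input parameters above could be assembled in Python as follows. The helper name `build_input` and its default values are assumptions made for illustration; this page does not state the actor's actual defaults or which fields are required.

```python
import json

# Page formats listed in the parameter table for `content`.
ALLOWED_CONTENT = {"text", "svg", "png", "jpg"}


def build_input(urls, content="text", images=False, attachments=False, tables=False):
    """Build an input object from the five documented parameters.

    Hypothetical helper: the defaults chosen here are assumptions,
    not values documented by the actor.
    """
    if isinstance(urls, str):
        urls = [urls]  # accept a single URL for convenience
    urls = list(urls)
    if not urls:
        raise ValueError("at least one PDF URL is required")
    if content not in ALLOWED_CONTENT:
        raise ValueError(f"content must be one of {sorted(ALLOWED_CONTENT)}")
    return {
        "url": urls,
        "content": content,
        "images": bool(images),
        "attachments": bool(attachments),
        "tables": bool(tables),
    }


run_input = build_input(
    ["https://www.w3.org/WAI/WCAG21/working-examples/pdf-table/table.pdf"],
    tables=True,
)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the actor's input editor. When using the `apify-client` Python package, an object like this would typically be passed as `run_input` when calling the actor; the actor ID would presumably be `jupri/pdf-extractor-2-0`, inferred from this page's URL.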

Dataset Output Format :

[
# URL-1: Metadata
{"metadata":{"headers":{...},"url":"...","mime":"..."}},
# URL-1: Page Contents
{"index":0,"content":"...page-0 contents...","images":[...],"tables":[...]},
{"index":1,"content":"...page-1 contents...","images":[...],"tables":[...]},
...
# URL-2: Metadata
{"metadata":{"headers":{...},"url":"...","mime":"..."}},
# URL-2: Page Contents
{"index":0,"content":"...page-0 contents...","images":[...],"tables":[...]},
{"index":1,"content":"...page-1 contents...","images":[...],"tables":[...]},
...
]
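
Because the dataset is a flat list in which each metadata item is followed by that document's page items, a consumer usually needs to regroup the items per document. The sketch below shows one way to do that, assuming the item order shown above; `group_by_document` is a hypothetical helper and the sample items are made-up placeholders, not real actor output.

```python
def group_by_document(items):
    """Group a flat list of dataset items into one entry per PDF.

    Assumes each item with a "metadata" key starts a new document and
    every following item up to the next metadata item is one of its pages.
    """
    documents = []
    for item in items:
        if "metadata" in item:
            documents.append({"metadata": item["metadata"], "pages": []})
        elif documents:
            documents[-1]["pages"].append(item)
        else:
            raise ValueError("page item found before any metadata item")
    return documents


# Placeholder items shaped like the documented output format.
sample = [
    {"metadata": {"headers": {}, "url": "https://example.com/a.pdf", "mime": "application/pdf"}},
    {"index": 0, "content": "first page", "images": [], "tables": []},
    {"index": 1, "content": "second page", "images": [], "tables": []},
    {"metadata": {"headers": {}, "url": "https://example.com/b.pdf", "mime": "application/pdf"}},
    {"index": 0, "content": "only page", "images": [], "tables": []},
]

grouped = group_by_document(sample)
for doc in grouped:
    # Join page texts in order to rebuild the full document text.
    full_text = "\n".join(page["content"] for page in doc["pages"])
    print(doc["metadata"]["url"], len(doc["pages"]), "pages", len(full_text), "characters")
```

Raising an error on a page item that precedes any metadata item is a design choice for this sketch; a more lenient consumer could collect such pages under a placeholder document instead.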

🍂 Output Samples

PDF Sample #1

URL : https://www.w3.org/WAI/WCAG21/working-examples/pdf-table/table.pdf

{
}

PDF Sample #2

URL : https://apify.com/img/web-scraping/beginners-guide-to-web-scraping.pdf

{
}

✏️ Support

⚡️ Feel free to reach out to the developer for any issues or suggestions for improvement.

URL: https://apify.com/jupri/pdf-extractor-2-0
