URL: https://apify.com/lofomachines/epstein-files-scraper-api

Epstein Files Scraper, Downloader & Search API

Pricing

from $1.00 / 1,000 results

Quickly search, extract, and structure Epstein files with keyword-based discovery, automatic PDF text parsing, and AI-ready output.

Rating

0.0

(0)

Developer

πŸ‘ Lofomachines

Lofomachines

Maintained by Community

Actor stats

  • Bookmarked: 1
  • Total users: 70
  • Monthly active users: 3
  • Last modified: 3 months ago

Epstein Files Search, Count & Extraction API

Search public Epstein-related files by keyword, count matching files in seconds, or extract detailed file records with AI-ready text.

This Apify Actor is built for no-code users, researchers, journalists, OSINT teams, and automation builders who want a simple way to query the Justice.gov Epstein document index without writing scraping logic.

Why People Use This Actor

  • Find relevant Epstein files by one or more keywords
  • Get a fast count of total matching files before launching larger workflows
  • Extract structured file records with highlights and text previews
  • Feed clean results into AI tools, spreadsheets, databases, and automations

Two Simple Modes

1. Fetch matching files and text

Use this mode when you want the actual dataset rows for matching files, including metadata, snippets, and extracted text when available.

Best for:

  • investigative research
  • legal review
  • AI summarization pipelines
  • bulk export to Airtable, Sheets, Notion, or BI tools

2. Count total files per keyword

Use this mode when you only want to know how many files are available for each keyword.

Best for:

  • quick demand checks
  • low-cost keyword validation
  • pre-flight checks before large extraction runs
  • dashboards and monitoring workflows

When this mode is used, each returned keyword count is pushed with the pricing event name count-keyword.

What You Get

  • Multi-keyword search in one run
  • Fast count-only mode with one result row per keyword
  • Detailed extraction mode with file metadata and text previews
  • Automatic PDF text parsing when readable text is available
  • Structured dataset output ready for no-code tools and AI workflows
  • Proxy-ready setup for better reliability against blocking

Input Example: Detailed Extraction

{
  "mode": "fetch-details",
  "keywords": [
    "dentist",
    "table"
  ],
  "maxItems": 20,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": []
  }
}

maxItems behavior in detailed mode:

  • 20 = up to 20 detailed results for each keyword
  • 0 = fetch all detailed results for each keyword
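
If you build the input programmatically, a small helper can keep it in the shape shown above. This is only a sketch: `build_input` and its validation rules (rejecting empty keywords and negative `max_items`) are illustrative assumptions, not part of the Actor. The field names and the two mode values are the ones used in this README's examples.

```python
import json

# The two mode values that appear in this README's examples.
VALID_MODES = {"fetch-details", "count-only"}


def build_input(mode, keywords, max_items=0, use_proxy=True):
    """Build an Actor input dict shaped like the README examples (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Drop blank keywords and surrounding whitespace (an assumption, not Actor behavior).
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")
    if max_items < 0:
        raise ValueError("max_items must be 0 (fetch all) or a positive integer")
    run_input = {
        "mode": mode,
        "keywords": cleaned,
        "proxyConfiguration": {"useApifyProxy": use_proxy, "apifyProxyGroups": []},
    }
    # maxItems is only sent in detailed mode, where it caps results per keyword.
    if mode == "fetch-details":
        run_input["maxItems"] = max_items
    return run_input


if __name__ == "__main__":
    print(json.dumps(build_input("fetch-details", ["dentist", "table"], max_items=20), indent=2))
```

Passing `max_items=0` keeps the documented "fetch all" behavior, so it is worth setting an explicit limit while testing.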

Input Example: Count Only

{
  "mode": "count-only",
  "keywords": [
    "pinocchio",
    "massage"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": []
  }
}

In count-only mode, maxItems is ignored because the Actor returns one count row per keyword.

Output Example: Detailed Extraction

{
  "mode": "fetch-details",
  "eventName": null,
  "keyword": "epstein flight logs",
  "page": 1,
  "documentId": "doc_123",
  "chunkIndex": 0,
  "originFileName": "EFTA01638670.pdf",
  "originFileUri": "https://www.justice.gov/epstein/files/DataSet%2010/EFTA01638670.pdf",
  "sourceContentType": "application/pdf",
  "extractedText": "This is a parsed text preview from the original PDF...",
  "highlight": [
    "...keyword match snippet..."
  ],
  "processedAt": "2026-01-01T10:00:00Z",
  "indexedAt": "2026-01-01T10:05:00Z"
}
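
The `chunkIndex` field suggests that one file can appear as several rows, one per text chunk, though this README does not say so explicitly. Under that assumption, the sketch below stitches the rows for each file back together in chunk order. `merge_chunks` is a hypothetical helper; it also assumes that the same chunk can be returned once per matching keyword, so it keeps only the first copy.

```python
from collections import defaultdict


def merge_chunks(rows):
    """Join extractedText per file, ordered by chunkIndex (hypothetical helper)."""
    chunks_by_file = defaultdict(dict)
    for row in rows:
        index = row.get("chunkIndex", 0)
        text = row.get("extractedText") or ""  # text may be missing for unreadable PDFs
        # setdefault keeps the first copy if two keywords matched the same chunk.
        chunks_by_file[row["originFileUri"]].setdefault(index, text)
    return {
        uri: "\n".join(text for _, text in sorted(chunks.items()))
        for uri, chunks in chunks_by_file.items()
    }


if __name__ == "__main__":
    sample = [
        {"originFileUri": "a.pdf", "chunkIndex": 1, "extractedText": "second part"},
        {"originFileUri": "a.pdf", "chunkIndex": 0, "extractedText": "first part"},
    ]
    print(merge_chunks(sample)["a.pdf"])
```

Grouping by `originFileUri` rather than `documentId` is a design choice made here because the URI is clearly a per-file value in the example; adjust the key if your results show otherwise.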

Output Example: Count Only

{
  "mode": "count-only",
  "eventName": "count-keyword",
  "keyword": "pinocchio",
  "totalAvailableFiles": 17,
  "totalMatchingChunks": 17,
  "countedAt": "2026-03-18T16:48:11Z"
}

totalAvailableFiles is the important count for this mode. It is based on the unique file aggregation returned by the source, so it reflects total matching files rather than individual text chunks whenever that aggregation is available.
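
A pre-flight check built on these count rows could look like the sketch below. It picks the keywords whose `totalAvailableFiles` falls inside a range you choose before you launch a detailed run. `select_keywords` and its thresholds are hypothetical; only the field names come from the output example above.

```python
def select_keywords(count_rows, min_files=1, max_files=None):
    """Return keywords whose file count is within [min_files, max_files] (hypothetical helper)."""
    selected = []
    for row in count_rows:
        # Skip anything that is not a count-only row.
        if row.get("mode") != "count-only":
            continue
        total = row.get("totalAvailableFiles", 0)
        if total < min_files:
            continue
        if max_files is not None and total > max_files:
            continue
        selected.append(row["keyword"])
    return selected


if __name__ == "__main__":
    counts = [
        {
            "mode": "count-only",
            "eventName": "count-keyword",
            "keyword": "pinocchio",
            "totalAvailableFiles": 17,
            "totalMatchingChunks": 17,
        }
    ]
    print(select_keywords(counts, min_files=10))
```

The returned list can be passed as `keywords` to a `fetch-details` run, so only keywords worth extracting are billed as detailed results.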

Great Fit For No-Code Workflows

You can use this Actor as a drop-in data source for:

  • n8n
  • Make
  • Zapier
  • Google Sheets
  • Airtable
  • Notion
  • custom webhook pipelines
  • LLM and RAG workflows

Typical flow:

  1. Run the Actor with one or more keywords
  2. Read the dataset output
  3. Filter by keyword, file name, or total count
  4. Send the results into your app, sheet, or AI workflow
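
For readers who prefer a script over a no-code tool, the flow above can be sketched in Python. This is an illustration under several assumptions: the Actor ID is inferred from the store URL, `run_actor` relies on the third-party `apify-client` package, and `filter_rows` and `rows_to_csv` are hypothetical helpers that use the field names from the detailed output example.

```python
import csv
import io

# Assumed from the Actor's store URL; verify it on the Actor page before use.
ACTOR_ID = "lofomachines/epstein-files-scraper-api"


def run_actor(token, run_input):
    """Steps 1 and 2: run the Actor and read its dataset (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # third-party, imported only when called

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def filter_rows(rows, keyword=None, file_name_contains=None):
    """Step 3: keep rows matching a keyword and/or a file-name fragment."""
    kept = []
    for row in rows:
        if keyword is not None and row.get("keyword") != keyword:
            continue
        if file_name_contains and file_name_contains not in (row.get("originFileName") or ""):
            continue
        kept.append(row)
    return kept


def rows_to_csv(rows, fields=("keyword", "originFileName", "originFileUri")):
    """Step 4: serialize selected columns as CSV text for a sheet or another tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A typical call chain would be `rows_to_csv(filter_rows(run_actor(token, run_input), keyword="dentist"))`, where `token` is your own Apify API token.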

High-Value Use Cases

  • Validate whether a keyword has enough matching files before buying or running a large extraction
  • Build recurring monitors for specific names, places, organizations, or phrases
  • Enrich investigations with file names, highlights, and extracted text
  • Create keyword intelligence dashboards with low-friction count lookups

Quick Start

  1. Open the Actor on Apify
  2. Choose your mode
  3. Enter one or more keywords
  4. Run the Actor
  5. Use the dataset output directly in your workflow

Data Source Note

This Actor is designed to process publicly accessible document sources and return structured outputs for analysis, research, and automation.

Discover More Actors

If you want more ready-to-use scrapers and automation tools, explore the rest of the catalog here:

Discover more actors by Lofomachines
