URL: https://apify.com/parseforge/gutendex-project-gutenberg-books-scraper

Project Gutenberg Books Scraper | 70K+ Free eBooks

Pricing

from $19.00 / 1,000 result items

Export 70,000+ public-domain books from Project Gutenberg via the Gutendex API. Search by keyword, language, topic, or author lifespan, or fetch by book ID. Pull titles, authors, subjects, languages, download links, and full-text formats. Download as CSV, Excel, JSON, or XML.

Rating: 0.0 (0)

Developer: ParseForge (maintained by Community)

Actor stats: 0 bookmarked · 2 total users · 1 monthly active user · last modified a month ago

📚 Project Gutenberg (Gutendex) Scraper

🚀 Export 70,000+ public-domain books with metadata and full-text download links in seconds.

🕒 Last updated: 2026-05-26 · 📊 10 fields per record · 70,000+ books · 60+ languages · 10+ formats per book

This Apify Actor extracts structured data from Project Gutenberg (Gutendex), returning clean JSON / CSV / Excel / XML datasets ready for analytics, integrations, or research workflows. Built by ParseForge for reliability and freshness.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| Data analysts, engineers, researchers | Analytics pipelines, BI dashboards, datasets |
| SaaS, fintech, marketing, ops teams | Lead gen, enrichment, monitoring |
| Hobbyists, journalists, indie devs | Side projects, content, exploration |

📋 What the Project Gutenberg (Gutendex) Scraper does

  • Queries the public Project Gutenberg (Gutendex) API / feed and structures the response
  • Returns one record per item with 10 normalized fields
  • Supports filters configurable from the input schema
  • Outputs to CSV, Excel, JSON, XML via Apify dataset
  • Auto-limits to 10 items on the free plan; up to 1,000,000 on paid
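For orientation, the kind of request the Actor wraps can be sketched directly against Gutendex. This is not the Actor's own code. The endpoint `https://gutendex.com/books` and the parameter names (`search`, `languages`, `topic`, `author_year_start`, `author_year_end`, `ids`) are assumptions based on the public Gutendex documentation, and `build_gutendex_url` is a hypothetical helper that only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumed public Gutendex endpoint; it is not stated on this page.
GUTENDEX_BOOKS_URL = "https://gutendex.com/books"


def build_gutendex_url(search=None, languages=None, topic=None,
                       author_year_start=None, author_year_end=None, ids=None):
    """Hypothetical helper: build a Gutendex query URL from optional filters."""
    params = {}
    if search:
        params["search"] = search
    if languages:
        # Several language codes are sent as one comma-separated value.
        params["languages"] = ",".join(languages)
    if topic:
        params["topic"] = topic
    if author_year_start is not None:
        params["author_year_start"] = author_year_start
    if author_year_end is not None:
        params["author_year_end"] = author_year_end
    if ids:
        params["ids"] = ",".join(str(book_id) for book_id in ids)
    query = urlencode(params)
    return f"{GUTENDEX_BOOKS_URL}?{query}" if query else GUTENDEX_BOOKS_URL


print(build_gutendex_url(search="dickens", languages=["en"]))
# https://gutendex.com/books?search=dickens&languages=en
```

Keeping URL construction separate from the HTTP call makes the filter logic easy to check without touching the network.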

💡 Why it matters: clean, ready-to-query data without manual scraping, parsing, or babysitting an API client.

🎬 Full Demo (🚧 Coming soon)

⚙️ Input

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| maxItems | integer | No | Max items to return (free: 10, paid: 1,000,000) |

{"maxItems":50}
{"maxItems":1000}

⚠️ Good to Know: the free plan caps results at 10 items per run. Upgrade to a paid plan to unlock the full dataset.
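The input above can also be submitted programmatically. The sketch below prepares, but does not send, a request to Apify's REST API. The `run-sync-get-dataset-items` endpoint, the tilde-separated Actor ID derived from this page's URL, and Bearer-token authentication are assumptions drawn from Apify's general API conventions rather than from this page. `build_run_input` and `build_run_request` are hypothetical helpers, and the client-side cap only mirrors the plan limits quoted above.

```python
import json
from urllib.request import Request

# Assumed Actor ID: the "user/actor" slug from the page URL, joined with "~".
ACTOR_ID = "parseforge~gutendex-project-gutenberg-books-scraper"
FREE_PLAN_CAP = 10          # free plan: 10 items per run
PAID_PLAN_CAP = 1_000_000   # paid plans: up to 1,000,000 items per run


def build_run_input(max_items, paid_plan=False):
    """Clamp maxItems to the plan limit before submitting the run."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    cap = PAID_PLAN_CAP if paid_plan else FREE_PLAN_CAP
    return {"maxItems": min(max_items, cap)}


def build_run_request(token, run_input):
    """Prepare a POST request that would start a run and return its items."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request("YOUR_APIFY_TOKEN", build_run_input(50))
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` with a real token would actually start a run, so the sketch stops before that step.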

📊 Output

| Field | Type | Description |
| --- | --- | --- |
| 🔹 id | string | Project Gutenberg (Gutendex) id field |
| 🔹 title | string | Project Gutenberg (Gutendex) title field |
| 🔹 authors | string | Project Gutenberg (Gutendex) authors field |
| 🔹 languages | string | Project Gutenberg (Gutendex) languages field |
| 🔹 download_count | string | Project Gutenberg (Gutendex) download_count field |
| 🕒 scrapedAt | string | ISO timestamp of when the record was collected |
| ❌ error | string | null |

{
"id":"...",
"title":"...",
"scrapedAt":"2026-05-26T00:00:00.000Z",
"error":null
}
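When consuming the dataset, it can help to separate usable records from error rows and to parse `scrapedAt` into a timezone-aware value. The following is a minimal sketch under two assumptions: a failed record carries a non-empty `error` string, and `scrapedAt` always ends in `Z`. `split_records` and `parse_scraped_at` are hypothetical names, and the second sample row is made up for illustration.

```python
from datetime import datetime


def split_records(items):
    """Split dataset items into usable records and error rows."""
    ok, failed = [], []
    for item in items:
        # Assumption: a non-empty "error" value marks a failed record.
        (failed if item.get("error") else ok).append(item)
    return ok, failed


def parse_scraped_at(value):
    """Parse the ISO timestamp; 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


sample = [
    {"id": "...", "title": "...",
     "scrapedAt": "2026-05-26T00:00:00.000Z", "error": None},
    {"error": "request failed"},  # made-up error row for illustration
]

ok, failed = split_records(sample)
print(len(ok), len(failed))
print(parse_scraped_at(ok[0]["scrapedAt"]).isoformat())
```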

✨ Why choose this Actor

| Differentiator | Benefit |
| --- | --- |
| 🟢 Real-time public data | Always fresh, never cached |
| 🟢 Structured output | Ready for BI, Excel, SQL imports |
| 🟢 Pay-per-result pricing | Pay only for actual data collected |
| 🟢 Apify-hosted runs | No servers, no maintenance |
| 🟢 Free tier preview | Test with 10 items before scaling |

📈 How it compares to alternatives

| Method | Setup time | Reliability | Maintenance | Cost |
| --- | --- | --- | --- | --- |
| Manual scripting | Hours | Medium | High | Dev time |
| Generic web scrapers | Hours | Low | High | Variable |
| This Actor | 30 seconds | High | None | Pay-per-result |

🚀 How to use

  1. Create a free Apify account with $5 credit
  2. Open this Actor and click Try for free
  3. Configure the input (maxItems and any filters)
  4. Click Start
  5. Download the dataset as CSV / Excel / JSON / XML
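The download in the last step can also be scripted. As a rough sketch, the helper below builds a dataset export URL for each of the four formats named above. The `/v2/datasets/{id}/items` endpoint and its `format` query parameter are assumptions based on Apify's public API, `dataset_export_url` is a hypothetical helper, and `DATASET_ID` is a placeholder for the ID of a finished run's dataset.

```python
from urllib.parse import urlencode

# Map the format names used on this page to assumed Apify API format values.
EXPORT_FORMATS = {"csv": "csv", "excel": "xlsx", "json": "json", "xml": "xml"}


def dataset_export_url(dataset_id, fmt="json"):
    """Build the export URL for a dataset in one of the four listed formats."""
    try:
        api_format = EXPORT_FORMATS[fmt.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    query = urlencode({"format": api_format})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


print(dataset_export_url("DATASET_ID", "Excel"))
# https://api.apify.com/v2/datasets/DATASET_ID/items?format=xlsx
```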

💼 Business use cases

📊 Analytics & BI: feed Project Gutenberg (Gutendex) records into Looker, Tableau, Metabase for live dashboards.

🧠 Data enrichment: append Project Gutenberg (Gutendex) fields to your CRM / CDP records for better targeting.

🔍 Monitoring: schedule daily runs to detect new records, changes, or anomalies.

🤖 ML training data: use structured Project Gutenberg (Gutendex) data as features in models or test fixtures.
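The monitoring use case comes down to comparing two runs. Below is a minimal sketch of that comparison, assuming each record has a stable `id` and that `scrapedAt` should be ignored because it changes on every run. `diff_snapshots` is a hypothetical helper and the sample records are made up for illustration.

```python
def diff_snapshots(previous, current, key="id", ignore=("scrapedAt",)):
    """Return (added, removed, changed) records between two dataset snapshots."""

    def comparable(record):
        # Drop fields that differ on every run so they do not count as changes.
        return {k: v for k, v in record.items() if k not in ignore}

    prev_by_key = {record[key]: record for record in previous}
    curr_by_key = {record[key]: record for record in current}

    added = [r for k, r in curr_by_key.items() if k not in prev_by_key]
    removed = [r for k, r in prev_by_key.items() if k not in curr_by_key]
    changed = [
        r for k, r in curr_by_key.items()
        if k in prev_by_key and comparable(r) != comparable(prev_by_key[k])
    ]
    return added, removed, changed


# Made-up records for illustration only.
yesterday = [
    {"id": "1", "download_count": "100", "scrapedAt": "2026-05-25T00:00:00.000Z"},
    {"id": "2", "download_count": "50", "scrapedAt": "2026-05-25T00:00:00.000Z"},
]
today = [
    {"id": "1", "download_count": "120", "scrapedAt": "2026-05-26T00:00:00.000Z"},
    {"id": "2", "download_count": "50", "scrapedAt": "2026-05-26T00:00:00.000Z"},
    {"id": "3", "download_count": "7", "scrapedAt": "2026-05-26T00:00:00.000Z"},
]

added, removed, changed = diff_snapshots(yesterday, today)
print([r["id"] for r in added], [r["id"] for r in removed], [r["id"] for r in changed])
# ['3'] [] ['1']
```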

🔌 Automating Project Gutenberg (Gutendex) Scraper

Connect via Apify integrations: Make, Zapier, n8n, Slack, Discord, Airbyte, Google Drive, Google Sheets, GitHub, Webhook, REST API.

🌟 Beyond business use cases

🔬 Research: academics, journalists, policy researchers building public datasets.

🎨 Personal: hobbyists tracking favorite topics, building personal projects, exploring data.

🤝 Non-profit: NGOs, civic-tech, open-data initiatives needing structured Project Gutenberg (Gutendex) extracts.

🧪 Experimentation: students, builders, indie devs prototyping with real-world data.

🤖 Ask an AI assistant about this scraper

Drop a link to this page into ChatGPT, Claude, Perplexity, or Copilot and ask: "What can I do with the Project Gutenberg (Gutendex) Scraper?"

❓ Frequently Asked Questions

❓ Is the data fresh? Yes. Every run queries Project Gutenberg (Gutendex) in real time. No cached responses.

💰 How does pricing work? Pay-per-result: you are charged once per record collected. The free plan previews up to 10 items per run.

🧾 What format is the output? CSV, Excel, JSON, or XML. The Apify dataset supports all four.

🔐 Do I need an API key? No. The Actor reads only public Project Gutenberg (Gutendex) data, and no login is required.

📈 What's the max I can scrape? 1,000,000 items per run on paid plans.

⏱️ How long does a run take? Typically seconds to minutes, depending on maxItems.

🌐 Is this affiliated with Project Gutenberg (Gutendex)? No. It is an independent tool that collects only publicly accessible data.

🛠️ What if a run fails? You're never charged for failed records. Errors appear as {error: "..."} rows.

🤝 Can I use this commercially? Yes, under Apify's standard terms and Project Gutenberg (Gutendex)'s public-data policies.

📞 Where do I get help? Open our contact form.
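The pay-per-result answer above can be turned into a quick estimate. This sketch assumes the listed starting rate of $19.00 per 1,000 result items applies uniformly and that only successful records are billed. Actual charges may differ by plan, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# Listed starting price: "from $19.00 / 1,000 result items".
LISTED_RATE_PER_1000 = Decimal("19.00")


def estimate_cost(successful_items, rate_per_1000=LISTED_RATE_PER_1000):
    """Estimate the charge in USD for a number of successfully collected records."""
    if successful_items < 0:
        raise ValueError("successful_items must not be negative")
    cost = Decimal(successful_items) * rate_per_1000 / Decimal(1000)
    # Round to whole cents.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost(50))    # 0.95
print(estimate_cost(1000))  # 19.00
```

At that rate, a full 1,000,000-item run would come to $19,000.00, so it is worth setting maxItems deliberately.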

🔌 Integrate with any app

Apify's full integration list: Make · Zapier · n8n · Slack · Discord · Webhook · REST API · GitHub · Airbyte · Google Drive · Google Sheets · Gmail · HubSpot · Pipedream.

🔗 Recommended Actors

| Actor | Description |
| --- | --- |
| Wikipedia On This Day Scraper | Daily Wikipedia historical events |
| Public Holidays Scraper | Holidays for 100+ countries |
| SPDX Software Licenses Scraper | Open-source license metadata |
| ISO Country Codes Scraper | IBAN + ISO country codes |

💡 Pro Tip: browse the complete ParseForge collection.

🆘 Need Help? Open our contact form

⚠️ Disclaimer: independent tool, not affiliated with Project Gutenberg (Gutendex). Only publicly available data collected.
