
URL: https://apify.com/devilscrapes/openlibrary-books-scraper

Open Library Scraper: Book Metadata API & ISBN Lookup · Apify


Open Library Scraper: Book Metadata in Bulk


Search the Open Library API (the Internet Archive's open book catalogue) and export structured book metadata (title, authors, ISBNs, subjects, publish year, cover URL, edition count, Open Library ID) to JSON or CSV. We handle pagination and retries across 30M+ works.

Pricing: Pay per event
Rating: 0.0 (0)
Developer: DevilScrapes (maintained by Community)
Actor stats: 0 bookmarked, 2 total users, 0 monthly active users
Last modified: 18 days ago


🎯 What this scrapes

Open Library is the Internet Archive's catalogue of 30M+ works β€” the open, canonical bibliographic source that most reliable book-metadata pipelines lean on. When Goodreads shut their developer API in 2020, they left a gap that five years later developers are still Googling around. Open Library fills it: no licensing hurdles, no API key friction, free bulk export β€” if you can navigate the pagination and handle the upstream's occasional rate-limiting.

This Actor turns a free-form query (title, author, ISBN, subject) into typed dataset rows with cover URL, subjects, edition count, and the canonical Open Library key. We pace requests against the upstream, retry on transient errors, and surface partial successes loudly β€” so your library, recommender, or research dataset gets the rows it expects.

🔥 What we handle for you

  • 🛡️ Browser fingerprint rotation: curl-cffi impersonates real Chrome / Firefox / Safari TLS handshakes so the target sees a browser, not Python.
  • 🌐 Residential proxy rotation via Apify Proxy (when enabled in proxyConfiguration): fresh session and exit IP on every block.
  • 🔁 Retries with exponential backoff on 408 / 429 / 5xx: up to 5 attempts per page, Retry-After honoured.
  • 🧱 Rate-limit-aware pacing: when the target pushes back, we slow down instead of getting banned.
  • 🧊 Clean, typed dataset rows: Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
  • 💰 Pay-Per-Event pricing: you pay per result that hits your dataset, plus a small per-run start charge (see Pricing).
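
The retry behaviour in the list above can be sketched as follows. This is a minimal illustration, not the Actor's actual code: the helper names `is_retryable` and `backoff_delay` are hypothetical, and the `base` and `cap` values are assumptions chosen for the example.

```python
import random
from typing import Optional

# Statuses named in the list above; anything else is treated as final.
RETRYABLE = {408, 429} | set(range(500, 600))
MAX_ATTEMPTS = 5  # "up to 5 attempts per page"


def is_retryable(status: int) -> bool:
    """True for the statuses the list above says are retried."""
    return status in RETRYABLE


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to sleep before retry number `attempt` (1-based).

    `base` and `cap` are illustrative values, not the Actor's settings.
    """
    if retry_after is not None and retry_after.strip().isdigit():
        # Honour Retry-After in its delta-seconds form.
        # The HTTP-date form falls through to plain backoff in this sketch.
        return float(retry_after.strip())
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Full jitter: pick uniformly below the exponential ceiling.
    return random.uniform(0.0, ceiling)
```

A caller would loop up to `MAX_ATTEMPTS` times, sleep for `backoff_delay(attempt, response_headers.get("Retry-After"))` whenever `is_retryable(status)` is true, and give up otherwise.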

💡 Use cases

  • Goodreads alternative API: rebuild the bibliographic data layer Goodreads took away in 2020. It covers title, authors, ISBNs, subjects, cover URL, and edition count, the fields every book-rec app needs.
  • ISBN lookup at bulk scale: enrich a CSV of book titles with ISBNs, authors, and covers in one run. Better unit economics than a per-request ISBN lookup API.
  • Free book metadata API: feed a reading-list dashboard, a library catalogue app, or a fiction-RAG backend with structured Open Library data. No licensing restrictions on bibliographic metadata.
  • Discovery pipelines: list every Asimov novel + edition count for a fan-site backend, or enumerate every title tagged "machine learning" for a curated reading list.
  • Digital humanities: seed subject-tag corpora for distant-reading research, cultural-analytics, or AI-tutor curriculum ingestion.

βš™οΈ How to use it

  1. Click Try for free at the top of the page.
  2. Fill in the input form β€” most fields have sensible defaults.
  3. Click Start. Output streams into the run's dataset.
  4. Export from Storage β†’ Dataset as JSON, CSV, or Excel β€” or fetch via the API.
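
For the "fetch via the API" route in step 4, one option is Apify's synchronous run endpoint, which starts a run and returns its dataset items in one response. The sketch below uses only the standard library. The Actor ID is derived from this page's URL (with `~` in place of `/`), and the function names are hypothetical. Treat it as a starting point under those assumptions.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Actor ID taken from this page's URL, "user/actor" written with "~".
ACTOR_ID = "devilscrapes~openlibrary-books-scraper"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that runs the Actor and returns its items."""
    query = urllib.parse.urlencode({"token": token, "format": "json"})
    url = (
        "https://api.apify.com/v2/acts/"
        f"{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the JSON array of dataset items."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_actor(my_token, {"searchQuery": "foundation asimov", "maxResults": 3})` would block until the run finishes, so longer jobs are usually better served by starting a run and reading the dataset afterwards.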

📥 Input

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| searchQuery | string | yes | "isaac asimov foundation" | Free-text search. Open Library matches across title, author, subject, ISBN. |
| searchField | string | no | "all" | Narrow which field your query targets. "all" matches everywhere. |
| maxResults | integer | no | 30 | Max books to return. API caps per page at 100; we paginate. |
| language | string | no | "" | 3-letter ISO-639-2 code, e.g. eng, spa, fre. Leave empty for all. |
| proxyConfiguration | object | no | {"useApifyProxy": false} | Open Library is open. Proxy optional. |

Example input

{
  "searchQuery": "foundation asimov",
  "searchField": "all",
  "maxResults": 3,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

📤 Output

Every row is one dataset item.

| Field | Type | Notes |
| --- | --- | --- |
| openlibrary_key | string | Open Library work key (e.g. /works/OL12345W). |
| title | string | Work title. |
| subtitle | string or null | Subtitle, when present. |
| authors | array | Author names. |
| first_publish_year | integer or null | Earliest publication year recorded. |
| edition_count | integer | Number of editions Open Library tracks. |
| languages | array | Language codes detected across editions. |
| subjects | array | Subject tags (up to 30, truncated). |
| isbns | array | ISBNs detected (10 and 13). |
| publishers | array | Publishers across editions (deduped). |
| cover_id | integer or null | Open Library cover image ID. |
| cover_url_l | string or null | Large cover image URL. |
| ratings_average | number or null | Average rating where Open Library has one. |
| ratings_count | integer or null | Rating count. |
| ebook_access | string or null | Open Library's e-book availability: public, borrowable, no_ebook, printdisabled. |
| work_url | string | Canonical Open Library URL. |
| scraped_at | string | When this row was recorded. |

Example output

{
  "openlibrary_key": "/works/OL471576W",
  "title": "Foundation",
  "authors": [
    "Isaac Asimov"
  ],
  "first_publish_year": 1951,
  "edition_count": 142,
  "work_url": "https://openlibrary.org/works/OL471576W"
}
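
The example output above shows only a subset of the documented fields, so downstream code is safer treating everything beyond the key, title, and URL as optional. Here is one way to load a row into a typed object. `BookRow` and its `work_id` helper are hypothetical names for illustration, and only a few of the documented fields are modelled.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class BookRow:
    """A partial, illustrative view of one dataset item."""
    openlibrary_key: str
    title: str
    work_url: str
    authors: List[str] = field(default_factory=list)
    isbns: List[str] = field(default_factory=list)
    first_publish_year: Optional[int] = None
    edition_count: int = 0
    cover_url_l: Optional[str] = None

    @classmethod
    def from_item(cls, item: Dict[str, Any]) -> "BookRow":
        """Build a row, tolerating fields that are absent or null."""
        return cls(
            openlibrary_key=item["openlibrary_key"],
            title=item["title"],
            work_url=item["work_url"],
            authors=list(item.get("authors") or []),
            isbns=list(item.get("isbns") or []),
            first_publish_year=item.get("first_publish_year"),
            edition_count=item.get("edition_count") or 0,
            cover_url_l=item.get("cover_url_l"),
        )

    @property
    def work_id(self) -> str:
        """Last path segment of the work key, e.g. 'OL471576W'."""
        return self.openlibrary_key.rsplit("/", 1)[-1]
```

Applied to the example item, `BookRow.from_item(item).work_id` gives the bare work identifier, which is handy as a primary key in your own tables.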

💰 Pricing

Pay-Per-Event. You pay only when these events fire:

| Event | USD | What it is |
| --- | --- | --- |
| actor-start | $0.005 | One-off warm-up charge per run |
| result | $0.0015 | Per dataset item |

Example: 1,000 results at the rates above ≈ $1.50 ($1.50 in result events plus the $0.005 actor-start charge). No subscription, no minimum, no card to start. Apify gives every new account $5 of free credit.
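
The arithmetic, including the per-run actor-start charge, can be sketched as below. The rates are copied from the table above; `estimate_cost` is a hypothetical helper, and it assumes the start charge fires exactly once per run.

```python
from decimal import Decimal

# Rates copied from the pricing table above.
ACTOR_START_USD = Decimal("0.005")
RESULT_USD = Decimal("0.0015")


def estimate_cost(results: int, runs: int = 1) -> Decimal:
    """Estimated USD cost for `results` dataset items spread over `runs` runs."""
    if results < 0 or runs < 0:
        raise ValueError("results and runs must be non-negative")
    return runs * ACTOR_START_USD + results * RESULT_USD
```

Using `Decimal` keeps sub-cent rates exact, which matters once estimates are summed across many runs.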

🚧 Limitations

  • Search uses Open Library's relevance ranking. For canonical bibliographic data (LCSH/Dewey), use a dedicated MARC source. Subjects are tags, not curated taxonomies.
  • This Actor exports metadata only: titles, ISBNs, authors, subjects, cover URLs, publish years. It does not download book text or full-text content. For public-domain full-text, follow work_url to the Internet Archive reader.
  • Open Library has thinner rating data than Goodreads. Treat ratings_average with caution for niche or older works.

❓ FAQ

Is this a Goodreads alternative API?

Yes, for the bibliographic data layer. Goodreads shut their developer API in December 2020. Open Library provides the same core fields (title, authors, ISBNs, subjects, cover URL, edition count) under a fully open licence. This Actor is the managed bulk-export layer on top of that catalogue.

Can I do ISBN lookup in bulk?

Yes. Pass searchField: "isbn" with a specific ISBN-10 or ISBN-13 as searchQuery, or use searchField: "all" with a title + author combination to retrieve ISBNs at scale. Each result row returns the full isbns array for all editions of a work.
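
Before spending events on a large ISBN list, it can pay to drop malformed values first. The sketch below validates ISBN-10 and ISBN-13 check digits and builds one input object per valid ISBN, using the `searchQuery` and `searchField` fields from the Input table. The function names are hypothetical, and running one Actor run per ISBN is an assumption based on the single `searchQuery` field.

```python
from typing import Dict, Iterable, List


def normalize_isbn(raw: str) -> str:
    """Strip hyphens and spaces; upper-case a trailing ISBN-10 'x'."""
    return raw.replace("-", "").replace(" ", "").upper()


def _ascii_digits(text: str) -> bool:
    return text.isascii() and text.isdigit()


def is_valid_isbn10(isbn: str) -> bool:
    """Weighted sum (10 down to 1) must be divisible by 11; 'X' counts as 10."""
    if len(isbn) != 10 or not _ascii_digits(isbn[:9]):
        return False
    if not (_ascii_digits(isbn[9]) or isbn[9] == "X"):
        return False
    digits = [int(c) for c in isbn[:9]] + [10 if isbn[9] == "X" else int(isbn[9])]
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0


def is_valid_isbn13(isbn: str) -> bool:
    """Alternating weights 1 and 3; the sum must be divisible by 10."""
    if len(isbn) != 13 or not _ascii_digits(isbn):
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(isbn))
    return total % 10 == 0


def build_isbn_inputs(raw_isbns: Iterable[str]) -> List[Dict[str, str]]:
    """One Actor input per valid ISBN; invalid entries are skipped."""
    inputs = []
    for raw in raw_isbns:
        isbn = normalize_isbn(raw)
        if is_valid_isbn10(isbn) or is_valid_isbn13(isbn):
            inputs.append({"searchQuery": isbn, "searchField": "isbn"})
    return inputs
```

Skipped entries are worth logging separately so typos in the source CSV can be corrected rather than silently lost.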

Where's the book description / blurb?

The search API doesn't include long descriptions; for those, follow up with the work's JSON record, i.e. the row's openlibrary_key followed by .json (for example /works/OL12345W.json). We surface enough to enrich a catalogue or recommendation engine.
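
A follow-up lookup for descriptions might look like the sketch below. It assumes the row's `openlibrary_key` already starts with `/works/`, as in the Output table, and that a work's `description` can arrive either as a plain string or as an object with a `value` field. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def work_json_url(openlibrary_key: str) -> str:
    """Turn '/works/OL12345W' into the work's JSON record URL."""
    return f"https://openlibrary.org{openlibrary_key}.json"


def extract_description(work: Dict[str, Any]) -> Optional[str]:
    """Return the description text, whichever of the two shapes it takes."""
    description = work.get("description")
    if isinstance(description, dict):
        description = description.get("value")
    return description if isinstance(description, str) else None


def fetch_description(openlibrary_key: str, timeout: float = 30.0) -> Optional[str]:
    """Fetch one work record and pull out its description, if any."""
    with urllib.request.urlopen(work_json_url(openlibrary_key), timeout=timeout) as resp:
        return extract_description(json.load(resp))
```

Each call is one extra request per work, so it is worth pacing these lookups and caching results rather than fetching descriptions for every row on every run.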

Why are some ISBNs missing?

Older works weren't always catalogued with ISBNs. We return what Open Library has.

Can I download the book text?

Not via this Actor. We export metadata only. Visit work_url and follow Open Library's reader flow for public-domain full text.

What about the Open Library API directly?

Open Library's /search.json endpoint is public, but handling pagination, rate-limit pacing, retries, and clean typed output at scale is the work this Actor absorbs. We handle the blocks so you get consistent rows.
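
To give a sense of the pagination piece when calling /search.json directly, the sketch below plans the page URLs needed for a given result count. It uses the `q`, `page`, and `limit` query parameters and the 100-per-page cap mentioned in the Input notes. `page_urls` is a hypothetical helper and does not show how the Actor itself builds requests.

```python
import math
import urllib.parse
from typing import List

SEARCH_URL = "https://openlibrary.org/search.json"
PAGE_CAP = 100  # per-page cap quoted in the Input notes


def page_urls(query: str, max_results: int) -> List[str]:
    """URLs for every page needed to cover `max_results` rows.

    The page size stays fixed so page numbers map to stable offsets;
    the caller trims the final page down to `max_results`.
    """
    if max_results <= 0:
        return []
    pages = math.ceil(max_results / PAGE_CAP)
    urls = []
    for page in range(1, pages + 1):
        params = {"q": query, "page": page, "limit": PAGE_CAP}
        urls.append(f"{SEARCH_URL}?{urllib.parse.urlencode(params)}")
    return urls
```

Pacing, retries, and row validation still have to be layered on top of these requests, which is the part the Actor takes off your hands.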

Is the data licensed for commercial use?

Open Library's bibliographic metadata is released under CC0 (public domain). Always verify the licence terms for your specific use case at openlibrary.org.

💬 Your feedback

Spotted a bug, hit a weird edge case, or need a new field? Open an issue on the Actor's Issues tab on Apify Console. We ship fixes weekly and we read every report.


You might also like

Open Library Book Search

gentle_cloud/open-library-book-search

Search and extract book data from Open Library (openlibrary.org) β€” titles, authors, publishers, ISBNs, ratings, reading stats, cover images, and more. Free API, no key required.

Open Library Book Scraper – Cheap

scrapestorm/open-library-book-scraper---cheap

Easily collect books, authors & reading lists from Open Library. Extract structured book and literary data from OpenLibrary.org, the world's largest open book database maintained by the Internet Archive. Collect book titles, authors, subjects, editions, availability, reading lists, and more.

Open Library Book Scraper

moving_beacon-owner1/my-actor-80

Extract book data from Open Library, the Internet Archive's open book database featuring over 20 million books, more than 10 million authors, and 40 million editions. Gather titles, authors, cover images, ISBNs, publishers, subjects, ratings, reading statistics, and more.

Book Metadata Scraper

datapilot/book-metadata-scraper

Book Metadata Scraper uses the Open Library API to collect detailed book data by query. It extracts title, author, ISBN, publisher, publish year, pages, categories, ratings, description, cover image, and preview link. Outputs structured JSON for catalogs, apps, and research use.

Openlibrary Book Intelligence

benthepythondev/openlibrary-book-intelligence

Search and extract book data from Open Library's database of 20+ million books. Get titles, authors, publishers, publication dates, ISBNs, covers, subjects, and edition info. Search by title, author, ISBN, or subject. Free alternative to Google Books API.

Open Library Books Scraper

gio21/openlibrary-books-scraper

Search and scrape books on Open Library by title, author, subject, or ISBN. Returns title, authors, first publish year, edition count, ISBNs, cover image, language, ebook access status. Pay per book returned.

Open Library Scraper

viralanalyzer/open-library-scraper

Search and extract book data from Open Library: titles, authors, editions, subjects, and availability. Literary research at scale.

Open Library Book Intelligence

benthepythondev/book-intelligence

Extract book metadata from Open Library's catalog of 20+ million books. Search by title, author, subject, or ISBN. Get cover images, ratings, edition counts, and publication data. Perfect for publishers, bookstores, libraries, app developers, and researchers.