Book Metadata Scraper

Pricing

$8.00/month + usage

Book Metadata Scraper uses the Open Library API to collect detailed book data by query. It extracts title, author, ISBN, publisher, publish year, pages, categories, ratings, description, cover image, and preview link. Outputs structured JSON for catalogs, apps, and research use.

Developer

Data Pilot

Maintained by Community

Last modified

3 months ago

Overview

The Book Metadata Scraper is an Apify Actor that extracts rich book metadata from the Open Library database. It accepts book titles or search queries and retrieves detailed information including author, ISBN, publisher, description, ratings, and cover images. Whether you're building a book database, conducting publishing research, or powering a recommendation engine, this actor delivers accurate and structured book metadata efficiently.

With optional proxy support and random delays between requests, it is designed to access Open Library's public API reliably while respecting its rate limits.

Features

Multi-query Search – Search for multiple book titles or keywords in a single run.
Deep Metadata Fetching – Retrieves extended details (description, work data) from the Open Library Works API when search results are incomplete.
Cover Image Links – Returns direct URLs to book cover images via Open Library's cover service.
ISBN Extraction – Prioritizes 13-digit ISBNs for each book result.
Rating & Review Data – Includes average ratings and total ratings count where available.
Proxy Support – Optionally uses Apify residential proxies to avoid IP blocking.
Rate-Limit Friendly – Adds random delays between requests to respect API limits.
Dataset Integration – Automatically pushes all book metadata to your Apify dataset for easy export.

How It Works

Input – Provide a list of book titles or search queries (e.g., "Atomic Habits", "Stephen King").
Search – The actor queries the Open Library search API (/search.json) for each query.
Deep Fetch – For each result, it fetches the Work detail page (/works/{key}.json) to retrieve missing fields like description.
Build Output – Structures all available metadata into a clean record and pushes it to the dataset.
Repeat – Processes all queries with a random delay between requests.
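
The steps above can be sketched roughly as follows. This is an illustrative outline, not the Actor's source code. It uses the standard library's `urllib` in place of `requests` so it runs without extra dependencies, and the function names (`fetch_json`, `extract_description`, `scrape`), the User-Agent string, and the 1 to 3 second delay range are all assumptions. Open Library search results carry a `key` that already includes the `/works/` prefix, so the sketch appends `.json` to it directly.

```python
import json
import random
import time
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, List, Tuple

BASE_URL = "https://openlibrary.org"


def fetch_json(url: str) -> Dict[str, Any]:
    """GET a URL and decode its JSON body (stdlib stand-in for requests)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "book-metadata-example/0.1"}  # hypothetical UA
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_description(work: Dict[str, Any]) -> str:
    """Work descriptions are either a plain string or a {"value": ...} object."""
    description = work.get("description", "")
    if isinstance(description, dict):
        description = description.get("value", "")
    return description or ""


def scrape(
    queries: List[str],
    max_results: int = 10,
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
    sleep: Callable[[float], None] = time.sleep,
    delay_range: Tuple[float, float] = (1.0, 3.0),  # assumed delay bounds
) -> List[Dict[str, Any]]:
    """Search each query, deep-fetch each work, and pause after every request."""

    def polite_fetch(url: str) -> Dict[str, Any]:
        data = fetch(url)
        sleep(random.uniform(*delay_range))
        return data

    results: List[Dict[str, Any]] = []
    for query in queries:
        params = urllib.parse.urlencode({"q": query, "limit": max_results})
        search = polite_fetch(f"{BASE_URL}/search.json?{params}")
        for doc in search.get("docs", [])[:max_results]:
            description = ""
            if doc.get("key"):  # e.g. "/works/OL17930368W"
                work = polite_fetch(f"{BASE_URL}{doc['key']}.json")
                description = extract_description(work)
            results.append({**doc, "description": description})
    return results
```

Calling `scrape(["Atomic Habits"], max_results=5)` would issue live requests to Open Library. The `fetch` and `sleep` parameters exist only so the loop can be exercised offline with stand-ins.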

Input

| Field | Type | Default | Description |
|---|---|---|---|
| `queries` | Array of strings | `["Atomic Habits"]` | List of book titles or search keywords. |
| `max_results` | Integer | `10` | Maximum number of results to return per query. |
| `proxyConfiguration` | Object | `{}` | Apify proxy configuration (e.g., `{ "proxyGroups": ["RESIDENTIAL"] }`). |

Example input:

{
  "queries": ["Atomic Habits", "The Alchemist", "Stephen King"],
  "max_results": 5,
  "proxyConfiguration": {
    "proxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "US"
  }
}
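
As a rough illustration of how the documented defaults might be applied to a partial input, the sketch below merges user input over the default values from the table. The helper name `normalize_input` and the validation rules (dropping blank queries, rejecting a non-positive `max_results`) are assumptions; the Actor's actual input handling is not documented here.

```python
from typing import Any, Dict

# Defaults taken from the input table above.
DEFAULT_INPUT: Dict[str, Any] = {
    "queries": ["Atomic Habits"],
    "max_results": 10,
    "proxyConfiguration": {},
}


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Merge user input over the defaults and apply basic (assumed) checks."""
    provided = {key: value for key, value in raw.items() if value is not None}
    merged = {**DEFAULT_INPUT, **provided}

    # Keep only non-empty string queries, trimmed of surrounding whitespace.
    queries = [q.strip() for q in merged["queries"] if isinstance(q, str) and q.strip()]
    if not queries:
        raise ValueError("queries must contain at least one non-empty string")

    max_results = int(merged["max_results"])
    if max_results < 1:
        raise ValueError("max_results must be a positive integer")

    return {
        "queries": queries,
        "max_results": max_results,
        "proxyConfiguration": dict(merged["proxyConfiguration"]),
    }
```

With this sketch, `normalize_input({"queries": ["The Alchemist"]})` would fill in `max_results` as `10` and an empty proxy configuration.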

Output

Each book is pushed as a separate dataset record with the following fields:

| Field | Type | Description |
|---|---|---|
| `title` | string | Book title. |
| `author` | string | Author name(s), comma-separated. |
| `isbn` | string | ISBN-13 (preferred) or first available ISBN. |
| `publisher` | string | Publisher name. |
| `published_date` | string | First publish year. |
| `language` | string | Language code (uppercase, e.g., `"ENG"`). |
| `pages` | integer | Median page count (if available). |
| `categories` | array | Up to 5 subject/category tags. |
| `description` | string | First 500 characters of the book description. |
| `average_rating` | float | Average reader rating (if available). |
| `ratings_count` | integer | Total number of ratings. |
| `price` | string | Price (empty by default, since Open Library is free). |
| `currency` | string | Currency code (default: `"USD"`). |
| `availability` | string | Availability status (default: `"In Stock"`). |
| `cover_image` | string | Direct URL to the book's cover image (large size). |
| `preview_link` | string | URL to the book's Open Library page. |
| `source` | string | Data source (always `"Open Library"`). |

Example output:

{
  "title": "Atomic Habits",
  "author": "James Clear",
  "isbn": "9780735211292",
  "publisher": "Avery",
  "published_date": "2018",
  "language": "ENG",
  "pages": 320,
  "categories": ["Self-Help", "Habits", "Psychology", "Productivity", "Nonfiction"],
  "description": "No matter your goals, Atomic Habits offers a proven framework for improving every day...",
  "average_rating": 4.4,
  "ratings_count": 28000,
  "price": "",
  "currency": "USD",
  "availability": "In Stock",
  "cover_image": "https://covers.openlibrary.org/b/id/10527843-L.jpg",
  "preview_link": "https://openlibrary.org/works/OL17930368W",
  "source": "Open Library"
}
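
One way a record like this could be assembled from an Open Library search document is sketched below. The mapping is an assumption about how the Actor works, not its actual code. The input keys (`author_name`, `isbn`, `publisher`, `first_publish_year`, `language`, `number_of_pages_median`, `subject`, `ratings_average`, `ratings_count`, `cover_i`, `key`) are Open Library search fields, while the function name `build_record` is hypothetical.

```python
from typing import Any, Dict


def build_record(doc: Dict[str, Any], description: str = "") -> Dict[str, Any]:
    """Map one Open Library search document to the output schema above."""
    isbns = doc.get("isbn") or []
    # Prefer a 13-digit ISBN; otherwise fall back to the first one listed.
    isbn13 = next((i for i in isbns if len(i) == 13 and i.isdigit()), "")
    publishers = doc.get("publisher") or []
    languages = doc.get("language") or []
    year = doc.get("first_publish_year")
    cover_id = doc.get("cover_i")
    key = doc.get("key")  # e.g. "/works/OL17930368W"

    return {
        "title": doc.get("title", ""),
        "author": ", ".join(doc.get("author_name") or []),
        "isbn": isbn13 or (isbns[0] if isbns else ""),
        "publisher": publishers[0] if publishers else "",
        "published_date": str(year) if year else "",
        "language": languages[0].upper() if languages else "",
        "pages": doc.get("number_of_pages_median"),
        "categories": (doc.get("subject") or [])[:5],
        "description": description[:500],
        "average_rating": doc.get("ratings_average"),
        "ratings_count": doc.get("ratings_count"),
        "price": "",
        "currency": "USD",
        "availability": "In Stock",
        "cover_image": (
            f"https://covers.openlibrary.org/b/id/{cover_id}-L.jpg" if cover_id else ""
        ),
        "preview_link": f"https://openlibrary.org{key}" if key else "",
        "source": "Open Library",
    }
```

Fields that Open Library does not return for a given book stay empty or `None` in this sketch, which matches the "if available" wording in the table.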

Use Cases

Book Databases – Build and maintain a structured catalog of books and metadata.
Recommendation Engines – Power book recommendation systems with rich metadata.
Publishing Research – Analyze publishing trends, authors, and categories.
E-commerce – Enrich product listings with book descriptions, covers, and ISBNs.
Academic Research – Collect structured book data for literature or data science projects.
Content Aggregation – Aggregate book information for blogs, apps, or reading platforms.

Quick Start

Open on Apify – Visit the actor page and click Try for free.
Set Input – Enter book titles or search keywords in the queries field.
Adjust Settings – Set max_results and optional proxy configuration.
Run the Actor – Start the run and monitor progress in the logs.
Download Results – Export the dataset as JSON, CSV, or Excel once finished.
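
The same steps can also be run programmatically. The sketch below assumes the `apify-client` Python package is installed and that a valid API token is available. The Actor ID is taken from this page's URL, and the helper names `run_actor` and `records_to_csv` are hypothetical. The CSV helper is only a local convenience, since Apify can export datasets directly.

```python
import csv
import io
from typing import Any, Dict, List

ACTOR_ID = "datapilot/book-metadata-scraper"


def run_actor(token: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported here so the CSV helper below works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def records_to_csv(records: List[Dict[str, Any]]) -> str:
    """Serialize records to CSV text, joining category lists with '; '."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), extrasaction="ignore"
    )
    writer.writeheader()
    for record in records:
        row = dict(record)
        if isinstance(row.get("categories"), list):
            row["categories"] = "; ".join(row["categories"])
        writer.writerow(row)
    return buffer.getvalue()
```

A typical call would look like `records_to_csv(run_actor(token, {"queries": ["Atomic Habits"], "max_results": 5}))`, where `token` is your own Apify API token.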

Technical Stack

Data Source – Open Library API (free, public)
HTTP Client – requests with custom headers and optional proxy support
Proxy – Apify Proxy (residential or datacenter)
Platform – Apify Actor — serverless, scalable, integrated with Dataset and Key-Value Store

Related Tools

| Actor | Description |
|---|---|
| Goodreads Scraper | Extracts book ratings, reviews, and reading lists from Goodreads. |
| Amazon Book Scraper | Scrapes book listings, prices, and reviews from Amazon. |
| Google Books Scraper | Fetches book metadata and previews via the Google Books API. |
| ISBN Lookup Tool | Looks up detailed book info by ISBN from multiple data sources. |
| Book Price Comparator | Compares book prices across major online retailers. |

Changelog

v1.0.0 – Initial Release

Multi-query search via Open Library API
Deep metadata fetching from Works endpoint
ISBN-13 prioritization
Cover image and preview link generation
Proxy configuration support
Dataset integration with random request delays

Pricing

Free for basic usage on Apify (up to certain compute limits).
Paid plans available for higher volume, priority support, and longer runs.
Proxy credits consumed if residential proxies are enabled.

Support & Feedback

Issues & Ideas – Open a ticket on the Apify Actor issue tracker.
Documentation – Visit Apify Docs for platform guides.
API Notes – This actor uses the Open Library public API. Please use responsibly and avoid excessive request rates.

Disclaimer: This actor accesses publicly available data from Open Library. Please ensure your usage complies with Open Library's terms of service. This actor is intended for research and informational purposes only.

URL: https://apify.com/datapilot/book-metadata-scraper