Pricing
from $10.00 / 1,000 results
Goodreads Scraper
Scrape Goodreads book data. Search by title, author, or ISBN. Returns ratings, reviews, genres, page counts, and publication info.
An Apify Actor that scrapes book data from Goodreads shelf/genre pages. Built with Crawlee CheerioCrawler for fast, efficient HTML parsing.
What it does
This scraper extracts book data from Goodreads shelf pages (e.g., goodreads.com/shelf/show/science-fiction). Each shelf page lists 50 popular books in that genre, ranked by how many users shelved them.
Two modes of operation
- Shelf mode (default): Scrapes book listings from shelf pages. This mode is fast because it extracts title, author, rating, rating count, published year, shelf count, and cover image directly from the listing.
- Detail mode (scrapeDetails: true): Also visits each book's individual page to extract rich structured data from Goodreads' JSON-LD schema, including full description, genres, page count, ISBN, book format, language, awards, and review count.
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| searchQueries | string[] | ["science-fiction"] | Shelf/genre names to scrape. Maps to goodreads.com/shelf/show/{name}. |
| maxBooks | integer | 100 | Maximum books per shelf. Set to 0 for unlimited. |
| scrapeDetails | boolean | false | Visit each book's detail page for full data (slower). |
| proxyConfiguration | object | { useApifyProxy: false } | Proxy settings for large-scale runs. |
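As a rough illustration of how searchQueries and maxBooks translate into requests, the sketch below builds shelf page URLs and works out how many listing pages a given maxBooks implies. It assumes 50 books per shelf page and ?page=N pagination, as described in this README. The function names are hypothetical and are not taken from the Actor's source.

```python
import math
from typing import Optional
from urllib.parse import quote

BOOKS_PER_PAGE = 50  # per this README: each shelf page lists 50 books
BASE_URL = "https://www.goodreads.com/shelf/show/"


def pages_needed(max_books: int) -> Optional[int]:
    """Number of shelf pages to fetch; None means 'no limit' (maxBooks = 0)."""
    if max_books < 0:
        raise ValueError("max_books must be >= 0")
    if max_books == 0:
        return None
    return math.ceil(max_books / BOOKS_PER_PAGE)


def shelf_page_url(shelf: str, page: int = 1) -> str:
    """URL for one page of a shelf (hypothetical helper)."""
    if page < 1:
        raise ValueError("page must be >= 1")
    url = BASE_URL + quote(shelf.strip(), safe="")
    # Assumption: the first page needs no query string.
    return url if page == 1 else f"{url}?page={page}"


if __name__ == "__main__":
    for page in range(1, (pages_needed(100) or 1) + 1):
        print(shelf_page_url("science-fiction", page))
```

With the default maxBooks of 100, this yields two listing pages per shelf.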
Popular shelf names
science-fiction, fantasy, mystery, romance, non-fiction, thriller, horror, historical-fiction, young-adult, classics, biography, self-help, poetry, graphic-novels, philosophy, true-crime, dystopian, adventure, humor, manga
Output
Shelf mode fields
| Field | Type | Description |
|---|---|---|
| title | string | Book title (may include series info) |
| author | string | Author name |
| authorUrl | string | Link to author's Goodreads page |
| rating | number | Average rating (1-5 scale) |
| ratingCount | number | Total number of ratings |
| publishedYear | number | Year of first publication |
| shelfCount | number | Times shelved under this genre |
| coverImage | string | URL of book cover image |
| bookId | string | Goodreads book ID |
| url | string | Full Goodreads URL for the book |
| searchQuery | string | The shelf/genre name used |
| scrapedAt | string | ISO 8601 timestamp |
Additional detail mode fields
| Field | Type | Description |
|---|---|---|
| description | string | Full book description |
| genres | string[] | List of genres/tags |
| isbn | string | ISBN number |
| bookFormat | string | Format (Hardcover, Paperback, etc.) |
| numberOfPages | number | Page count |
| language | string | Book language |
| awards | string | Awards received |
| reviewCount | number | Total number of text reviews |
| authors | object[] | Array of author objects with name and URL |
| publicationInfo | string | Full publication details |
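To make the detail-mode extraction more concrete, here is a minimal sketch of pulling a JSON-LD Book object out of a page and mapping it to some of the fields above. The Actor itself uses Crawlee's CheerioCrawler. This Python version uses only the standard library and is not the Actor's code. The property names (isbn, bookFormat, numberOfPages, inLanguage, awards, aggregateRating.reviewCount) come from the schema.org Book vocabulary, and the mapping to output fields is an assumption.

```python
import json
from html.parser import HTMLParser
from typing import Optional


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self.blocks.append("".join(self._buffer))
            self._inside = False


def find_book_jsonld(html: str) -> Optional[dict]:
    """Return the first JSON-LD object whose @type is "Book", if any."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the page
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "Book":
                return item
    return None


def to_detail_fields(book: dict) -> dict:
    """Assumed mapping from schema.org Book properties to output fields."""
    rating = book.get("aggregateRating") or {}
    return {
        "isbn": book.get("isbn"),
        "bookFormat": book.get("bookFormat"),
        "numberOfPages": book.get("numberOfPages"),
        "language": book.get("inLanguage"),
        "awards": book.get("awards"),
        "reviewCount": rating.get("reviewCount"),
    }
```

Properties missing from the page come back as None, which keeps a partially populated detail page from aborting the record.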
Example usage
Basic: Top science fiction books
{"searchQueries":["science-fiction"],"maxBooks":50}
Multiple genres with details
{"searchQueries":["fantasy","mystery","romance"],"maxBooks":100,"scrapeDetails":true}
Large-scale with proxy
{"searchQueries":["science-fiction","fantasy","thriller","horror"],"maxBooks":500,"scrapeDetails":true,"proxyConfiguration":{"useApifyProxy":true}}
Example output
{"title":"Dune (Dune, #1)","author":"Frank Herbert","authorUrl":"https://www.goodreads.com/author/show/58.Frank_Herbert","rating":4.29,"ratingCount":1645579,"publishedYear":1965,"shelfCount":21643,"coverImage":"https://i.gr-assets.com/images/S/compressed.photo.goodreads.com/books/1555447414l/44767458._SX318_.jpg","bookId":"44767458","url":"https://www.goodreads.com/book/show/44767458-dune","searchQuery":"science-fiction","scrapedAt":"2026-03-17T12:00:00.000Z"}
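One way to post-process records of this shape is sketched below. Because the same book can appear on several shelves, the example first deduplicates by bookId and then ranks by rating. The function name and the min_ratings threshold are illustrative assumptions, not part of the Actor.

```python
from typing import Iterable, List


def top_rated(records: Iterable[dict], min_ratings: int = 1000, limit: int = 10) -> List[dict]:
    """Deduplicate by bookId, drop thinly rated books, and sort by rating."""
    unique = {}
    for record in records:
        book_id = record.get("bookId")
        if book_id is None or book_id in unique:
            continue  # keep the first occurrence of each book
        unique[book_id] = record

    eligible = [
        r for r in unique.values()
        if r.get("rating") is not None and (r.get("ratingCount") or 0) >= min_ratings
    ]
    # Highest rating first; ties broken by the larger number of ratings.
    eligible.sort(key=lambda r: (-r["rating"], -(r.get("ratingCount") or 0)))
    return eligible[:limit]
```

The ratingCount filter is there because an average built from a handful of ratings is not comparable to one built from many thousands.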
Technical notes
- Uses CheerioCrawler (HTTP-only, no browser) for maximum speed
- Shelf pages (/shelf/show/) and book pages (/book/show/) are allowed by Goodreads robots.txt
- Rate-limited to avoid overloading servers (30-40 requests/minute)
- Shelf pages return 50 books per page, paginated with ?page=N
- Detail pages use Goodreads' JSON-LD @type: Book structured data for reliable extraction
- No Cloudflare or CAPTCHA protection on these endpoints
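The rate limit and page size above give a back-of-the-envelope way to estimate run time. The sketch below is only an approximation. It assumes every shelf actually contains max_books books, that detail mode adds one request per book, and that there are no retries. The helper names are hypothetical.

```python
import math
from typing import Tuple

BOOKS_PER_PAGE = 50          # shelf pages return 50 books each
MIN_RPM, MAX_RPM = 30, 40    # stated rate limit, requests per minute


def estimate_requests(shelves: int, max_books: int, scrape_details: bool) -> int:
    """Listing pages plus (optionally) one detail page per book, per shelf."""
    if max_books <= 0:
        raise ValueError("an estimate needs a positive max_books")
    listing_pages = math.ceil(max_books / BOOKS_PER_PAGE)
    detail_pages = max_books if scrape_details else 0
    return shelves * (listing_pages + detail_pages)


def estimate_minutes(requests: int) -> Tuple[float, float]:
    """(fastest, slowest) run time in minutes at the stated rate limit."""
    return requests / MAX_RPM, requests / MIN_RPM


if __name__ == "__main__":
    # The "Large-scale with proxy" example: 4 shelves, 500 books, details on.
    total = estimate_requests(4, 500, True)
    print(total)                    # 2040
    print(estimate_minutes(total))  # (51.0, 68.0)
```

Under these assumptions, the large-scale example input would take roughly 51 to 68 minutes, and nearly all of that time goes to detail pages.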
Running locally
apify run --purge
Make sure to set your input in storage/key_value_stores/default/INPUT.json.
Deploy to Apify
apify login
apify push
Related Scrapers
More marketplace scrapers and data tools by lulzasaur:
- AbeBooks Scraper: Rare and used books
- Bonanza Scraper: Online marketplace listings
- Contractor License Verifier: Multi-state license verification
- Craigslist Scraper: Classifieds and for-sale posts
- Grailed Scraper: Luxury fashion resale
- Houzz Scraper: Home improvement professionals
- IMDb Scraper: Movie and TV show data
- Nurse License Verifier: State nursing board verification
- OfferUp Scraper: Local marketplace listings
- Poshmark Scraper: Fashion resale marketplace
- PSA Population Report: Card grading data
- Redfin Scraper: Real estate listings and prices
- Reverb Scraper: Music gear marketplace
- StubHub Scraper: Event ticket prices
- Swappa Scraper: Used electronics marketplace
- TCGPlayer Scraper: Trading card prices
- ThriftBooks Scraper: Used book prices
- Thumbtack Scraper: Local service professionals
