URL: https://apify.com/shahidirfan/themoviedb-scraper

TheMoviedb Scraper

TheMoviedb Scraper extracts rich movie, TV, and celebrity data from TheMovieDB. It offers two modes: use the official API for fast, stable results, or scrape directly without a key.

Pricing: Pay per usage

Rating: 5.0 (3)

Developer: Shahid Irfan (Maintained by Community)

Actor stats

  • Bookmarked: 2
  • Total users: 25
  • Monthly active users: 3
  • Last modified: 15 days ago

TMDb Comprehensive Scraper

Collect rich movie, TV show, and people data from The Movie Database in a single run. Build clean datasets for analytics, content apps, and market tracking with flexible filters, pagination, and structured output. The actor focuses on completeness, stable extraction, and fast large-batch collection.

Features

  • Multi-content extraction: Collect movie, tv, person, or both from one actor.
  • Search and discovery modes: Run targeted keyword searches or broad discovery with filters.
  • Rich record coverage: Include credits, reviews, keywords, images, videos, watch providers, recommendations, and external IDs.
  • Cleaner dataset output: Produces consolidated records with null/empty values removed.
  • Duplicate protection: Deduplicates items by TMDb ID across pages and queries.
  • Fast batch collection: Uses direct structured responses, lightweight retries, and fallback routing to keep runs quick.
  • Configurable scale controls: Tune limits, pages, and concurrency for your workload.
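
The duplicate protection described above can be pictured with a minimal Python sketch. This is not the actor's internal code. The function names are hypothetical, and the choice to key content records on both content_type and tmdb_id is an assumption, made because movie and TV IDs may collide.

```python
from typing import Any, Dict, Iterable, List, Tuple


def record_key(record: Dict[str, Any]) -> Tuple[str, Any]:
    """Build a deduplication key from the documented ID fields."""
    if record.get("data_type") == "person":
        return ("person", record.get("person_id"))
    # Assumption: a movie and a TV show may share a numeric ID,
    # so the content type is part of the key.
    return (record.get("content_type", "content"), record.get("tmdb_id"))


def dedupe(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key in seen:
            continue
        seen.add(key)
        unique.append(record)
    return unique
```

A pass like this is only needed when you merge datasets from several runs, since the actor already deduplicates within a run.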

Use Cases

Content Intelligence

Track trending titles, genres, ratings, and audience engagement signals for dashboards and reports.

Catalog Enrichment

Enhance internal catalogs with overviews, media assets, credits, and provider metadata.

Audience and Creator Research

Collect people profiles and filmography context for talent analysis and editorial planning.

Competitive Monitoring

Monitor what is rising across movies and TV by popularity, vote signals, and release windows.

Automation Pipelines

Feed consistent JSON output into ETL workflows, BI tools, and downstream notifications.


Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| apiKey | String | No | - | TMDb API key if you want to run with your own credentials. |
| useApiFirst | Boolean | No | true | Prioritize key-based mode when apiKey is provided. |
| contentType | String | No | "tv" | One of movie, tv, person, both. |
| searchQueries | String | No | "" | Comma-separated search terms for movie/TV/person search. |
| genreIds | String | No | "" | Comma-separated genre IDs for discovery filtering. |
| peopleQuery | String | No | "" | Comma-separated search terms for people extraction. |
| resultsWanted | Integer | No | 5 | Max movie/TV records per query. |
| peopleResultsWanted | Integer | No | 3 | Max people records per query. |
| maxPages | Integer | No | 5 | Max paginated result pages to scan per query. |
| maxConcurrency | Integer | No | 8 | Max parallel detail requests. |
| sortBy | String | No | "popularity.desc" | Sorting used in discovery mode. |
| yearFrom | Integer | No | - | Earliest release or first-air year for discovery filtering. |
| yearTo | Integer | No | - | Latest release or first-air year for discovery filtering. |
| minVoteCount | Integer | No | 0 | Minimum vote count for discovery filtering. |
| minVoteAverage | Number | No | 0 | Minimum vote average for discovery filtering. |
| collectPeople | Boolean | No | false | Include cast and crew blocks in content records. |
| collectReviews | Boolean | No | false | Include review blocks in content records. |
| collectImages | Boolean | No | false | Include images metadata in content records. |
| collectKeywords | Boolean | No | false | Include keywords/tags in content records. |
| collectVideos | Boolean | No | false | Include videos/trailers metadata in content records. |
| collectWatchProviders | Boolean | No | false | Include watch provider availability data. |
| maxReviewsPerContent | Integer | No | 25 | Cap reviews saved per content item. |
| maxImagesPerContent | Integer | No | 20 | Cap saved image entries per content item. |
| collectCollections | Boolean | No | false | Include collection details for movie records. |
| proxyConfiguration | Object | No | Apify Proxy | Optional proxy configuration. |
| metamorph | String | No | - | Optional actor ID to metamorph into after completion. |
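
Because several parameters take comma-separated strings rather than arrays, a small helper can assemble the input from Python lists. The sketch below is illustrative only. build_run_input and ALLOWED_CONTENT_TYPES are hypothetical names, and the validation covers just the contentType values listed in the table.

```python
from typing import Any, Dict, Iterable

# Values documented for contentType in the parameter table.
ALLOWED_CONTENT_TYPES = {"movie", "tv", "person", "both"}


def build_run_input(
    content_type: str = "tv",
    search_queries: Iterable[str] = (),
    genre_ids: Iterable[int] = (),
    results_wanted: int = 5,
    max_pages: int = 5,
    **extra: Any,
) -> Dict[str, Any]:
    """Assemble an actor input dict, joining lists into comma-separated strings."""
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"contentType must be one of {sorted(ALLOWED_CONTENT_TYPES)}")
    run_input: Dict[str, Any] = {
        "contentType": content_type,
        "resultsWanted": int(results_wanted),
        "maxPages": int(max_pages),
    }
    queries = [q.strip() for q in search_queries if q.strip()]
    if queries:
        run_input["searchQueries"] = ",".join(queries)
    ids = [str(g) for g in genre_ids]
    if ids:
        run_input["genreIds"] = ",".join(ids)
    # Pass-through for flags such as collectReviews or minVoteCount.
    run_input.update(extra)
    return run_input
```

For example, build_run_input("movie", ["Inception", "Dune"], collectReviews=True) yields a dict with searchQueries set to "Inception,Dune".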

Output Data

Each dataset item is a consolidated record.

| Field | Type | Description |
| --- | --- | --- |
| data_type | String | Record type, e.g. content or person. |
| source | String | Source mode used for extraction. |
| auth_mode | String | Authentication mode used for the run. |
| content_type | String | movie or tv for content records. |
| query_mode | String | search, discover, or search_person. |
| search_query | String | Query string that produced the record (when applicable). |
| tmdb_id | Integer | TMDb ID for movie/TV records. |
| person_id | Integer | TMDb ID for person records. |
| title / name | String | Main title/name of the item. |
| overview / biography | String | Description text. |
| vote_average | Number | Average vote score (when available). |
| vote_count | Integer | Total vote count (when available). |
| genres | Array | Genre objects or IDs depending on response block. |
| credits | Object | Cast and crew details when enabled. |
| reviews | Object | Reviews block when enabled. |
| keywords | Object | Keywords block when enabled. |
| images | Object | Posters/backdrops/logos/profile image metadata when enabled. |
| videos | Object | Video metadata when enabled. |
| watch/providers | Object | Provider availability by region when enabled. |
| recommendations | Object | Recommended related titles. |
| similar | Object | Similar related titles. |
| external_ids | Object | External identity mappings (when available). |
| collection_details | Object | Collection metadata for movies when enabled. |
| missing_data_fields | Array | Fields the source did not provide for that record, omitted when everything requested was available. |
| fetchedAt | String | ISO timestamp of extraction. |
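
Since most fields are optional and some come in pairs (title / name, overview / biography), downstream code should read them defensively. The following sketch shows one way to reduce a record to a flat summary. summarize is a hypothetical helper, and the fallback order between the paired fields is an assumption.

```python
from typing import Any, Dict


def summarize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a consolidated record to a few flat fields, tolerating absent blocks."""
    is_person = record.get("data_type") == "person"
    credits = record.get("credits") or {}
    return {
        "kind": "person" if is_person else record.get("content_type"),
        "id": record.get("person_id") if is_person else record.get("tmdb_id"),
        # Assumption: exactly one of each pair is present on a given record.
        "label": record.get("title") or record.get("name"),
        "text": record.get("overview") or record.get("biography"),
        "vote_average": record.get("vote_average"),
        "cast_count": len(credits.get("cast") or []),
        "missing": record.get("missing_data_fields") or [],
    }
```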

Usage Examples

Basic TV Discovery

{
  "contentType": "tv",
  "resultsWanted": 20,
  "maxPages": 2
}

Movie Search with Rich Metadata

{
  "contentType": "movie",
  "searchQueries": "Inception,Interstellar,Dune",
  "resultsWanted": 10,
  "collectPeople": true,
  "collectReviews": true,
  "collectImages": true,
  "collectKeywords": true,
  "collectVideos": true,
  "collectWatchProviders": true
}

People Research Run

{
  "contentType": "person",
  "searchQueries": "Leonardo DiCaprio,Emma Stone",
  "peopleResultsWanted": 5,
  "maxPages": 2
}

Filtered Discovery by Genre and Votes

{
  "contentType": "both",
  "genreIds": "28,878",
  "sortBy": "popularity.desc",
  "minVoteCount": 1000,
  "minVoteAverage": 7.0,
  "resultsWanted": 25,
  "maxPages": 3
}
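
Any of the inputs above can also be submitted programmatically. The sketch below assumes the third-party apify-client package for Python and its ApifyClient, actor(...).call(run_input=...), and dataset(...).iterate_items() calls. The run object is treated as a dictionary, which matches the 1.x releases but may differ in other versions. The actor ID is taken from the store URL, and run_actor is a hypothetical wrapper.

```python
from typing import Any, Dict, List

# Actor ID as it appears in the store URL for this page.
ACTOR_ID = "shahidirfan/themoviedb-scraper"

# Same input as the "Filtered Discovery by Genre and Votes" example.
FILTERED_DISCOVERY: Dict[str, Any] = {
    "contentType": "both",
    "genreIds": "28,878",
    "sortBy": "popularity.desc",
    "minVoteCount": 1000,
    "minVoteAverage": 7.0,
    "resultsWanted": 25,
    "maxPages": 3,
}


def run_actor(run_input: Dict[str, Any], token: str) -> List[Dict[str, Any]]:
    """Start the actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so the module loads without the package installed
    # (pip install apify-client).
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    # Assumption: the run is a dict exposing the default dataset ID.
    dataset_id = run["defaultDatasetId"]
    return list(client.dataset(dataset_id).iterate_items())
```

A typical call would be run_actor(FILTERED_DISCOVERY, token) with your Apify API token, for instance read from an environment variable of your choosing.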

Sample Output

{
  "data_type": "content",
  "source": "tmdb_public_api",
  "auth_mode": "public-no-key",
  "content_type": "movie",
  "query_mode": "search",
  "search_query": "Inception",
  "tmdb_id": 27205,
  "title": "Inception",
  "overview": "Cobb, a skilled thief who commits corporate espionage by infiltrating the subconscious of his targets...",
  "release_date": "2010-07-15",
  "vote_average": 8.369,
  "vote_count": 38000,
  "genres": [
    {"id": 28, "name": "Action"},
    {"id": 878, "name": "Science Fiction"}
  ],
  "credits": {
    "cast": [
      {"id": 6193, "name": "Leonardo DiCaprio", "character": "Cobb"}
    ]
  },
  "videos": {
    "results": [
      {"type": "Trailer", "site": "YouTube", "key": "YoHD9XEInc0"}
    ]
  },
  "recommendations": {
    "results": [
      {"id": 157336, "title": "Interstellar"}
    ]
  },
  "fetchedAt": "2026-03-14T12:00:00.000Z"
}
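
The nested blocks in the sample can be unpacked with a few lines of Python. As an illustration, the sketch below turns the videos block into watchable links. youtube_trailer_urls is a hypothetical helper, and it assumes that key holds a YouTube video ID whenever site is "YouTube".

```python
from typing import Any, Dict, List


def youtube_trailer_urls(record: Dict[str, Any]) -> List[str]:
    """Return YouTube watch URLs for trailer entries in a record's videos block."""
    videos = (record.get("videos") or {}).get("results") or []
    return [
        f"https://www.youtube.com/watch?v={video['key']}"
        for video in videos
        if video.get("site") == "YouTube"
        and video.get("type") == "Trailer"
        and video.get("key")
    ]
```

Records collected without collectVideos have no videos block, in which case the function returns an empty list.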

Tips For Best Results

Start Small, Then Scale

  • Use resultsWanted between 5 and 20 for quick validation.
  • Increase pages and limits after confirming output quality.

Use Search For Precision

  • Prefer searchQueries when you need exact titles or names.
  • Use discovery mode when you need broad market snapshots.

Keep High-Signal Filters

  • Use minVoteCount with minVoteAverage to reduce low-signal titles.
  • Pair filters with sortBy for consistent ranking.

Tune Runtime Stability

  • Start with the default concurrency and increase only when your workload stays stable.
  • Keep optional enrichments focused on the fields you actually need.

Enable Rich Blocks Only When Needed

  • Turn on reviews, videos, and provider blocks only for use cases that need them.
  • This keeps runs faster and payloads lighter.

Integrations

Connect dataset output to:

  • Google Sheets: Build analyst-friendly reports.
  • Airtable: Create searchable media intelligence tables.
  • Make: Trigger no-code automations.
  • Zapier: Pipe updates to workflows.
  • Webhooks: Forward results to your backend.
  • Slack: Notify teams on fresh runs.

Export Formats

  • JSON: Best for apps and pipelines.
  • CSV: Spreadsheet and BI workflows.
  • Excel: Business reporting.
  • XML: Legacy system integrations.
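
If you download JSON and want to produce a CSV locally, nested blocks such as genres or credits need to be flattened somehow. The sketch below takes one simple approach and serializes nested values as JSON strings inside their cells. records_to_csv is a hypothetical helper, and this is not necessarily how the platform's own CSV export lays out nested fields.

```python
import csv
import io
import json
from typing import Any, Dict, List


def records_to_csv(records: List[Dict[str, Any]]) -> str:
    """Serialize records to CSV text, JSON-encoding nested dicts and lists."""
    # Collect column names in first-seen order across all records.
    fieldnames: List[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(
            {
                key: json.dumps(value) if isinstance(value, (dict, list)) else value
                for key, value in record.items()
            }
        )
    return buffer.getvalue()
```

Columns missing from a record are written as empty cells, which keeps rows aligned when optional blocks are present on only some items.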

Frequently Asked Questions

Why do I no longer get multiple rows for one title?

The actor now emits consolidated records, so each title/person is represented once with rich nested blocks.

Can I run without a TMDb API key?

Yes. The actor can still collect rich records without user-supplied credentials.

How do I reduce null fields in output?

Null, undefined, and empty string values are automatically removed before saving.
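
A cleanup of this kind might look like the sketch below. It is only an illustration and not the actor's code. strip_empty is a hypothetical name, and both the recursion into nested blocks and the decision to keep 0, false, and empty containers are assumptions.

```python
from typing import Any


def strip_empty(value: Any) -> Any:
    """Recursively drop None and empty-string values from dicts and lists."""
    if isinstance(value, dict):
        cleaned = {key: strip_empty(item) for key, item in value.items()}
        return {k: v for k, v in cleaned.items() if v is not None and v != ""}
    if isinstance(value, list):
        return [strip_empty(item) for item in value if item is not None and item != ""]
    # Scalars such as 0 and False carry meaning and are kept as-is.
    return value
```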

How do I avoid duplicate records across pages?

Deduplication by TMDb ID is built in for both content and people records, including retry scenarios.

Can I collect movies and TV in the same run?

Yes, set contentType to both.

Can I chain this actor to another actor?

Yes, provide metamorph with the target actor ID.


Support

For issues or feature requests, use the actor issue/support channels in Apify Console.

Legal Notice

This actor is designed for legitimate data collection and analysis workflows. You are responsible for complying with all applicable terms and laws for your use case.
