Openverse Open-License Media Scraper

Pricing

from $13.00 / 1,000 result items

Search 800M+ openly licensed images, audio clips and graphics across Flickr, Wikimedia, Europeana, Smithsonian, NASA and 50+ CC and public-domain providers. Returns title, creator, license, attribution, source URL, file size, dimensions, tags and direct media URL. Filter by license or source.


Developer

ParseForge

Maintained by Community

Last modified: 17 days ago

🎨 Openverse Media Scraper

🚀 Search 800M+ openly licensed images, audio, and graphics across 50+ providers.

🕒 Last updated: 2026-05-06 · 📊 23 fields per record · 800M+ media records · CC and public-domain providers (Flickr, Wikimedia, Smithsonian, NASA, Europeana)

The Openverse Media Scraper searches WordPress.org's Openverse index of openly licensed media and returns structured records for images, audio clips, illustrations, and graphics. Every result is licensed under Creative Commons or in the public domain, with full attribution metadata.

The catalog aggregates 800M+ items across 50+ providers (Flickr, Wikimedia Commons, Europeana, Smithsonian, NASA, Biodiversity Heritage Library, Rawpixel). Filters run server-side, so a single run can return only CC0 sunsets, only Smithsonian sketches, or only NASA imagery.

🎯 Target Audience	💡 Primary Use Cases
Content creators, designers, educators, marketing teams, journalists, app developers, AI training pipelines	Content libraries, blog illustrations, social media assets, AI training datasets, educational materials

📋 What the Openverse Media Scraper does

Five filtering workflows in a single run:

🔍 Keyword search. Match titles, descriptions, tags, and creator names across the catalog.
🏷️ License filter. Restrict by CC license (CC0, CC-BY, CC-BY-SA) or public domain.
📁 Source filter. Restrict to one provider.
📐 Aspect ratio. Tall, wide, or square (images only).
🎵 Media type toggle. Switch between images and audio.

💡 Why it matters: clean, server-side filtering removes the parser-and-pagination work from your team and keeps your dataset fresh on every run.

🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.

⚙️ Input

Input	Type	Default	Behavior
maxItems	integer	10	Records to return. Free plan caps at 10, paid plan up to 1,000,000.
query	string	"sunset"	Free-text keyword search.
mediaType	string	"images"	`images` or `audio`.
license	string	""	License filter (cc0, by, by-sa, by-nc). Empty = any.
source	string	""	Provider filter. Empty = all.
aspectRatio	string	""	tall, wide, square (images only).

Example: 100 CC0 sunset images.

{
    "maxItems": 100,
    "query": "sunset",
    "mediaType": "images",
    "license": "cc0"
}

Example: 500 NASA-sourced images.

{
    "maxItems": 500,
    "mediaType": "images",
    "source": "nasa"
}
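
The input rules in the table above can also be checked in code before a run is started. The following is a minimal Python sketch, assuming only the field names and allowed values listed in the table; `build_input` is a hypothetical helper and is not part of the Actor or of the Apify client.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the input table; the Actor itself may accept more.
MEDIA_TYPES = {"images", "audio"}
ASPECT_RATIOS = {"", "tall", "wide", "square"}


def build_input(query: Optional[str] = None, media_type: str = "images",
                license: str = "", source: str = "",
                aspect_ratio: str = "", max_items: int = 10) -> Dict[str, Any]:
    """Build an Actor input dict, rejecting combinations the table rules out."""
    if media_type not in MEDIA_TYPES:
        raise ValueError("mediaType must be 'images' or 'audio'")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError("aspectRatio must be tall, wide, square, or empty")
    if aspect_ratio and media_type != "images":
        raise ValueError("aspectRatio applies to images only")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")

    run_input: Dict[str, Any] = {"maxItems": max_items, "mediaType": media_type}
    if query:
        run_input["query"] = query
    # Empty filters mean "any", so they are left out of the payload.
    for key, value in (("license", license), ("source", source),
                       ("aspectRatio", aspect_ratio)):
        if value:
            run_input[key] = value
    return run_input
```

Calling `build_input(query="sunset", license="cc0", max_items=100)` yields the same keys and values as the first JSON example, while `build_input(media_type="audio", aspect_ratio="wide")` raises a `ValueError`.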

📊 Output

Each record contains 23 fields; the schema below lists 14 of them. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

Field	Type	Example
🖼️ `thumbnailUrl`	string	`"https://api.openverse.org/v1/images/.../thumb/"`
🆔 `id`	string	`"1e97a259-..."`
📛 `title`	string	null
👤 `creator`	string	null
🌐 `url`	string	`"https://live.staticflickr.com/.../b.jpg"`
🌐 `sourceUrl`	string	`"https://www.flickr.com/photos/.../4994679"`
⚖️ `license`	string	`"cc-by-nc-sa"`
⚖️ `licenseVersion`	string	null
📁 `source`	string	`"flickr"`
📐 `width`	number	null
📐 `height`	number	null
🎵 `duration`	number	null
🏷️ `tags`	array	`["sunset","nature"]`
📋 `attribution`	string	`"Sunset by X (CC BY-NC-SA 2.0)"`
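
Several fields in the schema can be null, so downstream code may want a fallback when `attribution` is missing. The sketch below rebuilds a credit line from `title`, `creator`, `license`, and `licenseVersion`. It assumes the license slug style shown in the example (`cc-by-nc-sa`) and a version string such as `"2.0"`; `format_attribution` and its placeholder wording are hypothetical, not the Actor's own formatting logic.

```python
from typing import Any, Dict


def format_attribution(record: Dict[str, Any]) -> str:
    """Return the record's attribution, or build one from the other fields."""
    if record.get("attribution"):
        return record["attribution"]

    title = record.get("title") or "Untitled"
    creator = record.get("creator") or "unknown creator"

    # "cc-by-nc-sa" -> "CC BY-NC-SA"; slugs without the "cc-" prefix are only upper-cased.
    slug = (record.get("license") or "").upper()
    label = slug.replace("CC-", "CC ", 1) if slug.startswith("CC-") else slug
    version = record.get("licenseVersion")
    if label and version:
        label = f"{label} {version}"

    return f"{title} by {creator} ({label})" if label else f"{title} by {creator}"
```

A record with a null `attribution` but a known title, creator, and license therefore still produces a usable credit line in the same shape as the schema example.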

📦 Sample records

✨ Why choose this Actor

	Capability
⚖️	Verified open licenses. Every record carries explicit license + attribution; no copyright guessing.
🌐	50+ providers in one index. Flickr, Wikimedia, Europeana, Smithsonian, NASA in a single search.
🎵	Audio + images. Switch media type with one input flag.
⚡	Fast. 100 records in under 30 seconds.
🔄	Always fresh. Each run hits the live Openverse index.

📈 How it compares to alternatives

Approach	Cost	Coverage	Refresh	Filters	Setup
⭐ This Actor	$5 free credit	800M+ items	Live per run	license, source, type, aspect	⚡ 2 min
Unsplash/Pexels APIs	Free tier	Smaller curated	Live	Limited	⏳ Hours
Manual provider scraping	Free	Per-provider	Live	DIY	🐢 Days
Stock photo libraries	$30+/month	Curated	Live	Yes	🐢 Account setup

Pick this Actor when you want broad coverage, server-side filtering, and no pipeline maintenance.

🚀 How to use

📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
🌐 Open the Actor. Go to the Openverse Media Scraper page on the Apify Store.
🎯 Set input. Pick your filters and maxItems.
🚀 Run it. Click Start and let the Actor collect your data.
📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.

💼 Business use cases

📰 Content & Editorial

Blog post imagery with proper attribution
Newsletter and social media graphics
Article hero images by topic
Author headshots and brand visuals

🎓 Education & Research

Lecture slides with verified attribution
Open educational resources (OER)
Research paper figures
Public-domain audio for narration

🤖 AI & ML

Training image classifiers with safe licenses
Captioning model datasets
Image embedding search corpora
Audio dataset generation

🎨 Design & Marketing

Mood boards and creative briefs
Marketing campaign assets
Brand collateral with clean licensing
Product placeholder imagery

🔌 Automating Openverse Media Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

🟢 Node.js. Install the apify-client NPM package.
🐍 Python. Use the apify-client PyPI package.
📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly, daily, or weekly refreshes keep downstream databases in sync automatically.
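
A programmatic run with the Python client could look roughly like the sketch below. It assumes the Actor ID `parseforge/openverse-media-scraper` (taken from this page's Store URL), an `APIFY_TOKEN` environment variable (a naming assumption), and a client version whose `call()` returns the run details as a dict. `fetch_records` and `main` are illustrative helpers, not part of the Actor.

```python
import os
from typing import Any, Dict, List

# Actor ID inferred from the Store URL of this page.
ACTOR_ID = "parseforge/openverse-media-scraper"


def fetch_records(client: Any, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the Actor, wait for it to finish, and return the dataset items."""
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    # Each run stores its results in a default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported here so the helper above can be used and tested without the package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run_input = {"query": "sunset", "mediaType": "images",
                 "license": "cc0", "maxItems": 10}
    for item in fetch_records(client, run_input):
        print(item.get("title"), item.get("license"), item.get("url"))
```

Calling `main()` with the token set runs the first input example at a smaller size. Passing the client into `fetch_records` keeps the function easy to exercise with a stub during testing.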

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

Reproducible research figures
Open-license media audits
Cultural heritage dataset construction
Course material with attribution

🎨 Personal and creative

Personal blogs and portfolios
Indie game and app assets
DIY documentation
Newsletter and social-media content

🤝 Non-profit and civic

Public service campaign visuals
Civic literacy materials
OSM and open-data illustrations
Journalism with documented attribution

🧪 Experimentation

Train captioning models on safe data
Prototype attribution-aware UIs
Build licensed-only stock libraries
Test moderation pipelines


❓ Frequently Asked Questions

🧩 How does it work?

Provide a query, license, source, or aspect-ratio filter. The Actor queries the Openverse index and emits one record per media item.

⚖️ Is everything free to use commercially?

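Not necessarily. CC0 and public-domain items can be used commercially without attribution, and CC-BY and CC-BY-SA permit commercial use with attribution. Licenses with an NC clause (for example by-nc) do not permit commercial use. Always verify the specific license in each record's license field.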
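
Checking the license per record can be automated after download. The sketch below is a rough filter, assuming license slugs separate their clauses with hyphens as in the schema example (`cc-by-nc-sa`) and the input values (`by-nc`). `allows_commercial_use` and `commercial_only` are hypothetical helpers, and the check is a convenience, not legal advice.

```python
from typing import Any, Dict, Iterable, List


def allows_commercial_use(license_slug: str) -> bool:
    """True for a non-empty slug that has no NonCommercial (nc) clause."""
    clauses = license_slug.lower().split("-")
    return bool(license_slug) and "nc" not in clauses


def commercial_only(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep records whose `license` field permits commercial use."""
    return [r for r in records if allows_commercial_use(r.get("license") or "")]
```

Records with a missing license are dropped rather than assumed safe. Licenses with an ND clause pass this filter because they allow commercial use, but they do not allow modified versions.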

📊 How many fields per record?

23, including title, creator, license, source URL, dimensions, tags, attribution, and direct media URL.

🎵 Does it include audio?

Yes. Set mediaType to audio to search music, sound effects, and spoken-word recordings.

🔁 Can I schedule recurring runs?

Yes. Use Apify Schedules for content-pipeline refreshes.

🌐 Which providers are covered?

50+, including Flickr, Wikimedia Commons, Europeana, Smithsonian, NASA, Rawpixel, and Biodiversity Heritage Library.

🔄 How fresh is the index?

Openverse re-indexes providers continuously. Each run hits the latest snapshot.

💳 Do I need a paid Apify plan?

No. The free plan covers preview runs. A paid plan unlocks larger downloads and scheduling.

🆘 What if a run fails?

Apify retries transient errors. Inspect logs in the Runs tab; partial datasets are preserved.

📐 Can I filter by image dimensions?

Only by aspect ratio (tall, wide, or square). The Actor has no exact-dimension filter, so filter on the width and height fields yourself after download.
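
Client-side dimension filtering can be a few lines over the downloaded records. The sketch below assumes each record carries the numeric `width` and `height` fields from the schema, either of which may be null; `filter_by_min_size` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def filter_by_min_size(records: Iterable[Dict[str, Any]],
                       min_width: int, min_height: int) -> List[Dict[str, Any]]:
    """Keep records at least `min_width` x `min_height` pixels."""
    kept = []
    for record in records:
        width, height = record.get("width"), record.get("height")
        # Dimensions can be null in the schema; such records are skipped.
        if width is None or height is None:
            continue
        if width >= min_width and height >= min_height:
            kept.append(record)
    return kept
```

For example, `filter_by_min_size(items, 1920, 1080)` keeps only images that are at least Full HD in both dimensions and drops any record whose size is unknown.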

🔌 Integrate with any app

Openverse Media Scraper connects to any cloud service via Apify integrations:

Make - Automate multi-step workflows
Zapier - Connect with 5,000+ apps
Slack - Get run notifications in your channels
Airbyte - Pipe data into your warehouse
GitHub - Trigger runs from commits and releases
Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes.

🔗 Recommended Actors

📚 Project Gutenberg Books - 75,000+ free public-domain books
📖 Open Library Books - 30M+ books and editions
🎨 Met Museum Scraper - Metropolitan Museum public-domain artworks
🌐 Wikidata Entity Search - 100M+ open knowledge-graph entities
🎬 TVMaze TV Shows - TV show metadata and episodes

💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.

🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.

⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by WordPress.org, Openverse, or any of the upstream content providers. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.

URL: https://apify.com/parseforge/openverse-media-scraper
