URL: https://apify.com/parseforge/hugging-face-model-scraper

Pricing

Pay per event

Hugging Face Model Scraper

Collect models from Hugging Face Hub via public API endpoints. Get metadata including author, downloads, likes, lastModified, task, library, license, tags and filenames.

Rating: 5.0 (3)

Developer: ParseForge (Maintained by Community)

Actor stats: 2 bookmarked · 28 total users · 6 monthly active users · last modified 24 days ago

🤖 Hugging Face Model Scraper

🚀 Collect AI model data from Hugging Face Hub in minutes. Search by keyword, task, library, license, or language. Export model names, downloads, likes, tags, and metadata. No coding, no Hugging Face account required.

🕒 Last updated: 2026-04-23 · 📊 20+ fields per model · 🔍 5 search filters · 📊 Download + like counts · 🚫 No auth required

The Hugging Face Model Scraper collects model metadata from the Hugging Face Hub, returning 20+ fields per model: model ID, author, task type, library (PyTorch, TensorFlow, JAX), license, downloads, likes, tags, languages, pipeline tag, and model card URL. Supports keyword search with task, library, license, and language filters.

Hugging Face hosts over 800,000 AI models. This Actor queries the Hub and returns structured data ready for model scouting, benchmarking, or research dashboards.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| ML engineers, AI researchers, data scientists, MLOps teams, AI product managers, VC analysts | Model scouting, benchmarking, competitive analysis, library adoption tracking, license auditing |

📋 What the Hugging Face Model Scraper does

Five search filters:

  • 🔍 Keyword search. Free-text search across model names and cards.
  • 🎯 Task filter. Text generation, image classification, translation, summarization, and more.
  • 📚 Library filter. PyTorch, TensorFlow, JAX, ONNX, Flax, etc.
  • 📜 License filter. MIT, Apache 2.0, proprietary, custom licenses.
  • 🌐 Language filter. English, Chinese, multilingual, etc.

Each model record includes model ID, author, task, library, license, downloads, likes, tags, languages, pipeline tag, last modified date, and model card URL.

💡 Why it matters: browsing Hugging Face for model comparisons means clicking through hundreds of model cards. This Actor exports structured model metadata at scale for ML benchmarking, scouting, and ecosystem analysis.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

| Input | Type | Default | Behavior |
|---|---|---|---|
| query | string | "" | Search across model names and cards. |
| task | string | "" | Task type: text-generation, image-classification, etc. |
| library | string | "" | ML library: transformers, diffusers, ONNX, etc. |
| license | string | "" | License: MIT, Apache 2.0, etc. |
| language | string | "" | Model language: en, zh, multilingual. |

Example: text generation models with Apache 2.0 license.

{
  "query": "llama",
  "task": "text-generation",
  "license": "apache-2.0"
}

Example: image classification models that use the transformers library.

{
  "task": "image-classification",
  "library": "transformers"
}

⚠️ Good to Know: Hugging Face model downloads can change rapidly. Each run captures a point-in-time snapshot of the Hub's metadata.
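The two input examples above can also be assembled in code. The following is a minimal Python sketch, not part of the Actor itself: `build_input` is a hypothetical helper, and it assumes that omitting a field behaves the same as leaving it at its `""` default.

```python
import json

# Field names come from the input table; the helper itself is illustrative.
FILTER_KEYS = ("query", "task", "library", "license", "language")


def build_input(**filters: str) -> dict:
    """Return an Actor input dict containing only the non-empty filters."""
    unknown = set(filters) - set(FILTER_KEYS)
    if unknown:
        raise ValueError(f"Unknown input field(s): {sorted(unknown)}")
    # Assumption: leaving a field out is equivalent to its "" default.
    return {key: filters[key] for key in FILTER_KEYS if filters.get(key)}


# The two examples from this section.
apache_llama = build_input(query="llama", task="text-generation", license="apache-2.0")
image_models = build_input(task="image-classification", library="transformers")

print(json.dumps(apache_llama, indent=2))
print(json.dumps(image_models, indent=2))
```

Rejecting unknown keys early is a design choice in this sketch: a misspelled filter name would otherwise be silently ignored and return a much broader result set than intended.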


📊 Output

Each model record contains 20+ fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🤖 modelId | string | "meta-llama/Llama-3-8B" |
| 👤 author | string | "meta-llama" |
| 🎯 task | string | "text-generation" |
| 📚 library | string | "transformers" |
| 📜 license | string | "llama3" |
| 📊 downloads | number | 5000000 |
| 👍 likes | number | 12000 |
| 🏷️ tags | array | ["llm", "text-generation"] |
| 🌐 languages | array | ["en"] |
| 🏷️ pipelineTag | string | "text-generation" |
| 📅 lastModified | string | "2026-03-15T10:30:00Z" |
| 📂 modelCardUrl | string | "https://huggingface.co/meta-llama/..." |
| 🔗 url | string | "https://huggingface.co/meta-llama/Llama-3-8B" |
| 🕒 scrapedAt | ISO 8601 | "2026-04-16T00:00:00.000Z" |
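For downstream code it can help to load each exported record into a typed structure. The sketch below is one possible approach, assuming the field names shown in the schema table; `ModelRecord` and its methods are hypothetical names, and only a subset of the 20+ fields is modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ModelRecord:
    """Subset of the schema table; field names follow the table."""
    modelId: str
    author: str
    task: str
    library: str
    license: str
    downloads: int
    likes: int
    tags: list = field(default_factory=list)
    languages: list = field(default_factory=list)
    lastModified: str = ""

    @classmethod
    def from_item(cls, item: dict) -> "ModelRecord":
        # Ignore fields this sketch does not model.
        known = {k: item[k] for k in cls.__dataclass_fields__ if k in item}
        return cls(**known)

    def modified_at(self) -> datetime:
        # Timestamps in the table end in "Z"; normalise for fromisoformat().
        return datetime.fromisoformat(self.lastModified.replace("Z", "+00:00"))


# Example values copied from the schema table.
record = ModelRecord.from_item({
    "modelId": "meta-llama/Llama-3-8B",
    "author": "meta-llama",
    "task": "text-generation",
    "library": "transformers",
    "license": "llama3",
    "downloads": 5000000,
    "likes": 12000,
    "tags": ["llm", "text-generation"],
    "languages": ["en"],
    "lastModified": "2026-03-15T10:30:00Z",
    "url": "https://huggingface.co/meta-llama/Llama-3-8B",
})
print(record.modelId, record.downloads, record.modified_at().date())
```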


✨ Why choose this Actor

| | Capability |
|---|---|
| 🤖 | 800,000+ models. Full Hugging Face Hub coverage. |
| 🔍 | 5 search filters. Keyword, task, library, license, language. |
| 📊 | Popularity metrics. Downloads and likes per model. |
| 📜 | License data. License type per model for compliance auditing. |
| 📚 | Library tracking. PyTorch, TensorFlow, JAX adoption data. |
| ⚡ | Scalable. From single lookups to full task-type sweeps. |
| 🚫 | No authentication. Public Hub API. |

📊 Hugging Face hosts over 800,000 AI models. Structured access to this metadata supports ML benchmarking, model scouting, and ecosystem analysis workflows.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ Hugging Face Model Scraper (this Actor) | $5 free credit, then pay-per-use | Full Hub | Live per run | keyword, task, library, license, language | ⚡ 2 min |
| Hugging Face Hub API (direct) | Free with rate limits | Full | Real-time | Many | ⏳ Hours (API client setup) |
| Manual Hub browsing | Free | Manual | Manual | UI only | 🕒 Hours per category |
| Paid AI model databases | $100-1,000/month | Multi-source | Varies | Many | 🐢 Days |

Pick this Actor when you want Hugging Face model data on demand, with task and library filters, without writing API client code.
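For comparison, the "Hugging Face Hub API (direct)" row looks roughly like the sketch below. It assumes the public `https://huggingface.co/api/models` endpoint and its `search`, `filter`, `sort`, `direction`, and `limit` query parameters as described in the Hub API documentation; it is not a description of how this Actor works internally, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

HUB_MODELS_ENDPOINT = "https://huggingface.co/api/models"


def hub_models_url(search: str = "", task: str = "", limit: int = 20) -> str:
    """Build a Hub listing URL sorted by downloads, highest first."""
    params = {"sort": "downloads", "direction": "-1", "limit": str(limit)}
    if search:
        params["search"] = search
    if task:
        # Assumption: task names are passed through the generic tag filter.
        params["filter"] = task
    return f"{HUB_MODELS_ENDPOINT}?{urlencode(params)}"


def fetch_models(url: str) -> list:
    """Fetch and decode one page of results (network call, not run here)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


url = hub_models_url(search="llama", task="text-generation", limit=5)
print(url)
```

Pagination, rate-limit handling, and field normalisation are left out of this sketch; those are the parts that account for the longer setup time in the table.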


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the Hugging Face Model Scraper page on the Apify Store.
  3. 🎯 Set input. Enter keywords, pick a task and library.
  4. 🚀 Run it. Click Start and let the Actor collect your data.
  5. 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.


💼 Business use cases

🤖 ML Engineering & Research

  • Scout models for specific tasks
  • Compare download trends across architectures
  • Track new model releases by library
  • Audit license compliance for production use

📊 AI Market Intelligence

  • Track AI model ecosystem growth
  • Analyze library adoption rates (PyTorch vs. TensorFlow)
  • Monitor competitor model releases
  • Build AI investment thesis datasets

🏢 MLOps & Platform Teams

  • Build internal model catalogs
  • Track community model updates
  • Monitor license changes across dependencies
  • Benchmark model sizes and performance

💼 VC & Strategy Teams

  • Map the AI model landscape by task
  • Track emerging architectures and frameworks
  • Analyze open-source AI momentum
  • Build competitive maps of AI model builders
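
Two of the use cases above, license auditing and library adoption tracking, reduce to short aggregations over the exported records. The sketch below assumes the `modelId`, `license`, and `library` fields from the schema table; the allow-list and the sample records are made up for illustration.

```python
from collections import Counter

# Example policy only; pick the licenses your organisation actually accepts.
ALLOWED_LICENSES = {"mit", "apache-2.0"}


def license_violations(items: list, allowed: set = ALLOWED_LICENSES) -> list:
    """Return model IDs whose license is missing or not on the allow-list."""
    return [
        item["modelId"]
        for item in items
        if (item.get("license") or "").lower() not in allowed
    ]


def library_share(items: list) -> dict:
    """Return each library's fraction of the records."""
    counts = Counter(item.get("library") or "unknown" for item in items)
    total = sum(counts.values())
    return {library: count / total for library, count in counts.items()}


# Made-up records for illustration.
sample = [
    {"modelId": "org-a/model-1", "license": "apache-2.0", "library": "transformers"},
    {"modelId": "org-b/model-2", "license": "llama3", "library": "transformers"},
    {"modelId": "org-c/model-3", "license": "MIT", "library": "diffusers"},
    {"modelId": "org-d/model-4", "license": None, "library": None},
]
print(license_violations(sample))
print(library_share(sample))
```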

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Empirical datasets for papers, thesis work, and coursework
  • Longitudinal studies tracking changes across snapshots
  • Reproducible research with cited, versioned data pulls
  • Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

  • Side projects, portfolio demos, and indie app launches
  • Data visualizations, dashboards, and infographics
  • Content research for bloggers, YouTubers, and podcasters
  • Hobbyist collections and personal trackers

🤝 Non-profit and civic

  • Transparency reporting and accountability projects
  • Advocacy campaigns backed by public-interest data
  • Community-run databases for local issues
  • Investigative journalism on public records

🧪 Experimentation

  • Prototype AI and machine-learning pipelines with real data
  • Validate product-market hypotheses before engineering spend
  • Train small domain-specific models on niche corpora
  • Test dashboard concepts with live input

🔌 Automating Hugging Face Model Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Weekly pulls keep your model intelligence database fresh.
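
A programmatic run with the Python client might look like the sketch below. It assumes the `apify-client` package's `ApifyClient(...).actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods, with `call()` returning a dict-like run object that exposes `defaultDatasetId`; check this against the client version you install. The Actor ID is taken from this page's Store URL, and the cron string is only an example of a weekly interval.

```python
ACTOR_ID = "parseforge/hugging-face-model-scraper"  # from this Actor's Store URL
WEEKLY_CRON = "0 6 * * 1"  # illustrative schedule: Mondays at 06:00


def run_scraper(token: str, run_input: dict) -> list:
    """Start the Actor, wait for it to finish, and return the dataset items."""
    # Imported lazily so this module loads without the dependency installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def top_by_downloads(items: list, n: int = 5) -> list:
    """Return (modelId, downloads) pairs for the n most-downloaded records."""
    ranked = sorted(items, key=lambda item: item.get("downloads", 0), reverse=True)
    return [(item["modelId"], item.get("downloads", 0)) for item in ranked[:n]]


# Usage, with a real API token:
#   items = run_scraper("<YOUR_APIFY_TOKEN>", {"task": "text-generation"})
#   print(top_by_downloads(items))
```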



❓ Frequently Asked Questions

🧩 How does it work?

Enter search terms and filters, click Start, and the Actor queries the Hugging Face Hub, returning one structured record per model.

📏 How accurate is the data?

Data comes from Hugging Face's public Hub API. Downloads, likes, and metadata reflect the current state of the Hub at the time of the run.

📊 Does it include download counts?

Yes. Each model record includes total downloads and likes as reported by the Hub.

📜 Can I filter by license?

Yes. Use the license filter to restrict results to MIT, Apache 2.0, or any other license type.

🎯 Which tasks are supported?

All Hugging Face pipeline tasks: text-generation, image-classification, translation, summarization, question-answering, token-classification, and dozens more.

⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor weekly and track model ecosystem changes over time.
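
Once two scheduled runs exist, tracking change over time is a join on `modelId`. The following is a minimal sketch assuming each snapshot is a list of records with the `modelId` and `downloads` fields from the schema table; the function name and sample numbers are made up.

```python
def download_growth(previous: list, current: list) -> list:
    """Return (modelId, delta) pairs for models present in both snapshots,
    sorted by the change in downloads, largest first."""
    before = {item["modelId"]: item.get("downloads", 0) for item in previous}
    growth = []
    for item in current:
        model_id = item["modelId"]
        if model_id in before:
            growth.append((model_id, item.get("downloads", 0) - before[model_id]))
    return sorted(growth, key=lambda pair: pair[1], reverse=True)


# Made-up snapshots for illustration.
last_week = [
    {"modelId": "org-a/model-1", "downloads": 1000},
    {"modelId": "org-b/model-2", "downloads": 500},
]
this_week = [
    {"modelId": "org-a/model-1", "downloads": 1200},
    {"modelId": "org-b/model-2", "downloads": 450},
    {"modelId": "org-c/model-3", "downloads": 90},  # new model, no baseline
]
print(download_growth(last_week, this_week))
```

Models that appear in only one snapshot are skipped here because they have no baseline; a fuller version could report them separately as new or removed.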

⚖️ Is this data legal to use?

The Actor uses the public Hugging Face Hub API. Review their terms of service for your specific use case.

💼 Can I use this data commercially?

Yes, for analytics, research, and internal dashboards. The metadata is public. Individual model licenses vary.

💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing. A paid plan lifts the item limit.

🔁 What happens if a run fails or gets interrupted?

Apify automatically retries transient errors. Partial datasets from failed runs are preserved.

🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.


🔌 Integrate with any app

Hugging Face Model Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications in your channels
  • Airbyte - Pipe model data into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh model data into your ML platform, or alert your team in Slack.
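
On the receiving side, a webhook handler mostly needs to pull the dataset ID out of the payload and fetch the items. The sketch below assumes Apify's default webhook payload shape (an `eventType` string plus a `resource` object carrying the run's `defaultDatasetId`) and the `https://api.apify.com/v2/datasets/{id}/items` endpoint with a `format` parameter; verify both against the Apify documentation, since custom payload templates can change the shape. The function names and the dataset ID are hypothetical.

```python
from urllib.parse import urlencode

EXPORT_FORMATS = {"json", "csv", "xlsx", "xml"}


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the download URL for a dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"Unsupported format: {fmt}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


def handle_webhook(payload: dict):
    """Return the items URL for a succeeded run, or None for other events."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    dataset_id = payload.get("resource", {}).get("defaultDatasetId")
    return dataset_items_url(dataset_id) if dataset_id else None


# Hypothetical payload, trimmed to the fields this sketch reads.
example_payload = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},
}
print(handle_webhook(example_payload))
```

Datasets that are not public also need an API token on the request, which this sketch leaves out.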


🔗 Recommended Actors

💡 Pro Tip: browse the complete ParseForge collection for more AI and data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Hugging Face, Inc. All trademarks mentioned are the property of their respective owners. Only publicly available Hub metadata is collected.
