URL: https://apify.com/brilliant_gum/huggingface-insights-scraper

Hugging Face Scraper: AI Models, Datasets, Spaces & Papers · Apify


Hugging Face Insights Scraper: Models, Datasets & Spaces

Pricing

from $0.005 / model scraped

Scrape Hugging Face models, datasets, spaces, and daily papers with downloads, likes, parameters, tags, and growth tracking between runs. Filter by pipeline, library, author, or keyword.

Rating: 0.0 (0)

Developer: Yuliia Kulakova

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 5 days ago

Hugging Face Insights Scraper

Scrape AI models, datasets, Spaces, and daily research papers from Hugging Face, with downloads, likes, parameters, growth tracking, and smart filters.


Why this scraper

Hugging Face is where AI happens: 1M+ models, 300K+ datasets, and trending research papers every day. But the site offers only a search bar and infinite scroll. There is no way to bulk-export, to compare models by parameter count, or to track which models are gaining traction this week versus last.

This scraper turns Hugging Face into a structured intelligence feed. Filter by pipeline task, ML library, author, or keyword. Get model sizes, architecture details, and popularity analytics. Track download and like growth between scheduled runs. Export to CSV, JSON, or pipe directly into your dashboard.


What you get

Models: the full picture

  • Name, author, downloads, likes, pipeline task, ML library
  • Parameter count and size tier (tiny / small / medium / large / xlarge / massive)
  • Architecture details (LlamaForCausalLM, MistralForCausalLM, etc.)
  • License, language tags, base model, gated/private status
  • Inference status (warm/cold)
  • Popularity score, engagement ratio, downloads per day, model age

Datasets: structured metadata

  • Name, author, downloads, likes, license
  • Task categories (text-generation, question-answering, etc.)
  • Size category (1K–10K, 10K–100K, 100K–1M, etc.)
  • Language tags, creation date, last modified

Spaces: AI demos and apps

  • Name, author, likes, SDK (Gradio, Streamlit, Docker)
  • Runtime info, tags, creation date

Daily Papers: cutting-edge research

  • Title, full abstract, AI-generated summary and keywords
  • Authors, upvotes, comment count
  • GitHub repo link and star count
  • Arxiv URL, thumbnail, publication date

Smart filters: get exactly what you need

  • Filter by keyword, author/org, pipeline task, ML library
  • Minimum downloads and likes thresholds
  • Parameter range (e.g., only 1B–10B models)
  • Exclude gated or private items
  • Sort by downloads, likes, trending, recently created, or recently modified
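
The actor applies these thresholds itself, but the same checks can be repeated client-side on exported items. The following is a minimal sketch, assuming items shaped like the output sample (fields `downloads`, `likes`, `parameters`). The helper name `matches` and its keyword arguments are hypothetical; they are not actor input fields.

```python
from typing import Optional


def matches(item: dict,
            min_downloads: int = 0,
            min_likes: int = 0,
            min_params: Optional[int] = None,
            max_params: Optional[int] = None) -> bool:
    """Return True if an exported item passes the thresholds (hypothetical helper)."""
    if item.get("downloads", 0) < min_downloads:
        return False
    if item.get("likes", 0) < min_likes:
        return False
    if min_params is not None or max_params is not None:
        params = item.get("parameters")
        # Items without a parameter count cannot satisfy a parameter range.
        if params is None:
            return False
        if min_params is not None and params < min_params:
            return False
        if max_params is not None and params > max_params:
            return False
    return True


items = [
    {"id": "meta-llama/Llama-3.1-8B-Instruct", "downloads": 9980754,
     "likes": 6137, "parameters": 8030261248},
    # Made-up item without a parameter count, for illustration only.
    {"id": "example/no-params", "downloads": 120000, "likes": 80},
]

# Keep only 1B-10B models with at least 10,000 downloads.
selected = [i["id"] for i in items
            if matches(i, min_downloads=10_000,
                       min_params=1_000_000_000, max_params=10_000_000_000)]
print(selected)
```

Treating a missing parameter count as a non-match is a design choice of this sketch; the actor may handle such items differently.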

Growth tracking between runs

  • Persistent snapshot store tracks downloads and likes over time
  • On subsequent runs: downloadsDelta, downloadsPerHour, likesDelta, trend (up/down/flat)
  • See which models are gaining or losing momentum
  • Perfect for scheduled monitoring of AI model trends
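
To make the growth fields concrete, here is a minimal sketch of how two snapshots of the same item could be compared. The snapshot shape (a `scrapedAt` timestamp next to `downloads` and `likes`), the rule that `trend` follows the sign of the downloads delta, and the numbers are all assumptions for illustration; the actor's actual computation is not documented here.

```python
from datetime import datetime, timezone


def growth(prev: dict, curr: dict) -> dict:
    """Compare two snapshots of one item (hypothetical snapshot shape)."""
    hours = (curr["scrapedAt"] - prev["scrapedAt"]).total_seconds() / 3600
    downloads_delta = curr["downloads"] - prev["downloads"]
    likes_delta = curr["likes"] - prev["likes"]
    # Assumption: the trend label follows the sign of the downloads delta.
    if downloads_delta > 0:
        trend = "up"
    elif downloads_delta < 0:
        trend = "down"
    else:
        trend = "flat"
    return {
        "downloadsDelta": downloads_delta,
        "downloadsPerHour": round(downloads_delta / hours, 2) if hours > 0 else None,
        "likesDelta": likes_delta,
        "trend": trend,
    }


# Made-up snapshots taken 24 hours apart.
previous = {"scrapedAt": datetime(2025, 1, 1, tzinfo=timezone.utc),
            "downloads": 1000, "likes": 10}
current = {"scrapedAt": datetime(2025, 1, 2, tzinfo=timezone.utc),
           "downloads": 1240, "likes": 12}

result = growth(previous, current)
print(result)
```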

Detailed enrichment (optional)

  • Fetch full model details: exact parameter count, architectures, model type
  • Size tier classification: from tiny (<500M) to massive (100B+)
  • Popularity score combining downloads and community engagement
  • Downloads per day normalized by model age

Example use cases

  • AI researchers: Track trending models in your field, monitor new papers daily
  • ML engineers: Find the best model for your task by filtering on pipeline, size, and popularity
  • Investors: Monitor which AI companies are gaining traction on Hugging Face
  • Data teams: Build a dataset catalog filtered by task, size, and license
  • Content creators: Track what's hot in AI this week for newsletters and reports
  • Competitive intelligence: Monitor specific orgs (OpenAI, Meta, Google) and their model releases

Input examples

Trending models right now:

{
  "resourceType": "models",
  "sort": "trending",
  "maxResults": 50
}

LLMs from Meta with full details:

{
  "resourceType": "models",
  "author": "meta-llama",
  "pipeline_tag": "text-generation",
  "sort": "downloads",
  "maxResults": 20,
  "fetchDetails": true
}

Popular code datasets:

{
  "resourceType": "datasets",
  "search": "code",
  "sort": "likes",
  "minLikes": 50,
  "maxResults": 30
}

Today's research papers:

{
  "resourceType": "papers",
  "maxResults": 50
}

Image generation models with 10K+ downloads:

{
  "resourceType": "models",
  "pipeline_tag": "text-to-image",
  "sort": "downloads",
  "minDownloads": 10000,
  "maxResults": 20
}

Output sample (model)

{
  "type": "model",
  "id": "meta-llama/Llama-3.1-8B-Instruct",
  "author": "meta-llama",
  "downloads": 9980754,
  "likes": 6137,
  "pipeline": "text-generation",
  "library": "transformers",
  "parameters": 8030261248,
  "sizeTier": "medium (3B-10B)",
  "architectures": ["LlamaForCausalLM"],
  "modelType": "llama",
  "license": "llama3.1",
  "language": ["en", "de", "fr", "it", "pt", "hi", "es", "th"],
  "popularityScore": 3208,
  "downloadsPerDay": 14157,
  "engagementRatio": 61.49,
  "ageDays": 705,
  "url": "https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct"
}
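
Records like the one above contain list-valued fields, which need flattening before they fit a CSV row. The platform offers its own CSV export, so the following is only a sketch of doing the same locally with the standard library. The column selection in `FIELDS` and the `;` separator for lists are arbitrary choices of this example.

```python
import csv
import io

# Columns to export; any other keys in an item are ignored.
FIELDS = ["id", "author", "downloads", "likes", "pipeline",
          "parameters", "sizeTier", "license", "language"]


def to_csv(items: list) -> str:
    """Serialize scraped items to CSV text, joining list values with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Flatten list-valued fields so each record stays on one row.
        row = {key: ";".join(map(str, value)) if isinstance(value, list) else value
               for key, value in item.items()}
        writer.writerow(row)
    return buffer.getvalue()


# Subset of the output sample shown above.
sample = {
    "type": "model",
    "id": "meta-llama/Llama-3.1-8B-Instruct",
    "author": "meta-llama",
    "downloads": 9980754,
    "likes": 6137,
    "pipeline": "text-generation",
    "parameters": 8030261248,
    "sizeTier": "medium (3B-10B)",
    "license": "llama3.1",
    "language": ["en", "de", "fr", "it", "pt", "hi", "es", "th"],
}

csv_text = to_csv([sample])
print(csv_text)
```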

Integrations

Connect this scraper to any tool in your stack:

  • Google Sheets: auto-sync model rankings weekly
  • Slack / Discord: get alerts when a new trending model appears
  • Webhooks: trigger your pipeline when new data lands
  • API: fetch results programmatically from any language
  • Zapier / Make: connect to 5000+ apps without code
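
As an illustration of the API route, the sketch below builds a request against the Apify API v2 `run-sync-get-dataset-items` endpoint using only the standard library. The actor ID is derived from this page's URL. The endpoint path, the `~` separator in the ID, and Bearer-token authentication reflect the Apify API as I understand it, so check them against the current Apify documentation. The token is a placeholder, and `fetch_items` is defined but not called here because it needs a valid token and network access.

```python
import json
import urllib.request

# In API paths the "/" between username and actor name is written as "~".
ACTOR_ID = "brilliant_gum~huggingface-insights-scraper"
API_URL = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"


def build_request(run_input: dict, token: str) -> urllib.request.Request:
    """Prepare a POST request that runs the actor and returns its dataset items."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_items(run_input: dict, token: str, timeout: float = 300.0) -> list:
    """Send the request and parse the JSON array of results."""
    with urllib.request.urlopen(build_request(run_input, token), timeout=timeout) as resp:
        return json.load(resp)


# Same input as the "Trending models right now" example above.
request = build_request(
    {"resourceType": "models", "sort": "trending", "maxResults": 50},
    token="<YOUR_APIFY_TOKEN>",  # placeholder, replace with a real token
)
print(request.get_method(), request.full_url)
```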

Cost

This actor uses pay-per-result pricing at $5.00 per 1,000 results ($0.005 per item). You pay only for the data you get, with no platform usage fees on top.

Example run                          Results  Cost
Top 50 trending models               50       $0.25
All meta-llama models with details   ~20      $0.10
100 text-to-image models             100      $0.50
Today's research papers              ~50      $0.25
1,000 most downloaded models         1,000    $5.00
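
The figures above follow directly from the per-result price. A minimal sketch of the arithmetic, using `Decimal` to avoid floating-point rounding; the function name `estimate_cost` is hypothetical:

```python
from decimal import Decimal

PRICE_PER_RESULT = Decimal("0.005")  # $5.00 per 1,000 results


def estimate_cost(results: int) -> Decimal:
    """Estimated charge in USD for a run returning `results` items."""
    return PRICE_PER_RESULT * results


for count in (50, 100, 1000):
    print(f"{count} results: ${estimate_cost(count):.2f}")
```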

Platform compute costs are minimal: a typical 100-item run finishes in under 10 seconds.


Limitations

  • Hugging Face API rate limit: 500 requests per 5 minutes (handled automatically with throttling)
  • Parameter count requires fetchDetails: true and is only available for models with safetensors weights
  • Papers endpoint returns daily papers only (no historical archive search)
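
How the actor throttles its requests is not described on this page. One common way to stay under a limit such as 500 requests per 5 minutes is a sliding-window throttle, sketched below. The class name and its methods are hypothetical, and the demo uses a small window and explicit timestamps so the behaviour is easy to follow.

```python
from collections import deque


class SlidingWindowThrottle:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests: int = 500, window_seconds: float = 300.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self, now: float) -> float:
        """Seconds to wait before another request is allowed at time `now`."""
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The window is full: wait until the oldest request expires.
        return self._sent[0] + self.window_seconds - now

    def record(self, now: float) -> None:
        """Register that a request was sent at time `now`."""
        self._sent.append(now)


# Demo with a tiny limit: 3 requests per 10 seconds.
throttle = SlidingWindowThrottle(max_requests=3, window_seconds=10.0)
for t in (0.0, 1.0, 2.0):
    throttle.record(t)

delay = throttle.wait_time(3.0)        # window is full until t = 10.0
later_delay = throttle.wait_time(10.0)  # oldest request has expired
print(delay, later_delay)
```

In real use the timestamps would come from `time.monotonic()`, with a `time.sleep(delay)` before each request whenever `delay` is positive.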
