URL: https://apify.com/tzmyk/huggingface-models-scraper

HuggingFace Models Scraper · Apify


Pricing

from $2.00 per 1,000 models scraped

HuggingFace Models Scraper

Scrapes AI/ML models from HuggingFace (huggingface.co/models) via the official API. Extracts model ID, downloads, likes, task type, library, tags, and more. Supports search, author/org filter, pipeline tag filter, and sort order.

Rating: 0.0 (0)

Developer: tzmyk

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 3 months ago

Scrape AI/ML models from HuggingFace, the world's largest repository of open-source machine learning models.

Extracts structured data including model ID, download counts, likes, task type (pipeline tag), ML library, tags, gated status, and timestamps. Powered by the official HuggingFace API: no web scraping, no rate-limit surprises.

What it does

  • Fetches models from the HuggingFace public API with full metadata
  • Supports filtering by keyword search, author/organization, task type, and library
  • Supports sorting by downloads, likes, creation date, or last-modified date
  • Paginates automatically up to your specified limit (up to 10,000 models)

Use cases

  • AI research: Track which models are trending by downloads or likes
  • Competitive intelligence: Monitor what models a specific organization has published
  • Dataset building: Collect model metadata for ML benchmarks or surveys
  • Lead generation: Find organizations actively publishing models in your domain
  • Content & newsletters: Curate the most popular or newest models by task type

Input

Field        Type     Default    Description
search       string   (none)     Keyword search to filter models
author       string   (none)     Filter by author or organization (e.g. meta-llama)
pipelineTag  string   (none)     Filter by task type (e.g. text-generation, image-classification)
libraryName  string   (none)     Filter by ML library (e.g. transformers, diffusers)
sort         select   downloads  Sort by: downloads, likes, createdAt, lastModified
maxModels    integer  100        Max models to return (1 to 10,000)
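
The actor's own validation code is not shown on this page, so the following is only a minimal sketch of how the fields, defaults, and limits in the table above could be checked. The function name `normalize_input` and the exact error messages are hypothetical.

```python
ALLOWED_SORTS = {"downloads", "likes", "createdAt", "lastModified"}
STRING_FIELDS = ("search", "author", "pipelineTag", "libraryName")


def normalize_input(raw: dict) -> dict:
    """Return a cleaned copy of the input, or raise ValueError with a clear message.

    Hypothetical helper: it mirrors the documented fields and defaults only.
    """
    cleaned = {}
    for name in STRING_FIELDS:
        value = raw.get(name)
        if value is None:
            continue  # optional filter left unset
        if not isinstance(value, str):
            raise ValueError(f"{name} must be a string, got {type(value).__name__}")
        value = value.strip()
        if value:
            cleaned[name] = value

    sort = raw.get("sort", "downloads")  # documented default
    if not isinstance(sort, str) or sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}, got {sort!r}")
    cleaned["sort"] = sort

    max_models = raw.get("maxModels", 100)  # documented default
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_models, bool) or not isinstance(max_models, int):
        raise ValueError("maxModels must be an integer")
    if not 1 <= max_models <= 10_000:
        raise ValueError("maxModels must be between 1 and 10,000")
    cleaned["maxModels"] = max_models
    return cleaned
```

Calling `normalize_input({})` yields only the two defaulted fields, `sort` and `maxModels`; unset filters are omitted rather than stored as empty strings.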

Example input

{
    "search": "llama",
    "pipelineTag": "text-generation",
    "sort": "downloads",
    "maxModels": 50
}
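
To make the link between this input and the underlying API call concrete, here is a rough sketch of how such an input might translate into a request URL for the public `https://huggingface.co/api/models` endpoint. The mapping is an assumption, not the actor's confirmed behavior: it supposes that task type and library are both sent as `filter` tag values and that results are requested in descending order. `build_models_url` and `page_size` are hypothetical names.

```python
from urllib.parse import urlencode

API_URL = "https://huggingface.co/api/models"


def build_models_url(actor_input: dict, page_size: int = 100) -> str:
    """Build a first-page request URL from the actor input (illustrative only)."""
    params = []
    if actor_input.get("search"):
        params.append(("search", actor_input["search"]))
    if actor_input.get("author"):
        params.append(("author", actor_input["author"]))
    # Assumption: pipeline tag and library are both passed as tag filters.
    for key in ("pipelineTag", "libraryName"):
        if actor_input.get(key):
            params.append(("filter", actor_input[key]))
    params.append(("sort", actor_input.get("sort", "downloads")))
    params.append(("direction", "-1"))  # assumption: descending order
    # Never ask for more rows than the caller wants in total.
    limit = min(page_size, actor_input.get("maxModels", 100))
    params.append(("limit", str(limit)))
    return f"{API_URL}?{urlencode(params)}"


example = {"search": "llama", "pipelineTag": "text-generation",
           "sort": "downloads", "maxModels": 50}
print(build_models_url(example))
```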

Output

One record per model saved to the default dataset.

Field         Type          Description
modelId       string        Full model ID (e.g. meta-llama/Llama-3.1-8B-Instruct)
author        string|null   Author or organization name
downloads     number|null   Download count (30-day rolling total as reported by HuggingFace)
likes         number|null   Like count
pipelineTag   string|null   Task type (e.g. text-generation)
libraryName   string|null   ML library (e.g. transformers)
tags          string[]      All tags including datasets, licenses, frameworks
gated         boolean|null  Whether model access requires approval
createdAt     string|null   Creation date (ISO 8601)
lastModified  string|null   Last modified date (ISO 8601)
url           string        Direct URL to the model page
scrapedAt     string        Timestamp when this record was scraped

Example output

{
    "modelId": "sentence-transformers/all-MiniLM-L6-v2",
    "author": "sentence-transformers",
    "downloads": 208493944,
    "likes": 4598,
    "pipelineTag": "sentence-similarity",
    "libraryName": "sentence-transformers",
    "tags": ["sentence-transformers", "pytorch", "onnx", "license:apache-2.0"],
    "gated": false,
    "createdAt": "2022-03-02T23:29:05.000Z",
    "lastModified": "2025-03-06T13:37:44.000Z",
    "url": "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2",
    "scrapedAt": "2026-03-22T03:46:43.767Z"
}
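
As an illustration of the output shape, the sketch below maps a raw API item to a record with the fields listed above. It is not the actor's code. The raw key names (`id`, `pipeline_tag`, `library_name`) and the handling of `gated` (coerced to a boolean in case the API reports a string mode) are assumptions, and `to_record` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional


def to_record(raw: dict, now: Optional[datetime] = None) -> dict:
    """Convert one raw API item into the documented output record (sketch)."""
    model_id = raw.get("id") or raw.get("modelId")
    if not model_id:
        raise ValueError("raw record has no model ID")
    # Fall back to the part before "/" when no explicit author is present.
    author = raw.get("author")
    if not author and "/" in model_id:
        author = model_id.split("/", 1)[0]
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="milliseconds")
    return {
        "modelId": model_id,
        "author": author,
        "downloads": raw.get("downloads"),
        "likes": raw.get("likes"),
        "pipelineTag": raw.get("pipeline_tag"),
        "libraryName": raw.get("library_name"),
        "tags": list(raw.get("tags") or []),
        # Assumption: any truthy value means access requires approval.
        "gated": bool(raw["gated"]) if "gated" in raw else None,
        "createdAt": raw.get("createdAt"),
        "lastModified": raw.get("lastModified"),
        "url": f"https://huggingface.co/{model_id}",
        "scrapedAt": stamp.replace("+00:00", "Z"),
    }
```

Fields absent from the raw item come through as `null` (Python `None`), which matches the nullable types in the output table.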

Features

  • Official API: Uses the HuggingFace REST API directly; no fragile HTML parsing
  • Automatic pagination: Fetches all pages until your limit is reached
  • Polite rate limiting: 500ms delay between API calls
  • Robust input validation: Clear error messages for invalid inputs
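
The pagination and delay behavior listed above could be combined roughly as follows. This is a sketch under assumptions rather than the actor's implementation: it supposes the API advertises the next page through a `Link` response header with `rel="next"`, and the names `paginate`, `fetch_page`, and `next_link` are hypothetical. The HTTP call is injected as `fetch_page` so the loop can be exercised without network access.

```python
import re
import time
from typing import Callable, Iterator, Optional, Tuple

Page = Tuple[list, Optional[str]]  # (items on this page, next-page URL or None)
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="next"')


def next_link(link_header: Optional[str]) -> Optional[str]:
    """Extract the rel="next" URL from a Link header, if there is one."""
    match = _NEXT_LINK.search(link_header or "")
    return match.group(1) if match else None


def paginate(fetch_page: Callable[[str], Page], first_url: str, max_models: int,
             delay_s: float = 0.5,
             sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield items page by page until max_models is reached or pages run out."""
    url: Optional[str] = first_url
    yielded = 0
    while url and yielded < max_models:
        items, url = fetch_page(url)
        if not items:
            return  # an empty page means there is nothing more to fetch
        for item in items:
            if yielded >= max_models:
                return
            yield item
            yielded += 1
        if url and yielded < max_models:
            sleep(delay_s)  # pause only when another request will follow
```

A real `fetch_page` would perform the HTTP GET, parse the JSON body, and return `(items, next_link(response_headers.get("Link")))`.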

Notes

  • Results are limited to public models only; private models are not accessible
  • The gated field indicates whether a model requires access approval from the author
  • The HuggingFace API does not handle every sort order equally well when combined with search; sorting by downloads works best for broad searches
  • Download counts are 30-day rolling totals as reported by HuggingFace
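
Because search and sort do not always combine well on the API side, one possible workaround is to fetch with the downloads sort and re-rank the dataset records locally. The helper below is a hypothetical post-processing sketch that operates on records shaped like the example output; it is not part of the actor.

```python
def top_models(records, key="likes", n=10, include_gated=True):
    """Return the n records with the highest value for key (sketch)."""
    rows = [r for r in records if include_gated or not r.get("gated")]
    # Treat a missing or null count as 0 so those records sort last.
    return sorted(rows, key=lambda r: r.get(key) or 0, reverse=True)[:n]


sample = [
    {"modelId": "a/one", "likes": 5, "downloads": 900, "gated": False},
    {"modelId": "b/two", "likes": 40, "downloads": 100, "gated": True},
    {"modelId": "c/three", "likes": None, "downloads": 500, "gated": False},
]
print([r["modelId"] for r in top_models(sample, key="likes", n=2)])
```

Passing `include_gated=False` drops models whose `gated` field is true before ranking, which can be useful when only freely accessible models are of interest.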

Support

Found a bug or have a feature request? Please open an issue or contact the author through the Apify platform.
