URL: https://apify.com/jungle_synthesizer/huggingface-model-scraper

HuggingFace Model Scraper - AI/ML Model Data · Apify


Pricing

Pay per event

Scrape AI/ML model metadata from the HuggingFace Hub. Extract model names, task types, download counts, likes, libraries, authors, tags, licenses, model sizes, and model card excerpts. Filter by task type, library, author, and search query.

Rating

0.0

(0)

Developer

BowTiedRaccoon

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 24 days ago

Extract comprehensive AI/ML model metadata from the HuggingFace Hub. The HuggingFace Hub hosts over 1 million public models and is the central repository for the AI/ML community. This actor queries the public HuggingFace API to retrieve model names, task types, download counts, popularity metrics, licenses, libraries, and model card excerpts.

What You Can Do

  • Browse top models sorted by total downloads, likes, trending score, or recently modified
  • Filter by task type (text-generation, image-classification, sentence-similarity, and 25+ other pipeline tags)
  • Filter by ML library (transformers, diffusers, sentence-transformers, GGUF, ONNX, and more)
  • Filter by author/organization (meta-llama, google, microsoft, BAAI, Qwen, etc.)
  • Search by keyword across model names and descriptions
  • Extract model card excerpts: the first 500 characters of each model's README
  • Get Spaces usage: the count of HuggingFace Spaces using each model
  • Retrieve dataset provenance: the datasets referenced in model card metadata

Use Cases

  • AI market intelligence: track which models are gaining downloads and likes
  • VC and investment research: monitor model ecosystem trends by organization
  • Enterprise model evaluation: shortlist foundation models by task type, license, and popularity
  • Competitive analysis: compare model adoption across ML libraries and providers
  • Dataset discovery: find which training datasets are most commonly used

Input Parameters

| Parameter | Description | Default |
|---|---|---|
| searchQuery | Search across model names and descriptions | (none) |
| pipelineTag | Filter by task type (text-generation, image-classification, etc.) | All tasks |
| library | Filter by ML framework (transformers, diffusers, gguf, etc.) | All libraries |
| author | Filter by author or organization username | All authors |
| sortBy | Sort by downloads, likes, lastModified, or trending | downloads |
| maxItems | Maximum number of records to return (0 = unlimited) | 10 |
| proxyConfiguration | Optional proxy settings | Disabled |
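
As a rough sketch, the parameters above could be assembled into an input object like the one below. Only the parameter names, the allowed `sortBy` values, and the defaults come from the table; the helper `build_input`, its validation rules, and the choice to omit unset filters (on the assumption that an omitted filter means "all") are illustrative and not part of the actor.

```python
import json

# Sort options listed in the parameter table.
ALLOWED_SORT = {"downloads", "likes", "lastModified", "trending"}


def build_input(search_query=None, pipeline_tag=None, library=None,
                author=None, sort_by="downloads", max_items=10):
    """Hypothetical helper: build an input dict using the documented parameter names."""
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT)}")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")

    run_input = {"sortBy": sort_by, "maxItems": max_items}
    optional = {
        "searchQuery": search_query,
        "pipelineTag": pipeline_tag,
        "library": library,
        "author": author,
    }
    # Assumption: leaving a filter out is equivalent to "all".
    run_input.update({key: value for key, value in optional.items() if value})
    return run_input


example_input = build_input(pipeline_tag="text-generation",
                            author="meta-llama", max_items=25)
print(json.dumps(example_input, indent=2))
```

The resulting JSON can be pasted into the actor's input editor or passed to whichever client is used to start the run.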

Output Fields

Each record contains:

| Field | Type | Description |
|---|---|---|
| model_id | string | Full model identifier (e.g., meta-llama/Llama-3.3-70B-Instruct) |
| model_name | string | Short model name without the author prefix |
| pipeline_tag | string | Primary task type (text-generation, sentence-similarity, etc.) |
| downloads_total | integer | Total all-time download count |
| downloads_30d | integer | Download count in the last 30 days (when available) |
| likes | integer | Number of likes on HuggingFace |
| library | string | Primary ML library (transformers, diffusers, etc.) |
| author | string | Model author or organization username |
| tags | array | Tags including language, dataset references, and framework tags |
| license | string | License identifier (apache-2.0, mit, llama3.3, etc.) |
| model_size_params | string | Parameter count if encoded in tags (7B, 13B, 70B, etc.) |
| last_modified | string | ISO 8601 timestamp of last update |
| readme_excerpt | string | First 500 characters of the model card README |
| spaces_count | integer | Number of HuggingFace Spaces using this model |
| datasets_used | array | Datasets referenced in model card metadata |
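
Two of these fields are derived rather than read directly: `model_name` drops the author prefix from `model_id`, and `model_size_params` is populated only when a size is encoded in the tags. The sketch below shows one way such values could be derived. The function names and the regular expression are assumptions for illustration; the actor's actual extraction logic is not documented here.

```python
import re

# Assumed pattern for size tags such as "7B", "13B", "70B", or "1.5B".
SIZE_TAG = re.compile(r"^\d+(?:\.\d+)?[BM]$", re.IGNORECASE)


def short_name(model_id):
    """Return the part after the author prefix, or the whole id if there is none."""
    return model_id.split("/", 1)[-1]


def size_from_tags(tags):
    """Return the first tag that looks like a parameter count, else None."""
    for tag in tags:
        if SIZE_TAG.match(tag):
            return tag.upper()
    return None


print(short_name("meta-llama/Llama-3.3-70B-Instruct"))
print(size_from_tags(["transformers", "7b", "en"]))
print(size_from_tags(["sentence-transformers", "pytorch", "bert", "en"]))
```

A tag-based approach like this returns nothing for models whose size appears only in the name, which would be consistent with `model_size_params` being null in some records.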

Example Output

{
  "model_id": "sentence-transformers/all-MiniLM-L6-v2",
  "model_name": "all-MiniLM-L6-v2",
  "pipeline_tag": "sentence-similarity",
  "downloads_total": 262278076,
  "downloads_30d": null,
  "likes": 4833,
  "library": "sentence-transformers",
  "author": "sentence-transformers",
  "tags": ["sentence-transformers", "pytorch", "onnx", "safetensors", "bert", "en"],
  "license": "apache-2.0",
  "model_size_params": null,
  "last_modified": "2025-03-06T13:37:44.000Z",
  "readme_excerpt": "# all-MiniLM-L6-v2\nThis is a sentence-transformers model...",
  "spaces_count": 100,
  "datasets_used": ["s2orc", "ms_marco", "gooaq", "natural_questions"]
}
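
Records in this shape can be filtered and ranked with a few lines of code, for instance to shortlist models by license and task. The following is a minimal sketch: the first record reuses values from the example output, the other two are hypothetical, and the `shortlist` helper is an illustration rather than anything the actor provides. Fields such as `downloads_total` and `downloads_30d` may be null, so the sort treats a missing count as zero.

```python
records = [
    {   # values taken from the example output
        "model_id": "sentence-transformers/all-MiniLM-L6-v2",
        "pipeline_tag": "sentence-similarity",
        "downloads_total": 262278076,
        "license": "apache-2.0",
    },
    {   # hypothetical record, for illustration only
        "model_id": "example-org/example-embedder",
        "pipeline_tag": "sentence-similarity",
        "downloads_total": None,
        "license": "mit",
    },
    {   # hypothetical record with a license outside the allow-list
        "model_id": "example-org/restricted-embedder",
        "pipeline_tag": "sentence-similarity",
        "downloads_total": 5000,
        "license": "other",
    },
]


def shortlist(items, allowed_licenses, pipeline_tag=None):
    """Keep records with an allowed license (and task, if given), most downloaded first."""
    picked = [
        item for item in items
        if item.get("license") in allowed_licenses
        and (pipeline_tag is None or item.get("pipeline_tag") == pipeline_tag)
    ]
    # Treat a null download count as 0 so sorting never compares None with int.
    return sorted(picked, key=lambda item: item.get("downloads_total") or 0, reverse=True)


for item in shortlist(records, {"apache-2.0", "mit"}, "sentence-similarity"):
    print(item["model_id"], item["license"])
```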

Technical Notes

  • No authentication required: uses the public HuggingFace Hub API
  • No proxy required: the API is publicly accessible without IP restrictions
  • Rate limits: unauthenticated limits are generous; a courtesy 100 ms delay is applied between detail fetches
  • Pagination: cursor-based pagination is handled automatically, allowing retrieval of any number of models
  • Two-pass enrichment: basic metadata is retrieved from the list endpoint; detailed fields (readme_excerpt, spaces_count, datasets_used) are fetched from the model detail endpoint
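
The pagination and two-pass enrichment notes above can be sketched as a small generator. This is not the actor's source code: `iter_enriched`, `fetch_page`, and `fetch_detail` are hypothetical names, and the two fetchers are replaced here by in-memory stand-ins so the control flow can be followed without network access. The sketch assumes each list page returns its items together with a cursor for the next page, or `None` when no pages remain.

```python
import time


def iter_enriched(fetch_page, fetch_detail, max_items=10, delay_s=0.1):
    """Follow cursors through list pages and merge detail fields into each record.

    max_items=0 means unlimited, matching the maxItems input parameter.
    """
    cursor = None
    count = 0
    while True:
        items, cursor = fetch_page(cursor)            # pass 1: list endpoint
        for item in items:
            detail = fetch_detail(item["model_id"])   # pass 2: detail endpoint
            yield {**item, **detail}
            count += 1
            if max_items and count >= max_items:
                return
            time.sleep(delay_s)                       # courtesy delay between detail fetches
        if cursor is None:                            # no further pages
            return


# In-memory stand-ins for the two endpoints (illustrative data only).
PAGES = {
    None: ([{"model_id": "org/a"}, {"model_id": "org/b"}], "cursor-1"),
    "cursor-1": ([{"model_id": "org/c"}], None),
}


def fake_page(cursor):
    return PAGES[cursor]


def fake_detail(model_id):
    return {"spaces_count": 0, "datasets_used": []}


all_records = list(iter_enriched(fake_page, fake_detail, max_items=0, delay_s=0))
print([record["model_id"] for record in all_records])
```

Checking the item limit before sleeping avoids an unnecessary delay after the last record, and checking it before the next page request avoids fetching a page that would be discarded.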

Data Source

HuggingFace Hub API: https://huggingface.co/api/models
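
For orientation, a list request against this endpoint might be built as follows. This is a sketch under assumptions: the query parameter names (`search`, `author`, `filter`, `sort`, `direction`, `limit`) reflect the public Hub API as commonly documented, but the exact mapping the actor uses for its inputs is not stated on this page, and `build_list_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://huggingface.co/api/models"


def build_list_url(search=None, author=None, tags=(), sort="downloads", limit=10):
    """Hypothetical helper: compose a list-endpoint URL from optional filters."""
    params = []
    if search:
        params.append(("search", search))
    if author:
        params.append(("author", author))
    # Assumption: task types and libraries are passed as tag filters.
    params.extend(("filter", tag) for tag in tags)
    params += [("sort", sort), ("direction", "-1"), ("limit", str(limit))]
    return f"{BASE_URL}?{urlencode(params)}"


url = build_list_url(author="google", tags=["text-generation"], limit=5)
print(url)
# https://huggingface.co/api/models?author=google&filter=text-generation&sort=downloads&direction=-1&limit=5
```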
