URL: https://apify.com/cloud9_ai/github-scraper

GitHub Repository Scraper - Developer & Open Source Data · Apify


Pricing

from $2.00 / 1,000 results

GitHub Repository Scraper

Scrape GitHub repositories, users, and trending projects via REST API. Extract repo names, stars, forks, languages, descriptions, and contributor data.

Rating: 0.0 (0)

Developer: cloud9

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 20
  • Monthly active users: 2
  • Last modified: 2 months ago

GitHub Scraper

Scrape GitHub repositories, users, and trending projects via the GitHub REST API.

Features

  • 3 Scraping Modes:

    • searchRepos: Search for repositories by keywords
    • searchUsers: Search for GitHub users
    • trending: Get trending repositories (created in the last 7 days, sorted by stars)
  • Filters:

    • Programming language filter
    • Sort options: stars, forks, updated, best-match
    • Configurable max results (1-100)
  • No Authentication Required: Uses GitHub's public REST API (60 requests/hour unauthenticated)
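
The three modes can be pictured as different queries against GitHub's public search endpoints. The following is a minimal sketch of one plausible mapping, not the Actor's actual source: the function name `build_search_url`, the `today` parameter, and the choice to express "created in the last 7 days" with the `created:` search qualifier are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def build_search_url(mode: str, search_query: str = "",
                     language: Optional[str] = None,
                     sort: str = "best-match", max_results: int = 30,
                     today: Optional[date] = None) -> str:
    """Translate Actor-style input into a GitHub search URL (assumed mapping)."""
    today = today or date.today()
    terms = []
    if mode == "trending":
        # "Created in the last 7 days, sorted by stars", written with the
        # `created:` qualifier. This is a guess at how the mode works.
        terms.append(f"created:>{(today - timedelta(days=7)).isoformat()}")
        sort = "stars"
    elif mode in ("searchRepos", "searchUsers"):
        terms.append(search_query)
    else:
        raise ValueError(f"unknown mode: {mode}")
    if language:
        terms.append(f"language:{language}")

    params = {"q": " ".join(t for t in terms if t), "per_page": max_results}
    # GitHub falls back to best-match ordering when `sort` is omitted,
    # so only explicit sort fields are sent.
    if sort != "best-match":
        params["sort"] = sort
        params["order"] = "desc"
    endpoint = "users" if mode == "searchUsers" else "repositories"
    return f"{API_ROOT}/search/{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("trending", language="TypeScript", max_results=50))
```

Because the search endpoints accept up to 100 items per page, a run capped at 100 results would fit in a single search request under this mapping.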

Input Parameters

Field              | Type    | Required | Description
mode               | String  | Yes      | Scraping mode: searchRepos, searchUsers, or trending
searchQuery        | String  | No       | Search keywords (not used for trending mode)
language           | String  | No       | Filter by programming language (e.g., Python, JavaScript)
sort               | String  | No       | Sort by: stars, forks, updated, or best-match
maxResults         | Integer | No       | Maximum results to scrape (1-100, default: 30)
proxyConfiguration | Object  | No       | Proxy settings for API requests
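
Before starting a run, it can help to check an input object against the constraints listed above. The sketch below is a hypothetical client-side validator, not part of the Actor: `normalize_input` is an illustrative name, and the default of `best-match` for `sort` is an assumption, since the table does not state one.

```python
from typing import Any, Dict

MODES = {"searchRepos", "searchUsers", "trending"}
SORTS = {"stars", "forks", "updated", "best-match"}


def normalize_input(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Validate Actor input and fill in documented defaults."""
    mode = raw.get("mode")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")

    # Assumed default: the parameter table does not document one for `sort`.
    sort = raw.get("sort", "best-match")
    if sort not in SORTS:
        raise ValueError(f"sort must be one of {sorted(SORTS)}")

    # Documented range is 1-100 with a default of 30.
    max_results = raw.get("maxResults", 30)
    if (isinstance(max_results, bool) or not isinstance(max_results, int)
            or not 1 <= max_results <= 100):
        raise ValueError("maxResults must be an integer between 1 and 100")

    cleaned = {"mode": mode, "sort": sort, "maxResults": max_results}
    # searchQuery is documented as unused in trending mode, so it is dropped.
    if mode != "trending" and raw.get("searchQuery"):
        cleaned["searchQuery"] = raw["searchQuery"]
    for key in ("language", "proxyConfiguration"):
        if raw.get(key):
            cleaned[key] = raw[key]
    return cleaned
```

Calling `normalize_input({"mode": "trending", "language": "TypeScript"})` fills in `maxResults` as 30, while a value such as 101 raises `ValueError`.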

Example Input

Search Repositories

{
  "mode": "searchRepos",
  "searchQuery": "machine learning",
  "language": "Python",
  "sort": "stars",
  "maxResults": 30
}

Search Users

{
  "mode": "searchUsers",
  "searchQuery": "javascript developer",
  "maxResults": 20
}

Trending Repositories

{
  "mode": "trending",
  "language": "TypeScript",
  "maxResults": 50
}

Output Data

Repository Output

{
  "name": "tensorflow",
  "fullName": "tensorflow/tensorflow",
  "description": "An Open Source Machine Learning Framework for Everyone",
  "url": "https://github.com/tensorflow/tensorflow",
  "stars": 185000,
  "forks": 74000,
  "language": "Python",
  "topics": ["machine-learning", "deep-learning", "tensorflow"],
  "owner": "tensorflow",
  "ownerUrl": "https://github.com/tensorflow",
  "createdAt": "2015-11-07T01:19:20Z",
  "updatedAt": "2024-02-14T12:34:56Z",
  "openIssues": 2500,
  "watchers": 185000,
  "defaultBranch": "master",
  "license": "Apache License 2.0",
  "homepage": "https://www.tensorflow.org"
}
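
The output keys above line up closely with fields in a GitHub REST API repository object. As a rough sketch of that correspondence (the function `to_repo_record` is hypothetical, and the Actor's real mapping may differ), the snake_case API fields could be renamed like this:

```python
from typing import Any, Dict


def to_repo_record(item: Dict[str, Any]) -> Dict[str, Any]:
    """Map a GitHub API repository object onto the output shape shown above."""
    # `owner` and `license` are nested objects; `license` is null when
    # GitHub detects no license, so both are guarded.
    owner = item.get("owner") or {}
    license_info = item.get("license") or {}
    return {
        "name": item.get("name"),
        "fullName": item.get("full_name"),
        "description": item.get("description"),
        "url": item.get("html_url"),
        "stars": item.get("stargazers_count"),
        "forks": item.get("forks_count"),
        "language": item.get("language"),
        "topics": item.get("topics") or [],
        "owner": owner.get("login"),
        "ownerUrl": owner.get("html_url"),
        "createdAt": item.get("created_at"),
        "updatedAt": item.get("updated_at"),
        "openIssues": item.get("open_issues_count"),
        "watchers": item.get("watchers_count"),
        "defaultBranch": item.get("default_branch"),
        "license": license_info.get("name"),
        "homepage": item.get("homepage"),
    }
```

In the REST API, `watchers_count` carries the same value as `stargazers_count`, which is consistent with `stars` and `watchers` both reading 185000 in the sample record.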

User Output

{
  "login": "torvalds",
  "url": "https://github.com/torvalds",
  "avatarUrl": "https://avatars.githubusercontent.com/u/1024025",
  "type": "User",
  "publicRepos": 6,
  "followers": 180000,
  "following": 0,
  "createdAt": "2011-09-03T15:26:22Z",
  "bio": "Creator of Linux and Git",
  "company": "Linux Foundation",
  "location": "Portland, OR",
  "blog": "https://torvalds-family.blogspot.com"
}

Rate Limits

  • Unauthenticated API: 60 requests per hour
  • Rate Limit Handling: Automatic wait and retry if rate limit is hit
  • Request Delay: 1.5 seconds between requests to avoid hitting limits
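
The wait-and-retry behaviour described above can be sketched as a small function that decides how long to pause before the next request. This is an illustration under assumptions, not the Actor's code: `seconds_until_retry` is a hypothetical helper, and it relies on GitHub's `x-ratelimit-remaining`, `x-ratelimit-reset`, and `retry-after` response headers, with `retry-after` assumed to be given in seconds.

```python
import time
from typing import Mapping, Optional

REQUEST_DELAY_SECONDS = 1.5  # fixed spacing between requests, per the list above


def seconds_until_retry(status: int, headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to sleep before sending the next request."""
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}
    if status in (403, 429):
        if "retry-after" in h:
            # Assumption: the value is a number of seconds, not an HTTP date.
            return float(h["retry-after"])
        if h.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in h:
            # The reset header is a UTC epoch timestamp in seconds.
            return max(0.0, float(h["x-ratelimit-reset"]) - now)
    return REQUEST_DELAY_SECONDS
```

A caller would sleep for the returned number of seconds and then either retry the failed request or move on to the next one. Note that 60 requests per hour averages one request per 60 seconds, so the 1.5-second delay only smooths bursts; on a long run it is the reset-based wait that would keep requests within the quota.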

Technical Details

  • Uses GitHub REST API v3
  • Built with Apify SDK 3.0 and Crawlee 3.0
  • TypeScript for type safety
  • gotScraping for HTTP requests with proxy support
  • Multi-stage Docker build for optimized image size

Local Development

# Install dependencies
npm install
# Build TypeScript
npm run build
# Run locally
npm start

Deployment

Deploy to Apify platform:

apify push

Use Cases

  • Repository discovery and analysis
  • Trending technology tracking
  • Developer community research
  • Open source project monitoring
  • Programming language popularity tracking

License

MIT
