Pricing
from $0.01 / 1,000 results
GitHub Repository Scraper
Scrape GitHub repos by URL, search, or trending. Extract stars, forks, topics, languages, contributors & more. No login needed.
Extract comprehensive data from GitHub repositories by direct URL, keyword search, or trending. No login or API keys required.
Perfect for competitive analysis, lead generation, market research, AI training data, and developer tooling pipelines.
3 Modes: Pick One
Note: Each mode uses different input fields. Only fill in the fields for the mode you choose.
| Mode | When to use it |
|---|---|
| repos | You already have specific GitHub URLs you want to scrape |
| search | You want to discover repos by keyword or language |
| trending | You want GitHub's trending repos right now |
Input by Mode
Mode: repos - Scrape specific repositories
{"mode":"repos","repoUrls":["https://github.com/facebook/react","https://github.com/vercel/next.js"],"maxResults":10,"includeReadme":false}
| Field | Required | Description |
|---|---|---|
| mode | required | Set to "repos" |
| repoUrls | required | List of GitHub repo URLs to scrape |
| maxResults | optional | Max repos to scrape (default: 10) |
| includeReadme | optional | Also fetch README content (default: false) |
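
Before filling in repoUrls, it can help to normalize and deduplicate the list so the same repository is not submitted twice. The following is a minimal sketch using only the Python standard library; the helper names `repo_full_name` and `clean_repo_urls` are hypothetical and not part of the Actor, and the "owner/name" form simply mirrors the fullName field shown in the output.

```python
from urllib.parse import urlparse


def repo_full_name(url: str) -> str:
    """Return "owner/name" for a GitHub repository URL, or raise ValueError."""
    parsed = urlparse(url.strip())
    host = parsed.netloc.lower()
    if parsed.scheme not in ("http", "https") or host not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url!r}")
    # Keep only non-empty path segments, so trailing slashes are harmless.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError(f"URL does not point at a repository: {url!r}")
    owner, name = parts[0], parts[1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return f"{owner}/{name}"


def clean_repo_urls(urls):
    """Deduplicate URLs by repository (case-insensitive), keeping first-seen order."""
    seen = set()
    cleaned = []
    for url in urls:
        full_name = repo_full_name(url)
        key = full_name.lower()
        if key not in seen:
            seen.add(key)
            cleaned.append(f"https://github.com/{full_name}")
    return cleaned
```

The result of `clean_repo_urls` can be used directly as the repoUrls value in the input JSON.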
Mode: search - Find repos by keyword
{"mode":"search","searchQuery":"machine learning","searchLanguage":"Python","searchSort":"stars","maxResults":50,"includeReadme":false}
| Field | Required | Description |
|---|---|---|
| mode | required | Set to "search" |
| searchQuery | required | Keywords to search (e.g. "web scraper") |
| searchLanguage | optional | Filter by language, e.g. "Python" or "JavaScript" |
| searchSort | optional | Sort by "stars", "forks", or "updated" (default: "stars") |
| maxResults | optional | Max repos to return, up to 300 (default: 10) |
| includeReadme | optional | Also fetch README content (default: false) |
Mode: trending - Get GitHub's trending repos
{"mode":"trending","trendingLanguage":"python","trendingPeriod":"weekly","maxResults":25,"includeReadme":false}
| Field | Required | Description |
|---|---|---|
| mode | required | Set to "trending" |
| trendingLanguage | optional | Filter by language, e.g. "python" or "rust"; leave empty for all languages |
| trendingPeriod | optional | "daily", "weekly", or "monthly" (default: "daily") |
| maxResults | optional | Max repos to return (default: 10) |
| includeReadme | optional | Also fetch README content (default: false) |
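
Because each mode accepts a different set of fields, it may be convenient to build the input with a small helper that rejects fields belonging to another mode. The sketch below encodes the three field tables above; `build_input` and the constants are hypothetical names, not part of the Actor. Note one deliberate difference: the Actor is described as ignoring fields from other modes, while this helper raises an error so that mistakes surface before a run is started.

```python
# Field names and allowed values are copied from the tables above.
# The helper itself is an illustrative assumption, not the Actor's own validation.
MODE_FIELDS = {
    "repos": {"required": {"repoUrls"}, "optional": set()},
    "search": {"required": {"searchQuery"}, "optional": {"searchLanguage", "searchSort"}},
    "trending": {"required": set(), "optional": {"trendingLanguage", "trendingPeriod"}},
}
SHARED_OPTIONAL = {"maxResults", "includeReadme"}
ALLOWED_VALUES = {
    "searchSort": {"stars", "forks", "updated"},
    "trendingPeriod": {"daily", "weekly", "monthly"},
}
SEARCH_MAX_RESULTS = 300  # documented upper bound for search mode


def build_input(mode, **fields):
    """Return an input dict for the given mode, or raise ValueError."""
    if mode not in MODE_FIELDS:
        raise ValueError(f"unknown mode: {mode!r}")
    spec = MODE_FIELDS[mode]
    allowed = spec["required"] | spec["optional"] | SHARED_OPTIONAL
    unknown = set(fields) - allowed
    if unknown:
        raise ValueError(f"fields not used by mode {mode!r}: {sorted(unknown)}")
    missing = spec["required"] - set(fields)
    if missing:
        raise ValueError(f"mode {mode!r} requires: {sorted(missing)}")
    for name, choices in ALLOWED_VALUES.items():
        if name in fields and fields[name] not in choices:
            raise ValueError(f"{name} must be one of {sorted(choices)}")
    if mode == "search" and fields.get("maxResults", 10) > SEARCH_MAX_RESULTS:
        raise ValueError(f"search mode returns at most {SEARCH_MAX_RESULTS} repos")
    return {"mode": mode, **fields}
```

For example, `build_input("search", searchQuery="machine learning", searchLanguage="Python", maxResults=50)` produces a dict that can be serialized with `json.dumps` and pasted into the Actor input.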
Output Fields
Each scraped repository returns:
{"url":"https://github.com/facebook/react","fullName":"facebook/react","owner":"facebook","name":"react","repoId":"10270250","description":"The library for web and native user interfaces.","website":"https://react.dev","topics":["react","javascript","library","ui","frontend"],"primaryLanguage":"JavaScript","languages":{"JavaScript":"68.1%","TypeScript":"29.0%"},"license":"MIT","stars":243937,"starsDisplay":"244k","forks":50761,"watchers":6700,"openIssues":809,"openPullRequests":355,"commits":21425,"contributors":1734,"totalReleases":118,"latestRelease":"19.2.4","defaultBranch":"main","lastCommitAt":"2026-01-26T18:29:43Z","scrapedAt":"2026-03-13T10:00:00.000Z"}
Set includeReadme to true to also get the readmeText and readmeHtml fields, which are useful for AI/LLM pipelines.
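
Some output fields arrive as strings, such as the language percentages and the ISO 8601 timestamps, so downstream code usually converts them first. The sketch below shows one way to do that with the standard library, using a subset of the sample record above; the `summarize` helper and its output keys are hypothetical.

```python
import json
from datetime import datetime

# A subset of the sample record shown above.
record = json.loads("""
{
  "fullName": "facebook/react",
  "languages": {"JavaScript": "68.1%", "TypeScript": "29.0%"},
  "lastCommitAt": "2026-01-26T18:29:43Z",
  "scrapedAt": "2026-03-13T10:00:00.000Z"
}
""")


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp with a trailing "Z" into an aware datetime."""
    # Older Python versions of fromisoformat do not accept "Z", so swap in an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(rec):
    """Convert string-typed fields into numbers and derive repo staleness in days."""
    shares = {lang: float(pct.rstrip("%")) for lang, pct in rec["languages"].items()}
    last_commit = parse_timestamp(rec["lastCommitAt"])
    scraped = parse_timestamp(rec["scrapedAt"])
    return {
        "fullName": rec["fullName"],
        "languageShares": shares,
        "daysSinceLastCommit": (scraped - last_commit).days,
    }


summary = summarize(record)
print(summary["daysSinceLastCommit"])  # 45 whole days between last commit and scrape
```

Measuring staleness against scrapedAt rather than the current time keeps the result reproducible when the dataset is analyzed later.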
Use Cases
- Market research: Track star growth and activity across competing repos
- Lead generation: Find active contributors in a technology stack
- AI training data: Bulk-collect repo descriptions, READMEs, and topics
- Investment research: Monitor open-source adoption signals
- Competitive intelligence: Benchmark your repo against competitors
Pricing
Pay per result: you only pay for repos successfully scraped.
| Volume | Cost |
|---|---|
| 10 repos | ~$0.02 |
| 100 repos | ~$0.20 |
| 1,000 repos | ~$2.00 |
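
The table works out to roughly $0.002 per repo. As a rough budgeting aid, the sketch below applies that rate to an arbitrary repo count; the rate is inferred from the approximate figures in the table and is an assumption, so the Actor's pricing page remains the authority on the actual charge. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Inferred from the table (~$2.00 per 1,000 repos); an assumption, not a quoted price.
ASSUMED_RATE_PER_REPO = Decimal("0.002")


def estimate_cost(repo_count: int) -> Decimal:
    """Return the approximate cost in USD, rounded to cents, for a number of scraped repos."""
    if repo_count < 0:
        raise ValueError("repo_count must be non-negative")
    return (ASSUMED_RATE_PER_REPO * repo_count).quantize(Decimal("0.01"))


for count in (10, 100, 1000):
    print(count, estimate_cost(count))
```

`Decimal` is used instead of `float` so the rounded amounts match the table exactly.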
Performance
- Uses Cheerio, so no heavy browser is needed and compute cost is very low
- Up to 3 concurrent requests
- ~50-100 repos/minute
- No proxies needed for normal volumes
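
The quoted throughput gives a simple way to estimate how long a run might take. This is a back-of-the-envelope sketch that assumes the ~50-100 repos/minute range holds for the whole run; `estimate_minutes` is a hypothetical helper.

```python
def estimate_minutes(repo_count, min_rate=50, max_rate=100):
    """Return (fastest, slowest) run time in minutes for the quoted throughput range."""
    if repo_count < 0:
        raise ValueError("repo_count must be non-negative")
    return (repo_count / max_rate, repo_count / min_rate)


fastest, slowest = estimate_minutes(300)
print(f"300 repos: about {fastest:.0f} to {slowest:.0f} minutes")
```

Under that assumption, a search run at the 300-repo maximum would take about 3 to 6 minutes.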
FAQ
Can I use all input fields at once?
No. Each mode uses its own fields. Set mode first, then fill in only the fields for that mode; other fields are ignored.
Does this require a GitHub account or API key?
No. It scrapes only public GitHub data, so no login is needed.
Can I scrape private repos?
No, public repos only.
Can I schedule this to run daily?
Yes. Use Apify's built-in scheduler with a cron expression.
Will I get blocked?
Unlikely for normal volumes. The Actor uses proper headers and rate limiting. For 1000+ repos, enable Apify proxy.
Built with Apify SDK + Crawlee. Issues or feature requests? Leave a comment on the Actor page.
