
URL: https://apify.com/crawlforge/github-repositry-scraper

Github Repositry Scraper · Apify


Pricing

from $0.01 / 1,000 results

Scrape GitHub repos by URL, search, or trending. Extract stars, forks, topics, languages, contributors & more. No login needed.

Rating

0.0

(0)

Developer

๐Ÿ‘ Amna Iftikhar

Amna Iftikhar

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 3
Monthly active users: 1
Last modified: 3 months ago

GitHub Repository Scraper

Extract comprehensive data from GitHub repositories by direct URL, keyword search, or trending. No login or API keys required.

Perfect for competitive analysis, lead generation, market research, AI training data, and developer tooling pipelines.


🚀 3 Modes: Pick One

⚠️ Each mode uses different input fields. Only fill in fields for the mode you choose.

| Mode | When to use it |
|---|---|
| repos | You already have specific GitHub URLs you want to scrape |
| search | You want to discover repos by keyword or language |
| trending | You want GitHub's trending repos right now |

โš™๏ธ Input by Mode

Mode: repos โ€” Scrape specific repositories

{
"mode":"repos",
"repoUrls":[
"https://github.com/facebook/react",
"https://github.com/vercel/next.js"
],
"maxResults":10,
"includeReadme":false
}
FieldRequiredDescription
modeโœ…Set to "repos"
repoUrlsโœ…List of GitHub repo URLs to scrape
maxResultsoptionalMax repos to scrape (default: 10)
includeReadmeoptionalAlso fetch README content (default: false)
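
One way to start a run programmatically is through the Apify API. The sketch below only builds the HTTP request for a repos-mode run and does not send it. It rests on several assumptions that are not stated on this page: the Actor ID `crawlforge~github-repositry-scraper` is inferred from the Store URL, the run goes through the Apify API v2 `run-sync-get-dataset-items` endpoint, and an Apify API token is available (this is separate from GitHub credentials, which the Actor does not need). `build_run_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: Actor ID inferred from the Store URL slug (username~actor-name).
ACTOR_ID = "crawlforge~github-repositry-scraper"
API_BASE = "https://api.apify.com/v2"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the Actor
    synchronously and asks for its dataset items in the response."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Input copied from the repos-mode example.
repos_input = {
    "mode": "repos",
    "repoUrls": [
        "https://github.com/facebook/react",
        "https://github.com/vercel/next.js",
    ],
    "maxResults": 10,
    "includeReadme": False,
}

request = build_run_request("<YOUR_APIFY_TOKEN>", repos_input)
print(request.get_method(), request.full_url)
```

With a valid token, passing `request` to `urllib.request.urlopen` should return the scraped items as a JSON array, but that behaviour is not verified here.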

Mode: search - Find repos by keyword

{
  "mode": "search",
  "searchQuery": "machine learning",
  "searchLanguage": "Python",
  "searchSort": "stars",
  "maxResults": 50,
  "includeReadme": false
}

| Field | Required | Description |
|---|---|---|
| mode | ✅ | Set to "search" |
| searchQuery | ✅ | Keywords to search (e.g. "web scraper") |
| searchLanguage | optional | Filter by language, e.g. "Python", "JavaScript" |
| searchSort | optional | Sort by "stars", "forks", or "updated" (default: "stars") |
| maxResults | optional | Max repos to return, up to 300 (default: 10) |
| includeReadme | optional | Also fetch README content (default: false) |
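
The constraints in the search-mode table (a required query, three allowed sort values, and a 300-result ceiling) can be checked before a run is started. The following is a minimal sketch of such a check; `build_search_input` is a hypothetical helper, and clamping `maxResults` to 300 on the client side is an assumption about a sensible way to handle larger values, not documented Actor behaviour.

```python
ALLOWED_SORTS = {"stars", "forks", "updated"}
MAX_SEARCH_RESULTS = 300  # upper bound stated in the search-mode field table


def build_search_input(query, language=None, sort="stars",
                       max_results=10, include_readme=False):
    """Return a search-mode input dict, validating the documented constraints."""
    if not query or not query.strip():
        raise ValueError("searchQuery is required in search mode")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"searchSort must be one of {sorted(ALLOWED_SORTS)}")
    if max_results < 1:
        raise ValueError("maxResults must be at least 1")

    actor_input = {
        "mode": "search",
        "searchQuery": query.strip(),
        "searchSort": sort,
        # Assumption: clamp locally instead of sending a value above the limit.
        "maxResults": min(max_results, MAX_SEARCH_RESULTS),
        "includeReadme": include_readme,
    }
    if language:  # optional filter, omitted when empty
        actor_input["searchLanguage"] = language
    return actor_input


search_input = build_search_input("machine learning", language="Python", max_results=500)
print(search_input)
```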

Mode: trending - Get GitHub's trending repos

{
  "mode": "trending",
  "trendingLanguage": "python",
  "trendingPeriod": "weekly",
  "maxResults": 25,
  "includeReadme": false
}

| Field | Required | Description |
|---|---|---|
| mode | ✅ | Set to "trending" |
| trendingLanguage | optional | Filter by language, e.g. "python", "rust"; leave empty for all |
| trendingPeriod | optional | "daily", "weekly", or "monthly" (default: "daily") |
| maxResults | optional | Max repos to return (default: 10) |
| includeReadme | optional | Also fetch README content (default: false) |
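
Because each mode reads only its own fields, it can help to strip unrelated fields from an input before submitting it, so the stored input reflects what the run actually uses. The sketch below groups the field names from the three tables by mode; `prune_input` and the constant names are hypothetical, and the Actor itself is only described as ignoring extra fields, not rejecting them.

```python
# Field names taken from the three input tables.
MODE_FIELDS = {
    "repos": {"repoUrls"},
    "search": {"searchQuery", "searchLanguage", "searchSort"},
    "trending": {"trendingLanguage", "trendingPeriod"},
}
SHARED_FIELDS = {"mode", "maxResults", "includeReadme"}
REQUIRED_FIELDS = {
    "repos": {"repoUrls"},
    "search": {"searchQuery"},
    "trending": set(),
}


def prune_input(raw: dict) -> dict:
    """Keep only the fields that belong to the chosen mode and
    fail early if a required field is missing or empty."""
    mode = raw.get("mode")
    if mode not in MODE_FIELDS:
        raise ValueError(f"mode must be one of {sorted(MODE_FIELDS)}")
    missing = [name for name in sorted(REQUIRED_FIELDS[mode]) if not raw.get(name)]
    if missing:
        raise ValueError(f"missing required field(s) for {mode} mode: {missing}")
    allowed = SHARED_FIELDS | MODE_FIELDS[mode]
    return {key: value for key, value in raw.items() if key in allowed}


mixed = {
    "mode": "trending",
    "trendingPeriod": "weekly",
    "searchQuery": "left over from an earlier run",
    "maxResults": 25,
}
print(prune_input(mixed))
```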

📦 Output Fields

Each scraped repository returns:

{
  "url": "https://github.com/facebook/react",
  "fullName": "facebook/react",
  "owner": "facebook",
  "name": "react",
  "repoId": "10270250",
  "description": "The library for web and native user interfaces.",
  "website": "https://react.dev",
  "topics": ["react", "javascript", "library", "ui", "frontend"],
  "primaryLanguage": "JavaScript",
  "languages": {"JavaScript": "68.1%", "TypeScript": "29.0%"},
  "license": "MIT",
  "stars": 243937,
  "starsDisplay": "244k",
  "forks": 50761,
  "watchers": 6700,
  "openIssues": 809,
  "openPullRequests": 355,
  "commits": 21425,
  "contributors": 1734,
  "totalReleases": 118,
  "latestRelease": "19.2.4",
  "defaultBranch": "main",
  "lastCommitAt": "2026-01-26T18:29:43Z",
  "scrapedAt": "2026-03-13T10:00:00.000Z"
}

Set includeReadme to true to also get the readmeText and readmeHtml fields, which are useful for AI/LLM pipelines.
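
Some output fields arrive as display strings (language shares such as "68.1%", ISO 8601 timestamps), so a small post-processing step is often useful before analysis. The sketch below works on a trimmed copy of the example record; the helper names are hypothetical, and the derived metrics (days since last commit, forks per star) are illustrations rather than fields the Actor returns.

```python
from datetime import datetime

# Trimmed copy of the example record; only the fields used below.
record = {
    "fullName": "facebook/react",
    "languages": {"JavaScript": "68.1%", "TypeScript": "29.0%"},
    "stars": 243937,
    "forks": 50761,
    "lastCommitAt": "2026-01-26T18:29:43Z",
    "scrapedAt": "2026-03-13T10:00:00.000Z",
}


def language_shares(rec: dict) -> dict:
    """Turn percentage strings such as "68.1%" into floats."""
    return {lang: float(pct.rstrip("%")) for lang, pct in rec.get("languages", {}).items()}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; the trailing "Z" is rewritten as an
    explicit UTC offset so older Python versions accept it."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def days_since_last_commit(rec: dict) -> int:
    """Whole days between the last commit and the time of scraping."""
    delta = parse_timestamp(rec["scrapedAt"]) - parse_timestamp(rec["lastCommitAt"])
    return delta.days


def forks_per_star(rec: dict) -> float:
    """Ratio of forks to stars, or 0.0 when the repo has no stars."""
    stars = rec.get("stars") or 0
    return rec.get("forks", 0) / stars if stars else 0.0


print(language_shares(record))
print(days_since_last_commit(record))
print(round(forks_per_star(record), 3))
```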


🎯 Use Cases

  • Market research: Track star growth and activity across competing repos
  • Lead generation: Find active contributors in a technology stack
  • AI training data: Bulk-collect repo descriptions, READMEs, and topics
  • Investment research: Monitor open-source adoption signals
  • Competitive intelligence: Benchmark your repo vs competitors

💰 Pricing

Pay Per Result: you only pay for repos successfully scraped.

| Volume | Cost |
|---|---|
| 10 repos | ~$0.02 |
| 100 repos | ~$0.20 |
| 1,000 repos | ~$2.00 |
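
All three rows of the volume table work out to the same rate of about $0.002 per repo ($2.00 / 1,000), so a rough budget can be computed directly. The sketch below assumes that flat rate holds at any volume, which the table suggests but does not guarantee; the costs are marked approximate in the source, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Assumption: flat per-repo rate implied by the volume table (~$2.00 per 1,000 repos).
PRICE_PER_REPO = Decimal("0.002")


def estimate_cost(repo_count: int) -> Decimal:
    """Approximate cost in USD for a number of successfully scraped repos,
    rounded to whole cents."""
    if repo_count < 0:
        raise ValueError("repo_count must not be negative")
    return (PRICE_PER_REPO * repo_count).quantize(Decimal("0.01"))


for count in (10, 100, 1000):
    print(count, estimate_cost(count))
```

`Decimal` is used instead of `float` so the cent values compare exactly with the figures in the table.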

⚡ Performance

  • Uses Cheerio: no heavy browser, very low compute cost
  • Up to 3 concurrent requests
  • ~50-100 repos/minute
  • No proxies needed for normal volumes
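
The quoted throughput of roughly 50 to 100 repos per minute gives a quick way to bound how long a run might take. This is a back-of-the-envelope sketch only: it assumes the throughput range applies uniformly across the whole run and ignores startup time and README fetching. `estimate_minutes` is a hypothetical helper.

```python
import math


def estimate_minutes(repo_count: int, slow_rate: int = 50, fast_rate: int = 100):
    """Return (best_case, worst_case) run time in whole minutes,
    using the quoted throughput range as repos per minute."""
    if repo_count < 0:
        raise ValueError("repo_count must not be negative")
    best = math.ceil(repo_count / fast_rate)
    worst = math.ceil(repo_count / slow_rate)
    return best, worst


best, worst = estimate_minutes(1000)
print(f"1,000 repos: roughly {best} to {worst} minutes")
```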

โ“ FAQ

Can I use all input fields at once? No. Each mode uses its own fields. Set mode first, then only fill in fields for that mode โ€” other fields are ignored.

Does this require a GitHub account or API key? No. Scrapes only public GitHub data, no login needed.

Can I scrape private repos? No โ€” public repos only.

Can I schedule this to run daily? Yes. Use Apify's built-in scheduler with a cron expression.

Will I get blocked? Unlikely for normal volumes. The Actor uses proper headers and rate limiting. For 1000+ repos, enable Apify proxy.


Built with Apify SDK + Crawlee. Issues or feature requests? Leave a comment on the Actor page.
