dotnet add package EASYTools.GitHubCrawler --version 1.0.0
NuGet\Install-Package EASYTools.GitHubCrawler -Version 1.0.0
<PackageReference Include="EASYTools.GitHubCrawler" Version="1.0.0" />
Directory.Packages.props:
<PackageVersion Include="EASYTools.GitHubCrawler" Version="1.0.0" />
Project file:
<PackageReference Include="EASYTools.GitHubCrawler" />
paket add EASYTools.GitHubCrawler --version 1.0.0
#r "nuget: EASYTools.GitHubCrawler, 1.0.0"
#:package EASYTools.GitHubCrawler@1.0.0
Install as a Cake Addin:
#addin nuget:?package=EASYTools.GitHubCrawler&version=1.0.0
Install as a Cake Tool:
#tool nuget:?package=EASYTools.GitHubCrawler&version=1.0.0
GitHubCrawler is a lightweight C# library for recursively discovering and downloading files from GitHub repositories via the GitHub REST API v3. It provides simple asynchronous access to repository contents with support for cancellation, proper resource management, and modern .NET async streams.
- CancellationToken for graceful termination
- IDisposable for clean HttpClient disposal

dotnet add package EASYTools.GitHubCrawler
Or via Package Manager:
Install-Package EASYTools.GitHubCrawler
using GitHubCrawler;
using System;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        // Create crawler with optional GitHub token
        using var crawler = new GitHubRepoCrawler("your-github-token");

        // Enumerate all files in a repository
        using var cts = new CancellationTokenSource();
        await foreach (var url in crawler.GetRepositoryContentsAsync(
            "https://github.com/owner/repo",
            cts.Token))
        {
            Console.WriteLine(url);
        }

        // Download a specific file and print it as UTF-8 text
        var file = await crawler.GetFileContentsAsync(
            "https://raw.githubusercontent.com/owner/repo/main/file.txt",
            cts.Token);
        Console.WriteLine(Encoding.UTF8.GetString(file.Content));
    }
}
public GitHubRepoCrawler(string token = null)
Creates a new crawler instance. Supply a personal access token for:
public async IAsyncEnumerable<string> GetRepositoryContentsAsync(
string gitUrl,
CancellationToken cancellationToken = default)
Recursively discovers all file download URLs in a repository.
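The repository URL forms this method accepts (listed under Parameters) all reduce to an owner and a repository name. The following is a minimal sketch of that normalization, not the library's actual parser: `TryParseRepoUrl` is a hypothetical helper, and the regular expression is an assumption that covers only the three documented forms.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the library: extracts the owner and
// repository name from the HTTPS and SSH URL forms listed in this README.
static bool TryParseRepoUrl(string gitUrl, out string owner, out string repo)
{
    owner = repo = string.Empty;
    if (string.IsNullOrWhiteSpace(gitUrl)) return false;

    // Accepts https://github.com/owner/repo, the same with a .git suffix,
    // and git@github.com:owner/repo.git. Anything else is rejected.
    var match = Regex.Match(
        gitUrl.Trim(),
        @"^(?:https://github\.com/|git@github\.com:)([^/\s]+)/([^/\s]+?)(?:\.git)?/?$",
        RegexOptions.IgnoreCase);
    if (!match.Success) return false;

    owner = match.Groups[1].Value;
    repo = match.Groups[2].Value;
    return true;
}

// Usage: print the normalized owner/repo pair for each documented form.
foreach (var candidate in new[]
{
    "https://github.com/owner/repo",
    "https://github.com/owner/repo.git",
    "git@github.com:owner/repo.git",
})
{
    Console.WriteLine(TryParseRepoUrl(candidate, out var o, out var r)
        ? $"{candidate} -> {o}/{r}"
        : $"{candidate} -> invalid");
}
```

A check like this can run before constructing the crawler, so that malformed input is rejected without relying on the `ArgumentException` thrown during enumeration.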
Parameters:
- gitUrl: Repository URL (supports multiple formats):
  - https://github.com/owner/repo
  - https://github.com/owner/repo.git
  - git@github.com:owner/repo.git
- cancellationToken: Optional cancellation token

Returns: An async enumerable of raw file download URLs

Exceptions:
- ArgumentException: Invalid repository URL format
- ObjectDisposedException: Crawler has been disposed
- OperationCanceledException: Operation was cancelled
- Exception: API errors (rate limits, network issues, etc.)

public async Task<GitHubFileResponse> GetFileContentsAsync(
    string url,
    CancellationToken cancellationToken = default)
Downloads file content from a GitHub raw URL.
Parameters:
- url: Raw file URL (e.g., from GetRepositoryContentsAsync)
- cancellationToken: Optional cancellation token

Returns: GitHubFileResponse containing:
- byte[] Content: Raw file bytes
- string ContentType: MIME type
- HttpStatusCode StatusCode: HTTP response status
- Uri FinalUrl: Final URL after redirects
- Dictionary<string, IEnumerable<string>> Headers: Response headers

Exceptions:
- ArgumentException: URL is null or empty
- ObjectDisposedException: Crawler has been disposed
- OperationCanceledException: Operation was cancelled
- Exception: Download failed

The crawler implements IDisposable and should be used with a using statement:
using var crawler = new GitHubRepoCrawler(token);
// Use crawler...
// Automatically disposed when leaving scope
using var cts = new CancellationTokenSource();

// Cancel after 30 seconds
cts.CancelAfter(TimeSpan.FromSeconds(30));

// Or cancel on user input
Console.CancelKeyPress += (s, e) => {
    e.Cancel = true;
    cts.Cancel();
};

try
{
    await foreach (var url in crawler.GetRepositoryContentsAsync(gitUrl, cts.Token))
    {
        Console.WriteLine(url);
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine("Operation cancelled");
}
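// Get only C# source files
await foreach (var url in crawler.GetRepositoryContentsAsync(gitUrl))
{
    // Ordinal comparison avoids culture-sensitive matching on the extension
    if (url.EndsWith(".cs", StringComparison.OrdinalIgnoreCase))
    {
        var file = await crawler.GetFileContentsAsync(url);
        // Process C# file...
    }
}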
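When more than one extension is of interest, the suffix check can be pulled into a small predicate. The sketch below is illustrative only: `HasExtension` is a hypothetical helper, not part of the library, and it assumes the URLs yielded by `GetRepositoryContentsAsync` are absolute.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: true when the URL's path ends in one of the given
// extensions (each written with a leading dot, e.g. ".cs").
static bool HasExtension(string url, IReadOnlyCollection<string> extensions)
{
    // Parse as a URI so a query string or fragment does not hide the extension.
    if (!Uri.TryCreate(url, UriKind.Absolute, out var uri)) return false;

    var extension = Path.GetExtension(uri.AbsolutePath);
    foreach (var candidate in extensions)
    {
        if (string.Equals(extension, candidate, StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }
    }
    return false;
}

// Usage: keep C# sources and project files, skip everything else.
var wanted = new[] { ".cs", ".csproj" };
var sampleUrl = "https://raw.githubusercontent.com/owner/repo/main/src/App.cs";
Console.WriteLine($"{sampleUrl} wanted: {HasExtension(sampleUrl, wanted)}");
```

Inside the `await foreach` loop, the `url.EndsWith(...)` condition could then be replaced by `HasExtension(url, wanted)`.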
try
{
    await foreach (var url in crawler.GetRepositoryContentsAsync(gitUrl))
    {
        Console.WriteLine(url);
    }
}
catch (ArgumentException ex)
{
    Console.WriteLine($"Invalid URL: {ex.Message}");
}
catch (Exception ex) when (ex.Message.Contains("rate limit"))
{
    Console.WriteLine("GitHub API rate limit exceeded. Please authenticate or wait.");
}
catch (Exception ex)
{
    Console.WriteLine($"Error: {ex.Message}");
}
int fileCount = 0;
await foreach (var url in crawler.GetRepositoryContentsAsync(gitUrl))
{
    fileCount++;
    Console.Write($"\rDiscovered {fileCount} files...");
}
Console.WriteLine($"\nTotal files: {fileCount}");
using statements

| Authentication | Requests per Hour |
|---|---|
| None | 60 |
| Personal Access Token | 5,000 |
| GitHub App | 5,000-15,000 |
When rate limited, the API returns status code 403 with a "rate limit exceeded" message.
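GitHub's REST API reports when the current rate-limit window ends in the `x-ratelimit-reset` response header, as UTC epoch seconds. The sketch below turns that header into a wait time. It assumes the response headers are available in the same `Dictionary<string, IEnumerable<string>>` shape as `GitHubFileResponse.Headers`; whether the crawler surfaces headers from rate-limited API calls is not stated here, and `GetRateLimitDelay` is a hypothetical helper, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: how long to wait before retrying, derived from the
// x-ratelimit-reset header. Returns null when the header is absent or unparsable.
static TimeSpan? GetRateLimitDelay(
    IDictionary<string, IEnumerable<string>> headers,
    DateTimeOffset now)
{
    foreach (var header in headers)
    {
        // Header names are case-insensitive, so compare accordingly.
        if (!header.Key.Equals("x-ratelimit-reset", StringComparison.OrdinalIgnoreCase))
        {
            continue;
        }

        var raw = header.Value.FirstOrDefault();
        if (!long.TryParse(raw, out var epochSeconds)) return null;

        // Clamp to zero when the reset time has already passed.
        var delay = DateTimeOffset.FromUnixTimeSeconds(epochSeconds) - now;
        return delay > TimeSpan.Zero ? delay : TimeSpan.Zero;
    }
    return null; // header absent: the caller must choose its own back-off
}

// Usage with a hand-built header set; real values would come from a response.
var exampleHeaders = new Dictionary<string, IEnumerable<string>>
{
    ["X-RateLimit-Reset"] = new[] { "1700000060" },
};
var wait = GetRateLimitDelay(exampleHeaders, DateTimeOffset.FromUnixTimeSeconds(1700000000));
Console.WriteLine(wait.HasValue ? $"Retry in {wait.Value.TotalSeconds} s" : "No reset header");
```

Passing `now` as a parameter keeps the calculation deterministic, which makes the helper easy to unit test.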
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the file for details.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package is not used by any NuGet packages.
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.0 | 121 | 3/29/2026 |
Initial release