.NET CLI: dotnet add package WebLookup --version 0.2.1
Package Manager: NuGet\Install-Package WebLookup -Version 0.2.1
PackageReference: <PackageReference Include="WebLookup" Version="0.2.1" />
Central Package Management: <PackageVersion Include="WebLookup" Version="0.2.1" /> in Directory.Packages.props, plus <PackageReference Include="WebLookup" /> in the project file
Paket CLI: paket add WebLookup --version 0.2.1
Script & Interactive: #r "nuget: WebLookup, 0.2.1"
File-based apps: #:package WebLookup@0.2.1
Cake Addin: #addin nuget:?package=WebLookup&version=0.2.1
Cake Tool: #tool nuget:?package=WebLookup&version=0.2.1
A lightweight .NET library for fast URL discovery across multiple search providers with built-in rate limiting, automatic fallback, and site exploration.
WebLookup is a URL search engine, not a content parser. It collects URLs and metadata (title, description) from search APIs and sitemaps, then hands them off to your crawler or parser of choice.
- Site exploration: robots.txt rules and sitemap.xml hierarchies
- Dependencies: System.Net.Http and System.Text.Json; optional DI integration
- Microsoft.Extensions.DependencyInjection support

dotnet add package WebLookup
using WebLookup;
// DuckDuckGo requires no API key
var provider = new DuckDuckGoSearchProvider();
var results = await provider.SearchAsync("dotnet web search", count: 5);
using WebLookup;
var client = new WebSearchClient(
new DuckDuckGoSearchProvider(), // No API key needed
new GoogleSearchProvider(new() { Engines = [new() { ApiKey = "...", Cx = "..." }] }),
new MojeekSearchProvider(new() { ApiKey = "..." }),
new SearchApiProvider(new() { ApiKey = "..." }),
new TavilySearchProvider(new() { ApiKey = "..." })
);
var results = await client.SearchAsync("dotnet web search library");
foreach (var item in results)
{
Console.WriteLine($"[{item.Provider}] {item.Title}");
Console.WriteLine($" {item.Url}");
Console.WriteLine($" {item.Description}");
}
Results are deduplicated by URL. Providers run in parallel. If one provider hits a rate limit, results from others are still returned.
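The parallel fan-out with per-provider fault tolerance described above can be sketched in isolation. This is a minimal illustration, not WebLookup's actual implementation: the delegates stand in for real providers, and `SafeAsync` is a hypothetical helper name introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: turn a failed search into an empty result so that
// one provider's failure does not fault the combined task.
static async Task<IReadOnlyList<string>> SafeAsync(Func<Task<IReadOnlyList<string>>> search)
{
    try { return await search(); }
    catch (Exception) { return Array.Empty<string>(); } // e.g. a rate-limit error
}

// Stand-ins for real providers: each delegate returns URLs or throws.
var providers = new List<Func<Task<IReadOnlyList<string>>>>
{
    () => Task.FromResult<IReadOnlyList<string>>(new[] { "https://example.com/a" }),
    () => Task.FromException<IReadOnlyList<string>>(new InvalidOperationException("rate limited")),
    () => Task.FromResult<IReadOnlyList<string>>(new[] { "https://example.com/c" }),
};

// Start every search first, then await them together.
var batches = await Task.WhenAll(providers.Select(p => SafeAsync(p)));
var merged = batches.SelectMany(batch => batch).ToList();

Console.WriteLine(string.Join(", ", merged));
```

Swallowing every exception keeps the sketch short; a real client would likely log the failure or catch a narrower exception type.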
SearchAsync returns IReadOnlyList<SearchResult>:
[DuckDuckGo] Apache Lucene.NET is a powerful open source .NET search library
https://lucenenet.apache.org/
Apache Lucene.Net is a .NET full-text search engine framework...
[DuckDuckGo] GitHub - apache/lucenenet: Apache Lucene.NET
https://github.com/apache/lucenenet
Apache Lucene.Net is a high performance search library for .NET...
[Google] WebLookup - NuGet Gallery
https://www.nuget.org/packages/WebLookup
A lightweight .NET library for fast URL discovery...
[Tavily] Azure Cognitive Search Documentation
https://learn.microsoft.com/en-us/azure/search/
Cloud search service with built-in AI capabilities...
Each SearchResult contains:
| Field | Type | Description |
|---|---|---|
| `Url` | `string` | Deduplicated absolute URL |
| `Title` | `string` | Page title |
| `Description` | `string?` | Snippet or summary (may be null) |
| `Provider` | `string?` | Source provider name ("DuckDuckGo", "Google", "Tavily", etc.) |
When using WebSearchClient with multiple providers, results are deduplicated by URL (case-insensitive, fragments and trailing slashes removed). The first provider to return a URL wins.
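One way to picture the deduplication rule is a normalization step followed by a first-wins set check. The sketch below is an assumption about how such a rule could be implemented, not the library's code; `NormalizeUrl` is a hypothetical name, and the sample hits are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical normalizer: drop the fragment, then trim trailing slashes.
// Case-insensitivity is handled by the comparer on the set below.
static string NormalizeUrl(string url)
{
    var uri = new Uri(url);
    // GetLeftPart(UriPartial.Query) keeps scheme, host, path, and query,
    // and leaves out the fragment.
    return uri.GetLeftPart(UriPartial.Query).TrimEnd('/');
}

var hits = new (string Provider, string Url)[]
{
    ("DuckDuckGo", "https://lucenenet.apache.org/"),
    ("Google", "https://LuceneNet.apache.org#docs"),
    ("Tavily", "https://github.com/apache/lucenenet"),
};

var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var unique = new List<(string Provider, string Url)>();
foreach (var hit in hits)
{
    // HashSet.Add returns false for a key already present, so the first
    // provider to report a URL keeps it.
    if (seen.Add(NormalizeUrl(hit.Url)))
    {
        unique.Add(hit);
    }
}

foreach (var hit in unique)
{
    Console.WriteLine($"[{hit.Provider}] {hit.Url}");
}
```

Here the Google hit collapses into the DuckDuckGo one, because both normalize to the same key once the fragment and trailing slash are gone.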
// Google Custom Search
var google = new GoogleSearchProvider(new()
{
Engines = [new() { ApiKey = "YOUR_API_KEY", Cx = "YOUR_CX" }]
});
var googleResults = await google.SearchAsync("query", count: 5);
// Tavily
var tavily = new TavilySearchProvider(new() { ApiKey = "YOUR_API_KEY" });
var tavilyResults = await tavily.SearchAsync("query", count: 5);
var explorer = new SiteExplorer();
// Read robots.txt
var robots = await explorer.GetRobotsAsync(new Uri("https://example.com"));
Console.WriteLine($"Crawl-Delay: {robots.CrawlDelay}");
Console.WriteLine($"Sitemaps: {string.Join(", ", robots.Sitemaps)}");
foreach (var rule in robots.Rules)
{
Console.WriteLine($"[{rule.UserAgent}] {rule.Type}: {rule.Path}");
}
// Read sitemap
var entries = await explorer.GetSitemapAsync(new Uri("https://example.com/sitemap.xml"));
foreach (var entry in entries)
{
Console.WriteLine($"{entry.Url} (modified: {entry.LastModified}, priority: {entry.Priority})");
}
// Stream large sitemaps
await foreach (var entry in explorer.StreamSitemapAsync(new Uri("https://example.com/sitemap.xml")))
{
Console.WriteLine(entry.Url);
}
var robots = await explorer.GetRobotsAsync(new Uri("https://example.com"));
// Check if a path is allowed for your bot
bool allowed = robots.IsAllowed("/admin/page", userAgent: "MyBot");
| Provider | Class | Auth | API Docs |
|---|---|---|---|
| DuckDuckGo | DuckDuckGoSearchProvider | None | HTML Lite |
| Google | GoogleSearchProvider | API Key + CX | Custom Search JSON API |
| Mojeek | MojeekSearchProvider | API Key | Mojeek Search API |
| SearchApi | SearchApiProvider | API Key (Bearer) | SearchApi |
| Tavily | TavilySearchProvider | API Key | Tavily |
Each provider handles rate limits automatically via a built-in RateLimitHandler:
- Respects Retry-After headers

services.AddWebLookup(options =>
{
options.AddDuckDuckGo(); // No API key needed
options.AddGoogle(g =>
{
g.AddEngine(config["Google:ApiKey"], config["Google:Cx"]);
});
options.AddMojeek(config["Mojeek:ApiKey"]);
options.AddSearchApi(config["SearchApi:ApiKey"]);
options.AddTavily(config["Tavily:ApiKey"]);
});
// Inject wherever needed
public class MyService(WebSearchClient search, SiteExplorer explorer) { }
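For the Retry-After handling mentioned in the rate-limit notes, the sketch below shows how a wait time could be derived from an HTTP response with the standard `System.Net.Http` header types. It does not show how `RateLimitHandler` works internally; `GetRetryDelay` and its fallback parameter are hypothetical names introduced for illustration.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper: Retry-After carries either a delay in seconds or an
// absolute HTTP date. When the header is missing, use the caller's fallback.
static TimeSpan GetRetryDelay(HttpResponseMessage response, DateTimeOffset now, TimeSpan fallback)
{
    RetryConditionHeaderValue? retryAfter = response.Headers.RetryAfter;
    if (retryAfter?.Delta is TimeSpan delta)
    {
        return delta;
    }
    if (retryAfter?.Date is DateTimeOffset date)
    {
        var wait = date - now;
        return wait > TimeSpan.Zero ? wait : TimeSpan.Zero; // never wait a negative time
    }
    return fallback;
}

// Simulated 429 response carrying "Retry-After: 30".
var limited = new HttpResponseMessage(HttpStatusCode.TooManyRequests);
limited.Headers.RetryAfter = new RetryConditionHeaderValue(TimeSpan.FromSeconds(30));

var delay = GetRetryDelay(limited, DateTimeOffset.UtcNow, TimeSpan.FromSeconds(1));
Console.WriteLine($"Retry in {delay.TotalSeconds} seconds");
```

Passing `now` in as a parameter keeps the date branch deterministic under test.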
public record SearchResult
{
public required string Url { get; init; }
public required string Title { get; init; }
public string? Description { get; init; }
public string? Provider { get; init; }
}
public interface ISearchProvider
{
string Name { get; }
Task<IReadOnlyList<SearchResult>> SearchAsync(
string query, int count = 10,
CancellationToken cancellationToken = default);
}
public record RobotsInfo
{
public IReadOnlyList<RobotsRule> Rules { get; init; }
public IReadOnlyList<string> Sitemaps { get; init; }
public TimeSpan? CrawlDelay { get; init; }
public bool IsAllowed(string path, string userAgent = "*");
}
public record SitemapEntry
{
public required string Url { get; init; }
public DateTimeOffset? LastModified { get; init; }
public string? ChangeFrequency { get; init; }
public double? Priority { get; init; }
}
Dependencies: Microsoft.Extensions.DependencyInjection.Abstractions (for DI integration only)
License: MIT
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed. |
Showing the top 2 NuGet packages that depend on WebLookup:
| Package | Description |
|---|---|
| IronHive.Cli.Core | IronHive CLI Core - Agent loop, tools, session management, and provider integrations for building AI-powered CLI tools |
| IronHive.Flux.WebLookup | WebLookup → WebFlux → FluxIndex RAG pipeline for discovering, processing, and indexing web content |
This package is not used by any popular GitHub repositories.