dotnet add package WebFlux --version 0.5.2
NuGet\Install-Package WebFlux -Version 0.5.2
<PackageReference Include="WebFlux" Version="0.5.2" />
<PackageVersion Include="WebFlux" Version="0.5.2" /> (in Directory.Packages.props)
<PackageReference Include="WebFlux" /> (in the project file)
paket add WebFlux --version 0.5.2
#r "nuget: WebFlux, 0.5.2"
#:package WebFlux@0.5.2
#addin nuget:?package=WebFlux&version=0.5.2 (install as a Cake Addin)
#tool nuget:?package=WebFlux&version=0.5.2 (install as a Cake Tool)
A .NET SDK for preprocessing web content for RAG (Retrieval-Augmented Generation) systems.
WebFlux processes web content into chunks optimized for RAG systems. It handles web crawling, content extraction, and intelligent chunking with support for multiple content formats.
dotnet add package WebFlux
using WebFlux;
using Microsoft.Extensions.DependencyInjection;
var services = new ServiceCollection();
// Register your AI service implementations
services.AddScoped<ITextEmbeddingService, YourEmbeddingService>();
services.AddScoped<ITextCompletionService, YourLLMService>(); // Optional
// Add WebFlux
services.AddWebFlux();
var provider = services.BuildServiceProvider();
var processor = provider.GetRequiredService<IWebContentProcessor>();
// Process a single URL
var chunks = await processor.ProcessUrlAsync("https://example.com");
foreach (var chunk in chunks)
{
    Console.WriteLine($"Chunk {chunk.ChunkIndex}: {chunk.Content}");
}
// Or stream a whole website
await foreach (var chunk in processor.ProcessWebsiteAsync("https://example.com"))
{
    Console.WriteLine($"Chunk {chunk.ChunkIndex}: {chunk.Content}");
}
| Strategy | Use Case |
|---|---|
| Auto | Automatically selects best strategy based on content |
| Smart | Structured HTML documentation |
| Semantic | General web pages and articles |
| Intelligent | Blogs and knowledge bases |
| MemoryOptimized | Large documents with memory constraints |
| Paragraph | Markdown with natural boundaries |
| FixedSize | Uniform chunks for testing |
| DomStructure | HTML DOM structure-based chunking preserving semantic boundaries |
WebFlux uses the Interface Provider pattern. You provide AI service implementations, and WebFlux handles crawling, extraction, and chunking.
Vector embedding generation for semantic chunking:
public interface ITextEmbeddingService
{
    Task<float[]> GetEmbeddingAsync(string text, CancellationToken cancellationToken = default);
    Task<IReadOnlyList<float[]>> GetEmbeddingsAsync(IReadOnlyList<string> texts, CancellationToken cancellationToken = default);
    int MaxTokens { get; }
    int EmbeddingDimension { get; }
}
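For local testing without a real embedding model, a stand-in implementation can be handy. The following is a minimal sketch, not part of WebFlux: `HashingEmbeddingService` is a hypothetical name, the `MaxTokens` and `EmbeddingDimension` values are arbitrary placeholders, and the interface is repeated only so the snippet compiles on its own. The vectors it produces are hashed bag-of-words counts with no semantic meaning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Copied from the interface shown above so the sketch compiles standalone.
// In a real project, delete this copy and use the one from the WebFlux package.
public interface ITextEmbeddingService
{
    Task<float[]> GetEmbeddingAsync(string text, CancellationToken cancellationToken = default);
    Task<IReadOnlyList<float[]>> GetEmbeddingsAsync(IReadOnlyList<string> texts, CancellationToken cancellationToken = default);
    int MaxTokens { get; }
    int EmbeddingDimension { get; }
}

// Hypothetical test double: hashes words into a fixed-size vector.
// It is NOT a real embedding model.
public sealed class HashingEmbeddingService : ITextEmbeddingService
{
    public int MaxTokens => 8192;        // arbitrary placeholder
    public int EmbeddingDimension => 64; // arbitrary placeholder

    public Task<float[]> GetEmbeddingAsync(string text, CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();
        return Task.FromResult(Embed(text));
    }

    public Task<IReadOnlyList<float[]>> GetEmbeddingsAsync(IReadOnlyList<string> texts, CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();
        IReadOnlyList<float[]> result = texts.Select(Embed).ToList();
        return Task.FromResult(result);
    }

    private float[] Embed(string text)
    {
        var vector = new float[EmbeddingDimension];
        var words = text.ToLowerInvariant()
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

        // Count each word in the bucket selected by its hash.
        foreach (var word in words)
        {
            vector[StableHash(word) % (uint)EmbeddingDimension] += 1f;
        }

        // L2-normalize so that dot products behave like cosine similarity.
        var norm = MathF.Sqrt(vector.Sum(v => v * v));
        if (norm > 0f)
        {
            for (var i = 0; i < vector.Length; i++) vector[i] /= norm;
        }
        return vector;
    }

    // FNV-1a hash; unlike string.GetHashCode, it is stable across processes.
    private static uint StableHash(string s)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (var c in s)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return hash;
        }
    }
}
```

Such a class could be registered in place of `YourEmbeddingService` while wiring up the pipeline, then swapped for a real provider later.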
LLM text completion for multimodal processing and content reconstruction:
public interface ITextCompletionService
{
    Task<string> CompleteAsync(string prompt, TextCompletionOptions? options = null, CancellationToken cancellationToken = default);
    IAsyncEnumerable<string> CompleteStreamAsync(string prompt, TextCompletionOptions? options = null, CancellationToken cancellationToken = default);
    Task<bool> IsAvailableAsync(CancellationToken cancellationToken = default);
}
Image-to-text conversion for multimodal content:
public interface IImageToTextService
{
    Task<string> ConvertImageToTextAsync(string imageUrl, ImageToTextOptions? options = null, CancellationToken cancellationToken = default);
    Task<string> ExtractTextFromImageAsync(string imageUrl, CancellationToken cancellationToken = default);
    Task<bool> IsAvailableAsync(CancellationToken cancellationToken = default);
}
The main entry point for all web content processing:
// Single URL processing
var chunks = await processor.ProcessUrlAsync("https://example.com");
// Website crawling (streaming)
await foreach (var chunk in processor.ProcessWebsiteAsync(url, crawlOptions, chunkOptions))
{
    // Process chunk
}
// Batch processing
var results = await processor.ProcessUrlsBatchAsync(urls, chunkOptions);
For consumers that only need extraction or chunking:
// Extraction only
var extractor = provider.GetRequiredService<IContentExtractService>();
var result = await extractor.ExtractContentAsync("https://example.com");
// Chunking only
var chunker = provider.GetRequiredService<IContentChunkService>();
var chunks = await chunker.ProcessUrlAsync("https://example.com");
Implement custom chunking strategies:
public interface IChunkingStrategy
{
    string Name { get; }
    string Description { get; }
    Task<IReadOnlyList<WebContentChunk>> ChunkAsync(ExtractedContent content, ChunkingOptions? options = null, CancellationToken cancellationToken = default);
}
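The splitting logic inside a custom strategy can be kept free of WebFlux types, which makes it easy to unit test. The sketch below is one possible core for a paragraph-style strategy. `ParagraphSplitter` and `maxChars` are hypothetical names, and this is not how the built-in Paragraph strategy is implemented. A `ChunkAsync` implementation could call it on the extracted text and wrap each string in a `WebContentChunk`; the members of `ExtractedContent` and `WebContentChunk` needed for that are not shown here, so that step is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: groups blank-line-separated paragraphs into chunks
// of at most maxChars characters where possible.
public static class ParagraphSplitter
{
    public static List<string> Split(string text, int maxChars)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));
        if (maxChars <= 0) throw new ArgumentOutOfRangeException(nameof(maxChars));

        // A paragraph boundary is a newline, optional whitespace, then another newline.
        var paragraphs = Regex.Split(text.Replace("\r\n", "\n"), @"\n\s*\n");

        var chunks = new List<string>();
        var current = new StringBuilder();

        foreach (var raw in paragraphs)
        {
            var paragraph = raw.Trim();
            if (paragraph.Length == 0) continue;

            // Flush the current chunk if adding this paragraph (plus the
            // two-character separator) would exceed the limit.
            if (current.Length > 0 && current.Length + 2 + paragraph.Length > maxChars)
            {
                chunks.Add(current.ToString());
                current.Clear();
            }

            if (current.Length > 0) current.Append("\n\n");
            current.Append(paragraph);
        }

        if (current.Length > 0) chunks.Add(current.ToString());
        return chunks;
    }
}
```

A single paragraph longer than `maxChars` is kept whole in this sketch rather than cut mid-sentence; a production strategy would probably need a fallback for that case.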
Subscribe to pipeline events for monitoring, metrics collection, and observability.
IEventPublisher is automatically registered as a singleton when you call AddWebFlux().
using WebFlux.Core.Interfaces;
using WebFlux.Core.Models.Events;
var publisher = provider.GetRequiredService<IEventPublisher>();
// Subscribe to specific event types
// "metrics" stands in for your own metrics sink; it is not defined in this snippet.
using var s1 = publisher.Subscribe<PageCrawledEvent>(async e =>
{
    Console.WriteLine($"Crawled {e.Url} [{e.StatusCode}] in {e.ProcessingTimeMs}ms");
    await metrics.RecordPageCrawl(e);
});
using var s2 = publisher.Subscribe<ChunkGeneratedEvent>(e =>
{
    Console.WriteLine($"Chunk #{e.SequenceNumber} ({e.ChunkSize} tokens) from {e.SourceUrl}");
});
using var s3 = publisher.Subscribe<ErrorOccurredEvent>(e =>
{
    Console.WriteLine($"[{e.ErrorCategory}] {e.ErrorCode}: {e.Message}");
});
// Or subscribe to ALL events
// "logger" is likewise a placeholder for your own logging service.
using var sAll = publisher.SubscribeAll(async e =>
{
    await logger.LogEventAsync(e.EventType, e);
});
Available event types (WebFlux.Core.Models.Events namespace):
| Category | Events |
|---|---|
| Pipeline | ProcessingStartedEvent, ProcessingProgressEvent, ProcessingCompletedEvent, ProcessingFailedEvent |
| Crawling | CrawlingStartedEvent, CrawlingCompletedEvent, PageCrawledEvent, UrlProcessingStartedEvent, UrlProcessedEvent, UrlProcessingFailedEvent |
| Extraction | ContentExtractionStartedEvent, ContentExtractionCompletedEvent, ContentExtractionFailedEvent, ImageProcessedEvent |
| Chunking | ChunkingStartedEvent, ChunkingCompletedEvent, ChunkGeneratedEvent |
| Monitoring | ErrorOccurredEvent, PerformanceMetricsEvent |
All events derive from ProcessingEvent (base class with EventId, EventType, Timestamp, Severity, CorrelationId).
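Event handlers are a natural place to accumulate simple counters. The sketch below shows one way to do that. `PipelineStats` is a hypothetical class, not part of WebFlux, and it uses `Interlocked` on the assumption that handlers may run concurrently, which the text above does not confirm.

```csharp
using System.Threading;

// Hypothetical metrics sink that is safe to call from concurrent handlers.
public sealed class PipelineStats
{
    private long _pages;
    private long _chunks;
    private long _errors;
    private long _totalCrawlMs;

    public void RecordPage(long processingTimeMs)
    {
        Interlocked.Increment(ref _pages);
        Interlocked.Add(ref _totalCrawlMs, processingTimeMs);
    }

    public void RecordChunk() => Interlocked.Increment(ref _chunks);

    public void RecordError() => Interlocked.Increment(ref _errors);

    public long Pages => Interlocked.Read(ref _pages);
    public long Chunks => Interlocked.Read(ref _chunks);
    public long Errors => Interlocked.Read(ref _errors);

    // Mean crawl time per page in milliseconds, or 0 when nothing was recorded.
    public double AverageCrawlMs
    {
        get
        {
            var pages = Pages;
            return pages == 0 ? 0.0 : (double)Interlocked.Read(ref _totalCrawlMs) / pages;
        }
    }
}
```

A `PageCrawledEvent` handler could pass `e.ProcessingTimeMs` to `RecordPage`, and the `ChunkGeneratedEvent` and `ErrorOccurredEvent` handlers could call `RecordChunk` and `RecordError`. The numeric type of `ProcessingTimeMs` is not stated here, so a cast may be needed.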
For detailed implementation examples, see the .
var options = new CrawlOptions
{
    MaxDepth = 3,
    MaxPages = 100,
    RespectRobotsTxt = true,
    UserAgent = "MyBot/1.0"
};
var chunkOptions = new ChunkingOptions
{
    Strategy = "Auto",
    MaxChunkSize = 512,
    OverlapSize = 64
};
await foreach (var chunk in processor.ProcessWebsiteAsync(url, options, chunkOptions))
{
    // Handle chunk
}
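To build intuition for how a maximum chunk size and an overlap interact, here is a minimal sliding-window sketch. It is an illustration only, not WebFlux's implementation. `OverlapChunker` is a hypothetical name, whitespace-separated words stand in for tokens, and whether `MaxChunkSize` and `OverlapSize` are measured in tokens or characters is not stated above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of fixed-size chunking with overlap.
public static class OverlapChunker
{
    public static List<string> Chunk(IReadOnlyList<string> tokens, int maxChunkSize, int overlapSize)
    {
        if (tokens is null) throw new ArgumentNullException(nameof(tokens));
        if (maxChunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunkSize));
        if (overlapSize < 0 || overlapSize >= maxChunkSize)
            throw new ArgumentOutOfRangeException(nameof(overlapSize));

        var chunks = new List<string>();

        // Each window starts (maxChunkSize - overlapSize) tokens after the previous
        // one, so consecutive windows share overlapSize tokens.
        var step = maxChunkSize - overlapSize;

        for (var start = 0; start < tokens.Count; start += step)
        {
            var length = Math.Min(maxChunkSize, tokens.Count - start);
            chunks.Add(string.Join(" ", tokens.Skip(start).Take(length)));

            // Stop once a window reaches the end, to avoid a trailing
            // chunk that contains only overlap.
            if (start + length >= tokens.Count) break;
        }

        return chunks;
    }
}
```

Under these assumptions, 1,000 tokens with a maximum of 512 and an overlap of 64 give a step of 448 and three windows, starting at tokens 0, 448, and 896.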
MIT License - see LICENSE file for details.
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed. |
Showing the top 4 NuGet packages that depend on WebFlux:
| Package | Description |
|---|---|
| FluxIndex.SDK | FluxIndex SDK - Complete RAG infrastructure with FileFlux integration, FluxCurator preprocessing, and FluxImprover quality enhancement. AI providers are externally injectable. |
| IronHive.DeepResearch | Deep Research module - autonomous research agent system |
| IronHive.Flux.Core | Core adapters bridging IronHive AI services to Flux ecosystem (FileFlux, WebFlux, FluxIndex) |
| IronHive.Flux.WebLookup | WebLookup → WebFlux → FluxIndex RAG pipeline for discovering, processing, and indexing web content |
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 0.5.2 | 1,069 | 5/15/2026 |
| 0.5.1 | 721 | 4/14/2026 |
| 0.4.3 | 493 | 3/20/2026 |
| 0.4.2 | 970 | 2/25/2026 |
| 0.4.1 | 327 | 2/19/2026 |
| 0.4.0 | 161 | 2/13/2026 |
| 0.3.0 | 169 | 2/8/2026 |
| 0.2.1 | 121 | 2/5/2026 |
| 0.2.0 | 127 | 2/5/2026 |
| 0.1.9 | 370 | 1/19/2026 |
| 0.1.8 | 685 | 12/11/2025 |
| 0.1.7 | 258 | 11/23/2025 |
| 0.1.6 | 326 | 11/14/2025 |
| 0.1.5 | 249 | 11/2/2025 |
| 0.1.4 | 241 | 10/31/2025 |
| 0.1.3 | 261 | 10/12/2025 |
| 0.1.2 | 252 | 10/2/2025 |
| 0.1.1 | 442 | 9/18/2025 |
| 0.1.0 | 345 | 9/17/2025 |