dotnet add package D4S.Indexer.Application --version 1.0.20
NuGet\Install-Package D4S.Indexer.Application -Version 1.0.20
<PackageReference Include="D4S.Indexer.Application" Version="1.0.20" />
Directory.Packages.props:
<PackageVersion Include="D4S.Indexer.Application" Version="1.0.20" />
Project file:
<PackageReference Include="D4S.Indexer.Application" />
paket add D4S.Indexer.Application --version 1.0.20
#r "nuget: D4S.Indexer.Application, 1.0.20"
#:package D4S.Indexer.Application@1.0.20
// Install as a Cake Addin
#addin nuget:?package=D4S.Indexer.Application&version=1.0.20
// Install as a Cake Tool
#tool nuget:?package=D4S.Indexer.Application&version=1.0.20
Document indexing library for Azure AI Search: extracts text, generates vector embeddings, and uploads searchable chunks.
var indexer = IndexerBuilder.Create("my-index")
    .WithAzureSearch(searchEndpoint, searchKey)
    .WithAzureOpenAI(aoaiEndpoint, aoaiKey, embeddingDeployment, embeddingDimensions)
    .WithLocalFiles("./documents")
    .WithFileMetadataFields()
    .Build();

var result = await indexer.IndexAsync();
See src/Rag/samples/ for working examples (local files, SharePoint, OCR, agentic retrieval).
| Package | Contents |
|---|---|
| D4S.Indexer.Domain | Entities, abstractions (interfaces) |
| D4S.Indexer.Application | Orchestration (DocumentIndexerService, DocumentExtractor) |
| D4S.Indexer.Infrastructure | Azure implementations, builder, processors, sources |
| Interface | Purpose |
|---|---|
| IDocumentSource | Enumerates documents from a data source |
| IDocumentProcessor | Extracts text/metadata from a document |
| IEmbeddingService | Generates vector embeddings |
| ISearchIndexService | Manages the index (CRUD on chunks) |
| ITextChunker | Splits text into chunks |
| IOcrService / IKeywordExtractor | OCR for scans / AI keyword extraction |
Built-in sources: local filesystem, multi-site SharePoint (PnP Core). Built-in processors: PDF, DOCX, XLSX, PPTX, TXT/Markdown.
Delta mode (.WithDeltaMode()): only changed, new, or deleted documents are provided. Deletion is driven by DocumentMetadata.DeletedDate (set it and pass null for GetContentAsync). There is no implicit cleanup.

Both modes compare LastModifiedDate against the index to skip unchanged documents.
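
The per-document decision described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: DecideAction is a hypothetical helper, the DateTimeOffset types are an assumption, and treating only a strictly newer timestamp as "changed" is also an assumption, since the README does not say how the comparison is made.

```csharp
using System;

// Hypothetical helper, not part of D4S.Indexer. Parameter names mirror the
// fields mentioned in the README; their types are assumed.
static string DecideAction(DateTimeOffset? deletedDate,
                           DateTimeOffset lastModified,
                           DateTimeOffset? indexedLastModified)
{
    // A set DeletedDate means the source reports the document as deleted.
    if (deletedDate is not null) return "delete";

    // Nothing in the index yet: the document is new.
    if (indexedLastModified is null) return "index";

    // Assumption: only a strictly newer timestamp counts as a change.
    return lastModified > indexedLastModified.Value ? "index" : "skip";
}

var baseline = new DateTimeOffset(2024, 1, 1, 0, 0, 0, TimeSpan.Zero);
Console.WriteLine(DecideAction(null, baseline, null));                // new document
Console.WriteLine(DecideAction(null, baseline, baseline));            // unchanged
Console.WriteLine(DecideAction(null, baseline.AddDays(1), baseline)); // modified
Console.WriteLine(DecideAction(baseline.AddDays(2), baseline, baseline)); // deleted
```

Checking the deletion flag first keeps a deleted document from being re-indexed just because its timestamp also moved.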
IndexerBuilder.Create("index-name")
    // Required
    .WithAzureSearch(endpoint, apiKey)
    .WithAzureOpenAI(endpoint, apiKey, deployment, dimensions)
    // Sources (at least one)
    .WithLocalFiles("./docs") // or: opts => { opts.Path = …; opts.FileExtensions = […]; }
    .WithSharePointMultiSite(spOptions, contextFactory)
    .WithCustomDocumentSource<T>(serviceProvider, serviceKey)
    // Optional
    .WithDeltaMode()
    .WithFileMetadataFields()
    .WithChunkSize(maxSize: 1000, overlap: 200)
    .WithBatchSize(50)
    .WithKeywordExtraction(gptDeployment, maxKeywords: 10)
    .WithAzureDocumentIntelligence(endpoint, apiKey) // OCR
    .WithCustomDocumentProcessor<T>(serviceProvider, serviceKey)
    .ContinueOnError(true)
    .Filter(meta => meta.Extension == ".pdf")
    .ConfigureMetadata(meta => meta with { CustomFields = … })
    .AddCustomField("Status", CustomFieldType.String, filterable: true)
    .AddIndexFieldsFromAttributes<MyModel>()
    .OnProgress(p => Console.WriteLine(p.Phase))
    .WithLogging()
    .Build();
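
To give a feel for what WithChunkSize(maxSize: 1000, overlap: 200) implies, here is a minimal sliding-window sketch. ChunkText is a hypothetical helper, not the library's ITextChunker. It counts characters, while the README does not say whether the built-in chunker counts characters or tokens, or whether it respects sentence boundaries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of D4S.Indexer: fixed-size character windows
// where each chunk starts `overlap` characters before the previous one ended.
static List<string> ChunkText(string text, int maxSize, int overlap)
{
    if (maxSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxSize));
    if (overlap < 0 || overlap >= maxSize) throw new ArgumentOutOfRangeException(nameof(overlap));

    var result = new List<string>();
    int step = maxSize - overlap; // distance the window advances per chunk

    for (int start = 0; start < text.Length; start += step)
    {
        int length = Math.Min(maxSize, text.Length - start);
        result.Add(text.Substring(start, length));

        // Stop once a window has reached the end of the text, so the tail
        // is not emitted again as a redundant shorter chunk.
        if (start + length >= text.Length) break;
    }
    return result;
}

var pieces = ChunkText(new string('a', 2500), maxSize: 1000, overlap: 200);
Console.WriteLine(pieces.Count); // 3 chunks: 1000, 1000 and 900 characters
```

The overlap repeats the tail of each chunk at the head of the next, so a passage that straddles a boundary still appears whole in at least one chunk, at the cost of embedding some text twice.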
| Product | Versions (compatible and additional computed target frameworks) |
|---|---|
| .NET | net10.0 is compatible. net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed. |
Showing the top 1 NuGet packages that depend on D4S.Indexer.Application:
| Package | Downloads |
|---|---|
| D4S.Indexer: D4S document indexer for Azure AI Search and RAG workflows. | |
This package is not used by any popular GitHub repositories.