dotnet add package Mythosia.AI.Rag --version 7.5.0
NuGet\Install-Package Mythosia.AI.Rag -Version 7.5.0
<PackageReference Include="Mythosia.AI.Rag" Version="7.5.0" />
<PackageVersion Include="Mythosia.AI.Rag" Version="7.5.0" /> (Directory.Packages.props)
<PackageReference Include="Mythosia.AI.Rag" /> (project file)
paket add Mythosia.AI.Rag --version 7.5.0
#r "nuget: Mythosia.AI.Rag, 7.5.0"
#:package Mythosia.AI.Rag@7.5.0
#addin nuget:?package=Mythosia.AI.Rag&version=7.5.0 (install as a Cake Addin)
#tool nuget:?package=Mythosia.AI.Rag&version=7.5.0 (install as a Cake Tool)
Mythosia.AI.Rag provides RAG (Retrieval-Augmented Generation) as an optional extension for Mythosia.AI.
Install this package to add .WithRag() to any IAIService — no changes to the AI core required.
Abstractions Compatibility: Implements Mythosia.AI.Rag.Abstractions v6.x
dotnet add package Mythosia.AI.Rag
using Mythosia.AI.Rag;
var service = new AnthropicService(apiKey, httpClient)
.WithRag(rag => rag
.AddDocument("manual.txt")
.AddDocument("policy.txt")
);
var response = await service.GetCompletionAsync("What is the refund policy?");
That's it. Documents are automatically loaded, chunked, embedded, and indexed on the first query (lazy initialization).
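The lazy initialization described above can be pictured with the standard `Lazy<Task<T>>` pattern. This is only a sketch of the general idea, not the library's actual implementation; `BuildIndexAsync` and `QueryAsync` here are hypothetical stand-ins for the load/chunk/embed/index step and the query call.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the expensive load/chunk/embed/index step.
int buildCount = 0;
async Task<string[]> BuildIndexAsync()
{
    buildCount++;
    await Task.Delay(10); // simulate I/O and embedding work
    return new[] { "chunk-1", "chunk-2" };
}

// Lazy<Task<T>> defers the build until the first query and caches the task,
// so later queries await the same completed result.
var index = new Lazy<Task<string[]>>(BuildIndexAsync);

async Task<int> QueryAsync(string question)
{
    string[] chunks = await index.Value; // first call triggers the build
    return chunks.Length;
}

Console.WriteLine(buildCount);            // 0: nothing built yet
await QueryAsync("What is the refund policy?");
await QueryAsync("How long does shipping take?");
Console.WriteLine(buildCount);            // 1: built once, reused afterwards
```

The practical consequence is that the first query pays the indexing cost, while subsequent queries reuse the cached index.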
.WithRag(rag => rag
// Single file
.AddDocument("docs/manual.txt")
// All files in a directory (recursive)
.AddDocuments("./knowledge-base/")
// Per-extension routing in a directory
.AddDocuments("./knowledge-base/", src => src
.WithExtension(".pdf")
.WithLoader(new PdfDocumentLoader())
.WithTextSplitter(new CharacterTextSplitter(800, 80))
)
.AddDocuments("./knowledge-base/", src => src
.WithExtension(".docx")
.WithLoader(new WordDocumentLoader())
.WithTextSplitter(new TokenTextSplitter(600, 60))
)
// Inline text
.AddText("Product price is $99.", id: "price-info")
// URL (fetched via HTTP GET)
.AddUrl("https://example.com/faq.txt")
// Custom loader
.AddDocuments(new MyPdfLoader(), "docs/manual.pdf")
)
.WithRag(rag => rag
.AddDocument("docs.txt")
.WithTopK(5) // Number of results to retrieve (default: 3)
.WithChunkSize(500) // Characters per chunk (default: 300)
.WithChunkOverlap(50) // Overlap between chunks (default: 30)
.WithScoreThreshold(0.5) // Minimum similarity score (default: none)
)
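To make the interaction between chunk size and overlap concrete, here is a minimal character-based splitter. It is a sketch under the assumption that chunks are plain sliding windows; the library's splitters may also respect separators or token boundaries. `SplitWithOverlap` is a hypothetical helper, not part of the package.

```csharp
using System;
using System.Collections.Generic;

// Split text into fixed-size character windows; each window starts
// (chunkSize - overlap) characters after the previous one.
List<string> SplitWithOverlap(string text, int chunkSize, int overlap)
{
    if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
    if (overlap < 0 || overlap >= chunkSize) throw new ArgumentOutOfRangeException(nameof(overlap));

    var chunks = new List<string>();
    int step = chunkSize - overlap;
    for (int start = 0; start < text.Length; start += step)
    {
        int length = Math.Min(chunkSize, text.Length - start);
        chunks.Add(text.Substring(start, length));
        if (start + length >= text.Length) break; // last window reached the end
    }
    return chunks;
}

var result = SplitWithOverlap(new string('a', 1200), chunkSize: 500, overlap: 50);
Console.WriteLine(result.Count); // 3 chunks: [0,500), [450,950), [900,1200)
```

A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed and store.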
Combine dense vector similarity with BM25 keyword matching using Reciprocal Rank Fusion (RRF). Documents that rank highly in both keyword and semantic search are boosted to the top.
For stores that support native hybrid storage/search, the recommended model is:
- SearchAsync for vector-only retrieval
- HybridSearchAsync for hybrid retrieval

If a store does not support native hybrid retrieval, the RAG layer falls back to application-level fusion automatically.
.WithRag(rag => rag
.AddDocument("docs.txt")
.UseHybridSearch() // Enable hybrid search (default weight: 0.5)
)
Adjust the balance between vector and keyword search:
.UseHybridSearch(vectorWeight: 0.7f) // 70% vector, 30% keyword
| Store Type | Behavior |
|---|---|
| InMemoryVectorStore | Application-level BM25 index + vector search, merged via RRF |
| PostgresStore | Native parallel tsvector full-text + pgvector similarity, merged via RRF |
| QdrantStore | Native sparse-dense prefetch + Qdrant's built-in RRF fusion |
| PineconeStore | Native dense + sparse server-side fusion on dotproduct indexes |
The strategy is selected automatically based on the store — no configuration needed.
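For intuition, application-level fusion can be sketched as weighted Reciprocal Rank Fusion over two ranked lists. The constant `k = 60` is the value commonly used with RRF, and applying `vectorWeight` as a linear weight on each list's contribution is an assumption; the package's exact formula is not documented here. `FuseRrf` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fuse two ranked ID lists with weighted Reciprocal Rank Fusion.
// k = 60 is the constant commonly used with RRF; the weighting scheme is an assumption.
List<(string Id, double Score)> FuseRrf(
    IReadOnlyList<string> vectorRanked,
    IReadOnlyList<string> keywordRanked,
    double vectorWeight = 0.5,
    int k = 60)
{
    var scores = new Dictionary<string, double>();

    void Accumulate(IReadOnlyList<string> ranked, double weight)
    {
        for (int i = 0; i < ranked.Count; i++)
        {
            double contribution = weight / (k + i + 1); // ranks are 1-based
            scores.TryGetValue(ranked[i], out double existing);
            scores[ranked[i]] = existing + contribution;
        }
    }

    Accumulate(vectorRanked, vectorWeight);
    Accumulate(keywordRanked, 1.0 - vectorWeight);

    return scores
        .OrderByDescending(p => p.Value)
        .ThenBy(p => p.Key, StringComparer.Ordinal) // deterministic tie-break
        .Select(p => (Id: p.Key, Score: p.Value))
        .ToList();
}

var fused = FuseRrf(
    vectorRanked: new[] { "doc-a", "doc-b", "doc-c" },
    keywordRanked: new[] { "doc-b", "doc-d", "doc-a" });

foreach (var (id, score) in fused)
    Console.WriteLine($"{id}: {score:F5}");
```

With equal weights, `doc-b` (ranks 2 and 1) edges out `doc-a` (ranks 1 and 3), and both outrank documents that appear in only one list.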
To revert to pure vector search:
.UseVectorSearch() // Explicit pure vector mode (same as default)
Re-rank search results after retrieval for improved relevance. Works with both pure vector and hybrid search.
When a reranker is configured, the pipeline automatically fetches a wider candidate pool (TopK × TopKMultiplier) and then the reranker selects the best TopK results. This ensures the reranker has enough diversity to work with.
// Default: retrieves TopK × 3 candidates, reranks down to TopK
.WithRag(rag => rag
.AddDocument("docs.txt")
.WithReranker(new CohereReranker(cohereApiKey))
)
// Custom multiplier via RagStore.UpdateOptions
store.UpdateOptions(opt => opt.DefaultQuery.RetrievalDerivation.TopKMultiplier = 5);
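The over-fetch-then-rerank flow can be sketched independently of any particular reranker. The code below assumes the candidate pool is simply the top `topK * topKMultiplier` retrieval hits; `RetrieveAndRerank` and the scoring function are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Retrieve topK * multiplier candidates, then let a reranker pick the final topK.
List<string> RetrieveAndRerank(
    IEnumerable<(string Id, double RetrievalScore)> hits,
    Func<string, double> rerankScore, // hypothetical reranker scoring function
    int topK,
    int topKMultiplier = 3)
{
    int poolSize = topK * topKMultiplier;

    return hits
        .OrderByDescending(c => c.RetrievalScore)
        .Take(poolSize)                             // wider candidate pool
        .OrderByDescending(c => rerankScore(c.Id))
        .Take(topK)                                 // reranker selects the final TopK
        .Select(c => c.Id)
        .ToList();
}

var corpus = new List<(string Id, double RetrievalScore)>
{
    ("d1", 0.9), ("d2", 0.8), ("d3", 0.7), ("d4", 0.6),
    ("d5", 0.5), ("d6", 0.4), ("d7", 0.3), ("d8", 0.2),
};
var rerankScores = new Dictionary<string, double> { ["d5"] = 1.0, ["d7"] = 0.99, ["d1"] = 0.5 };
double Score(string id) => rerankScores.TryGetValue(id, out var s) ? s : 0.1;

var selected = RetrieveAndRerank(corpus, Score, topK: 2);
Console.WriteLine(string.Join(",", selected)); // d5,d1
```

Note that `d7` never reaches the reranker because it falls outside the pool of 6 candidates, which is the trade-off a larger multiplier addresses.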
using Mythosia.AI.Rag.Reranking;
.WithRag(rag => rag
.AddDocument("docs.txt")
.WithReranker(new CohereReranker(cohereApiKey))
)
Use any existing AIService to score and reorder results:
using Mythosia.AI.Rag.Reranking;
var scorer = new OpenAIService(apiKey, httpClient, AIModel.OpenAI_Gpt4oMini);
.WithRag(rag => rag
.AddDocument("docs.txt")
.WithReranker(new LlmReranker(scorer))
)
Use a vLLM-served reranker model (e.g., Qwen3-Reranker):
using Mythosia.AI.Rag.Reranking;
.WithRag(rag => rag
.AddDocument("docs.txt")
.WithReranker(new VllmReranker(
model: "Qwen/Qwen3-Reranker-0.6B",
baseUrl: "http://localhost:8003"))
)
By default, the pipeline trusts the reranker's scores for final result selection (RerankerOnly). Use WithFinalSelectionPolicy to blend retrieval and reranker scores instead:
.WithRag(rag => rag
.AddDocument("docs.txt")
.WithReranker(new CohereReranker(cohereApiKey))
.WithFinalSelectionPolicy(RagFinalSelectionMode.WeightedBlend, retrievalWeight: 0.65)
)
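The difference between the two policies can be illustrated with a simple linear blend. This assumes the blended score is `w * retrieval + (1 - w) * reranker` with both scores on a comparable 0 to 1 scale; the package's actual normalization and formula may differ, and `Blend` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed blend: final = w * retrieval + (1 - w) * reranker, both scores in [0, 1].
double Blend(double retrievalScore, double rerankerScore, double retrievalWeight) =>
    retrievalWeight * retrievalScore + (1.0 - retrievalWeight) * rerankerScore;

var candidates = new List<(string Id, double Retrieval, double Reranker)>
{
    ("doc-a", 0.90, 0.40),
    ("doc-b", 0.55, 0.95),
};

// Reranker-only selection trusts the reranker score alone.
var rerankerOnly = candidates.OrderByDescending(c => c.Reranker).First().Id;

// Weighted blend lets a strong retrieval score outweigh a weaker reranker score.
var blended = candidates
    .OrderByDescending(c => Blend(c.Retrieval, c.Reranker, retrievalWeight: 0.65))
    .First().Id;

Console.WriteLine(rerankerOnly); // doc-b
Console.WriteLine(blended);      // doc-a (0.725 vs 0.690)
```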
.WithRag(rag => rag
.AddDocument("docs.txt")
.UseHybridSearch(vectorWeight: 0.6f)
.WithReranker(new CohereReranker(cohereApiKey))
)
// Local feature-hashing (default, no API key required)
.UseLocalEmbedding(dimensions: 1024)
// OpenAI embedding API
.UseOpenAIEmbedding(apiKey, model: "text-embedding-3-small", dimensions: 1536)
// vLLM-served embedding model
.UseEmbedding(new VllmEmbeddingProvider(
httpClient,
model: "Qwen/Qwen3-Embedding-0.6B",
dimensions: 1024,
baseUrl: "http://localhost:8002"))
// Custom provider
.UseEmbedding(new MyCustomEmbeddingProvider())
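For readers unfamiliar with feature hashing, the sketch below shows the general technique: tokens are hashed into a fixed number of buckets and the vector is L2-normalized. It is an illustration only and makes no claim about how the package's local embedding tokenizes or hashes; `Fnv1a` and `HashEmbed` are hypothetical helpers.

```csharp
using System;
using System.Linq;

// Stable 32-bit FNV-1a hash (string.GetHashCode is randomized per process).
uint Fnv1a(string token)
{
    uint hash = 2166136261;
    foreach (char c in token)
    {
        hash ^= c;
        hash *= 16777619;
    }
    return hash;
}

// Hash each token into one of `dimensions` buckets, then L2-normalize.
float[] HashEmbed(string text, int dimensions)
{
    var vector = new float[dimensions];
    var tokens = text.ToLowerInvariant()
        .Split(new[] { ' ', '\t', '\n', '\r', '.', ',', '?', '!' },
               StringSplitOptions.RemoveEmptyEntries);

    foreach (string token in tokens)
    {
        uint h = Fnv1a(token);
        int bucket = (int)(h % (uint)dimensions);
        float sign = (h & 0x80000000) == 0 ? 1f : -1f; // signed hashing reduces collision bias
        vector[bucket] += sign;
    }

    double norm = Math.Sqrt(vector.Sum(v => (double)v * v));
    if (norm > 0)
        for (int i = 0; i < vector.Length; i++) vector[i] = (float)(vector[i] / norm);
    return vector;
}

float[] a = HashEmbed("What is the refund policy?", 1024);
float[] b = HashEmbed("what is the REFUND policy", 1024);
Console.WriteLine(a.Length);           // 1024
Console.WriteLine(a.SequenceEqual(b)); // True: same tokens after lowercasing and splitting
```

Feature hashing needs no model or API key, but it captures only lexical overlap, so a learned embedding model generally retrieves better for paraphrased queries.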
// In-memory (default, data lost on process exit)
.UseInMemoryStore()
// Custom store (e.g., Qdrant, Chroma, Pinecone)
.UseStore(new MyQdrantVectorStore())
.WithPromptTemplate(@"
[Reference Documents]
{context}
[Question]
{question}
Answer based only on the provided documents.
")
Use {context} and {question} placeholders. If no template is specified, a default numbered-reference format is used.
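Conceptually, rendering the template is a placeholder substitution. The sketch below assumes retrieved chunks are joined as a numbered list, which is a guess at the shape of the default format rather than its exact output; `RenderPrompt` is a hypothetical helper.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Fill the two placeholders; the numbered context format here is an assumption.
string RenderPrompt(string template, string[] chunks, string question)
{
    string context = string.Join("\n", chunks.Select((c, i) => $"[{i + 1}] {c}"));

    // Single pass, so placeholder-like text inside a chunk is not substituted again.
    return Regex.Replace(template, @"\{(context|question)\}",
        m => m.Groups[1].Value == "context" ? context : question);
}

string prompt = RenderPrompt(
    "[Reference Documents]\n{context}\n[Question]\n{question}",
    new[] { "Refunds are accepted within 30 days.", "Shipping takes 3 days." },
    "What is the refund policy?");

Console.WriteLine(prompt);
```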
By default, follow-up questions like "Tell me more about that" fail in RAG because the search query lacks context from previous turns. WithQueryRewriter() solves this by automatically rewriting follow-up queries into retrieval-ready form before vector search, and can also derive keyword terms for hybrid/text retrieval.
var service = new OpenAIService(apiKey, httpClient)
.WithRag(rag => rag
.AddDocument("manual.txt")
.WithQueryRewriter() // Enables automatic query rewriting and retrieval keyword derivation
);
// Turn 1: "Do you know about OPM?" → RAG finds OPM documents ✓
var r1 = await service.GetCompletionAsync("Do you know about OPM?");
// Turn 2: "Tell me more about that" → rewritten to "Tell me more about OPM" → RAG finds OPM documents ✓
var r2 = await service.GetCompletionAsync("Tell me more about that");
Use a cheaper/smaller LLM for rewriting and retrieval keyword derivation to reduce cost:
var rewriterService = new OpenAIService(apiKey, httpClient, AIModel.OpenAI_Gpt4oMini);
var service = new OpenAIService(apiKey, httpClient, AIModel.OpenAI_Gpt4o)
.WithRag(rag => rag
.AddDocument("manual.txt")
.WithQueryRewriter(new LlmQueryRewriter(rewriterService))
);
You can also provide a fully custom IQueryRewriter implementation:
.WithRag(rag => rag
.AddDocument("manual.txt")
.WithQueryRewriter(new MyCustomRewriter())
)
Inspect the rewritten query via RagProcessedQuery.RewrittenQuery:
var result = await service.RetrieveAsync("Tell me more about that");
Console.WriteLine(result.RewrittenQuery); // "Tell me more about OPM"
var ragService = new OpenAIService(apiKey, httpClient)
.WithRag(rag => rag.AddDocument("manual.txt"));
await foreach (var chunk in ragService.StreamAsync("How do I use this product?"))
{
Console.Write(chunk);
}
BuildAsync accepts an optional onDocumentEmbedded callback invoked after each document's embedding is complete. When omitted, the pipeline automatically calls ReplaceByFilterAsync(Where("document_id", docId), records) — which deletes all existing chunks for that document and inserts the new ones atomically. When provided, the callback replaces this default behavior entirely — you decide how to persist the records.
On PostgresStore, ReplaceByFilterAsync wraps DELETE + INSERT in a single transaction — queries always see either the old data or the new data, never an empty gap. Other stores (InMemory, Qdrant, Pinecone) perform sequential delete + insert via the default interface method.
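The replace-by-filter semantics can be sketched against a plain in-memory list. This is an illustration of the delete-then-insert idea, not the code of any of the stores named above; the lock stands in for the transaction that closes the empty gap, and `ReplaceByDocumentId` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var gate = new object();
var store = new List<(string DocumentId, string Chunk)>
{
    ("doc-1", "old chunk 1"),
    ("doc-1", "old chunk 2"),
    ("doc-2", "other document"),
};

// Delete every chunk of the document, then insert the new ones, under one lock
// so readers that take the same lock never observe the empty gap.
void ReplaceByDocumentId(string documentId, IEnumerable<string> newChunks)
{
    lock (gate)
    {
        store.RemoveAll(r => r.DocumentId == documentId);
        store.AddRange(newChunks.Select(c => (documentId, c)));
    }
}

ReplaceByDocumentId("doc-1", new[] { "new chunk" });
Console.WriteLine(store.Count); // 2: one new doc-1 chunk plus the untouched doc-2 chunk
```

Without the lock (or a transaction), a query running between the delete and the insert would briefly see no chunks for that document.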
Use the callback for logging, validation, or routing to different stores:
var store = await RagStore.BuildAsync(config => config
.AddDocuments("./docs/")
.UseOpenAIEmbedding(apiKey)
.UseStore(vectorStore),
onDocumentEmbedded: async records =>
{
Console.WriteLine($"Indexed {records.Count} chunks");
await vectorStore.UpsertBatchAsync(records);
}
);
In standard RAG the pipeline runs once per user message. In Agentic RAG the agent decides when to search, what to search for, and whether to search again if the first result is insufficient — all autonomously inside a ReAct loop.
Register the RagStore as a search tool with WithAgenticRag, then run RunAgentAsync for a final answer or RunAgentStreamAsync for streaming:
// Build the index once
var ragStore = await RagStore.BuildAsync(cfg => cfg
.AddDocument("manual.pdf")
.AddDocument("policy.docx")
.UseOpenAIEmbedding(apiKey));
// Register RAG as a tool and run the agent
var service = new AnthropicService(apiKey, http);
service.WithAgenticRag(ragStore);
var answer = await service.RunAgentAsync("Summarise the refund policy.");
// Build the index once
var ragStore = await RagStore.BuildAsync(cfg => cfg
.AddDocument("manual.pdf")
.AddDocument("policy.docx")
.UseOpenAIEmbedding(apiKey));
// Register RAG as a tool and stream the agent run
var service = new AnthropicService(apiKey, http);
service.WithAgenticRag(ragStore);
await foreach (var content in service.RunAgentStreamAsync(
    "Summarise the refund policy and mention the key eligibility rules.",
    maxSteps: 10))
{
    if (content.Type == StreamingContentType.FunctionCall)
    {
        Console.WriteLine($"Searching docs via: {content.Metadata["function_name"]}");
    }
    else if (content.Type == StreamingContentType.Text)
    {
        Console.Write(content.Content);
    }
}
RunAgentStreamAsync(...) keeps token streaming while still emitting tool-call and tool-result events from the agent loop.
service.WithAgenticRag(ragStore)
.WithFunctionAsync("get_order_status", "Look up an order status by order ID.",
("order_id", "The order ID to look up.", required: true),
async id => await orderApi.GetStatusAsync(id));
// The agent searches documents for policy AND calls the API for live order data
var answer = await service.RunAgentAsync(
"Order #12345 — am I eligible for a refund based on the current policy?");
The tool description controls when the agent decides to call RAG. Tailor it to your domain:
service.WithAgenticRag(ragStore,
toolDescription:
"Search internal HR policies, product manuals, and compliance documents. " +
"Call this tool whenever company-specific policy or product information is needed.");
Use queryOptions when each agent search step needs a fresh RagQueryOptions.
If the host app wants structured access to References, Diagnostics, and other
RAG metadata, register tracing separately with WithAgenticRagTracing(...).
These calls have separate responsibilities:
- WithAgenticRag(...) registers the RAG search tool and resolves per-call query options.
- WithAgenticRagTracing(...) registers trace observers for Agentic RAG search executions.

var traces = new List<AgenticRagSearchTrace>();
service.WithAgenticRag(
ragStore,
queryOptions: _ => new RagQueryOptions
{
StoreFilter = new VectorFilter()
.Where("tenant", currentTenantId)
.Where("storage_id", currentStorageId)
},
toolDescription: "Search only the documents the current user is allowed to access.")
.WithAgenticRagTracing(trace =>
{
traces.Add(trace);
});
queryOptions receives an AgenticRagQueryContext with the current tool name and self-contained search query.
Use _ => ... when the filter is fixed for the whole request, or inspect ctx.Query / ctx.ToolName
when the filter or retrieval policy should vary by search step.
Tracing is registered by service instance and tool name. If you customize the Agentic RAG tool name,
pass the same name to WithAgenticRagTracing(...):
service
.WithAgenticRag(ragStore, toolName: "search_private_docs")
.WithAgenticRagTracing(
trace => traces.Add(trace),
toolName: "search_private_docs");
Each AgenticRagSearchTrace contains:
- Query: the self-contained query generated by the agent for that search step
- QueryOptions: the resolved per-call RagQueryOptions
- Result.References: final selected references
- Result.RetrievalCandidates / Result.RerankedCandidates: pre-final-selection candidates
- Result.Diagnostics: applied TopK/min-score and elapsed timings
- Succeeded / Exception: whether the search completed successfully and why it failed when it did not

This makes it easier to implement permission-aware Agentic RAG, reference panels, search-quality analysis, and audit logging without coupling RunAgentAsync(...) itself to RAG-specific request types.
| | Standard RAG | Agentic RAG |
|---|---|---|
| Search timing | Every message | Agent decides |
| Query formulation | QueryRewriter | Agent itself |
| Number of searches | Once per turn | One or more as needed |
| Tool combination | Not applicable | Any registered tool |
| Setup | .WithRag() | .WithAgenticRag() + RunAgentAsync / RunAgentStreamAsync |
QueryRewriter is intentionally bypassed in Agentic RAG. The agent formulates its own self-contained search query, so a separate rewriting step is redundant and could distort the agent's intent.
Build the index once, share across multiple AI services:
var ragStore = await RagStore.BuildAsync(config => config
.AddDocuments("./knowledge-base/")
.UseOpenAIEmbedding(embeddingApiKey)
.WithTopK(5)
);
var claude = new AnthropicService(claudeKey, http).WithRag(ragStore);
var gpt = new OpenAIService(gptKey, http).WithRag(ragStore);
// Both use the same pre-built index
var resp1 = await claude.GetCompletionAsync("What is the refund policy?");
var resp2 = await gpt.GetCompletionAsync("How long does shipping take?");
Update pipeline options at runtime without rebuilding the index:
ragStore.UpdateOptions(opt =>
{
opt.DefaultQuery.FinalFilter.TopK = 8;
opt.DefaultQuery.FinalFilter.MinScore = 0.4;
opt.DefaultQuery.RetrievalDerivation.TopKMultiplier = 3;
opt.PromptTemplate = @"
[Reference Documents]
{context}
[Question]
{question}
Answer based only on the provided documents.
";
});
UpdateOptions applies the delegate to a cloned options snapshot and swaps it into the pipeline when complete, so in-flight queries continue using the previous snapshot instead of observing partially updated settings.
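The clone-and-swap behavior described above is a common copy-on-write pattern. The sketch below models the options as a dictionary purely for brevity; the real options type, and whether the package uses `Interlocked.Exchange` specifically, are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// The published snapshot is never mutated after it becomes visible.
Dictionary<string, double> current = new() { ["TopK"] = 3, ["MinScore"] = 0.5 };

void UpdateOptions(Action<Dictionary<string, double>> configure)
{
    var clone = new Dictionary<string, double>(Volatile.Read(ref current)); // copy the snapshot
    configure(clone);                         // mutate only the private copy
    Interlocked.Exchange(ref current, clone); // publish in one atomic swap
}

// An in-flight query captures the snapshot once and keeps using it.
var inFlight = Volatile.Read(ref current);

UpdateOptions(opt =>
{
    opt["TopK"] = 8;
    opt["MinScore"] = 0.4;
});

Console.WriteLine(inFlight["TopK"]);                   // 3: old snapshot is untouched
Console.WriteLine(Volatile.Read(ref current)["TopK"]); // 8: new queries see the update
```

This simple version lets two concurrent updaters overwrite each other's changes; it only guarantees that readers never see a half-applied update.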
var ragService = service.WithRag(rag => rag.AddDocument("doc.txt"));
// Use RAG
var withRag = await ragService.GetCompletionAsync("question with context");
// Temporarily bypass RAG
var withoutRag = await ragService.WithoutRag().GetCompletionAsync("general question");
Inspect the request message content and references before sending to the LLM:
var result = await ragService.RetrieveAsync("What is the refund policy?");
if (result.HasReferences)
{
    Console.WriteLine(result.RequestMessageContent); // Context + query
    Console.WriteLine(result.References.Count); // Number of matched chunks
    Console.WriteLine($"FinalTopK={result.Diagnostics.FinalTopK}, RetrievalTopK={result.Diagnostics.RetrievalTopK}, FinalMinScore={result.Diagnostics.AppliedFinalMinScore}, Elapsed={result.Diagnostics.ElapsedMs}ms");
    foreach (var r in result.References)
    {
        Console.WriteLine($"Score: {r.Score:F4} | {r.Record.Content}");
    }
}
else
{
    // No references found: RequestMessageContent contains the original query unchanged
    Console.WriteLine(result.RequestMessageContent);
}
Keep global defaults in RagBuilder, then override per request when needed:
var ragStore = await RagStore.BuildAsync(config => config
.AddDocuments("./knowledge-base/")
.WithTopK(3)
.WithScoreThreshold(0.5)
);
var normal = await ragStore.QueryAsync("refund policy?");
var highRecall = await ragStore.QueryAsync(
"refund policy?",
new RagQueryOptions
{
FinalFilter = new RagFilter { TopK = 15, MinScore = 0.2 }
}
);
When you maintain a baseline RagQueryOptions (e.g. tenant StoreFilter + a ProgressAsync callback) and want per-query variations on top, use RagQueryOptions.Clone() so every other field is preserved:
// Baseline carried across many queries
var baseline = new RagQueryOptions
{
StoreFilter = new VectorFilter().Where("tenant", currentTenantId),
ProgressAsync = stage => { Console.WriteLine($"Stage: {stage}"); return Task.CompletedTask; }
};
// Per-query override — Clone() keeps StoreFilter and ProgressAsync
var highRecall = baseline.Clone();
highRecall.FinalFilter.TopK = 15;
highRecall.FinalFilter.MinScore = 0.2;
var result = await ragStore.QueryAsync("refund policy?", highRecall);
Clone() is a deep copy for the option records (FinalFilter, RetrievalDerivation, FinalSelection) and a reference copy for the handle-typed fields (ProgressAsync, StoreFilter). Reassign those properties explicitly when you want a different callback or filter for that one call.
Use RagQueryOptions.StoreFilter to pass a VectorFilter directly to IVectorStore.SearchAsync / HybridSearchAsync on every retrieval call. This allows scoping retrieval by tenant, user, category, time range, or any metadata key without wrapping the store in a custom decorator.
// Single condition
var options = new RagQueryOptions();
options.StoreFilter = new VectorFilter().Where("storage_id", storageId);
var result = await ragStore.QueryAsync("question", options, cancellationToken);
// Multiple conditions — storage_id AND folder_path (AND logic, fluent chaining)
options.StoreFilter = new VectorFilter()
.Where("storage_id", storageId)
.Where("folder_path", "/docs/private");
// Multi-value filter — only documents from specific tenants
options.StoreFilter = new VectorFilter()
.WhereIn("storage_id", tenantId1, tenantId2, tenantId3);
// Tenant isolation + user scoping via metadata
options.StoreFilter = new VectorFilter()
.Where("tenant", "tenant-A")
.Where("user_id", currentUserId);
For this to work, the metadata keys must be stored on VectorRecord.Metadata at index time:
var doc = new RagDocument
{
Content = "Document content...",
Metadata = new Dictionary<string, string>
{
["storage_id"] = storageId,
["user_id"] = userId
}
};
await ragPipeline.IndexDocumentAsync(doc, cancellationToken: ct);
StoreFilter = null (the default) preserves the existing behavior with no filtering.
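The AND semantics of chained conditions, and the any-of semantics of a multi-value condition, can be sketched as a predicate over record metadata. This models the filter as a plain dictionary and is not the `VectorFilter` implementation; `Matches` is a hypothetical helper, and treating a missing key as a non-match is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Each condition maps a metadata key to its allowed values; all conditions must match (AND).
bool Matches(
    IReadOnlyDictionary<string, string> metadata,
    IReadOnlyDictionary<string, string[]> conditions) =>
    conditions.All(c =>
        metadata.TryGetValue(c.Key, out var value) && c.Value.Contains(value));

var records = new List<Dictionary<string, string>>
{
    new() { ["storage_id"] = "s1", ["folder_path"] = "/docs/private" },
    new() { ["storage_id"] = "s2", ["folder_path"] = "/docs/public" },
    new() { ["storage_id"] = "s3", ["folder_path"] = "/docs/private" },
    new() { ["folder_path"] = "/docs/private" }, // missing key never matches
};

var filter = new Dictionary<string, string[]>
{
    ["storage_id"] = new[] { "s1", "s2" },       // WhereIn-style: any listed value
    ["folder_path"] = new[] { "/docs/private" }, // Where-style: single value
};

int matched = records.Count(r => Matches(r, filter));
Console.WriteLine(matched); // 1
```

Only the first record satisfies both conditions, which is why the metadata keys must be written at index time for filtering to have any effect.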
Track pipeline stage progress with an async callback:
var result = await ragStore.QueryAsync("refund policy?",
new RagQueryOptions
{
ProgressAsync = stage =>
{
Console.WriteLine($"Stage: {stage}");
return Task.CompletedTask;
}
});
Mythosia.AI.Abstractions <- IAIService interface
|
Mythosia.AI.Rag.Abstractions <- interfaces (IRagPipeline, ITextSplitter, etc.), RagDocument
|
Mythosia.AI.Rag <- fluent API, pipeline, builders, extensions
Mythosia.VectorDb.InMemory (optional) <- InMemoryVectorStore
Mythosia.Documents.Abstractions <- IDocumentLoader, DoclingDocument
The AI core has zero knowledge of RAG. Everything is wired through the IRagPipeline interface and C# extension methods.
public class MyEmbeddingProvider : IEmbeddingProvider
{
    public int Dimensions => 768;

    public Task<float[]> GetEmbeddingAsync(string text, CancellationToken ct = default)
    {
        // Your embedding logic
        throw new NotImplementedException();
    }

    public Task<IReadOnlyList<float[]>> GetEmbeddingsAsync(IEnumerable<string> texts, CancellationToken ct = default)
    {
        // Batch embedding logic
        throw new NotImplementedException();
    }
}
public class MyVectorStore : IVectorStore
{
// Implement: CreateCollectionAsync, UpsertAsync, SearchAsync, DeleteAsync, etc.
}
public class MyPdfLoader : IDocumentLoader
{
    public Task<IReadOnlyList<DoclingDocument>> LoadAsync(string source, CancellationToken ct = default)
    {
        // Parse PDF and return documents
        throw new NotImplementedException();
    }
}
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | Computed: net5.0, net5.0-windows, net6.0, net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows |
| .NET Core | Computed: netcoreapp3.0, netcoreapp3.1 |
| .NET Standard | Compatible: netstandard2.1 |
| MonoAndroid | Computed: monoandroid |
| MonoMac | Computed: monomac |
| MonoTouch | Computed: monotouch |
| Tizen | Computed: tizen60 |
| Xamarin.iOS | Computed: xamarinios |
| Xamarin.Mac | Computed: xamarinmac |
| Xamarin.TVOS | Computed: xamarintvos |
| Xamarin.WatchOS | Computed: xamarinwatchos |
| Version | Downloads | Last Updated | Status |
|---|---|---|---|
| 7.5.0 | 443 | 4/28/2026 | |
| 7.4.0 | 264 | 4/16/2026 | |
| 7.4.0-preview1 | 123 | 4/10/2026 | |
| 7.3.2 | 121 | 4/10/2026 | |
| 7.3.1 | 144 | 4/6/2026 | |
| 7.3.0 | 117 | 4/4/2026 | |
| 7.2.0 | 114 | 4/3/2026 | |
| 7.1.0 | 117 | 4/2/2026 | |
| 7.0.1 | 105 | 4/1/2026 | |
| 7.0.0 | 120 | 3/30/2026 | |
| 6.2.0 | 112 | 3/29/2026 | |
| 6.1.0 | 118 | 3/28/2026 | |
| 6.0.1 | 109 | 3/24/2026 | |
| 6.0.0 | 109 | 3/22/2026 | |
| 5.0.1 | 154 | 3/15/2026 | |
| 5.0.0 | 124 | 3/15/2026 | 5.0.0 is deprecated because it has critical bugs. |
| 4.0.0 | 128 | 3/11/2026 | 4.0.0 is deprecated because it has critical bugs. |
| 3.1.0 | 131 | 3/7/2026 | 3.1.0 is deprecated because it has critical bugs. |
| 3.0.0 | 109 | 3/6/2026 | |
| 2.0.0 | 115 | 3/5/2026 | |
v7.5.0: Fixed RagPipeline.QueryAsync(query, topK, ...) silently dropping ProgressAsync and StoreFilter from RagPipelineOptions.DefaultQuery — the convenience overload now clones DefaultQuery via the new RagQueryOptions.Clone() and only overrides FinalFilter.TopK. Removed the PromptTemplate-keyed context builder cache that could tear under concurrent queries, and made RagStore.UpdateOptions configure a cloned options snapshot before atomically swapping it into the pipeline.