dotnet add package Mythosia.AI --version 6.6.0
NuGet\Install-Package Mythosia.AI -Version 6.6.0
<PackageReference Include="Mythosia.AI" Version="6.6.0" />
Directory.Packages.props:
<PackageVersion Include="Mythosia.AI" Version="6.6.0" />
Project file:
<PackageReference Include="Mythosia.AI" />
paket add Mythosia.AI --version 6.6.0
#r "nuget: Mythosia.AI, 6.6.0"
#:package Mythosia.AI@6.6.0
Install as a Cake Addin:
#addin nuget:?package=Mythosia.AI&version=6.6.0
Install as a Cake Tool:
#tool nuget:?package=Mythosia.AI&version=6.6.0
⚠️ Upgrading from v5.x? See the .
The Mythosia.AI library provides a unified interface for various AI models with multimodal support, function calling, reasoning streaming, round-level token usage, and advanced streaming capabilities.
dotnet add package Mythosia.AI
For advanced LINQ operations with streams:
dotnet add package System.Linq.Async
For RAG (Retrieval-Augmented Generation) support:
dotnet add package Mythosia.AI.Rag
This adds .WithRag() to any AIService, enabling document-based context augmentation. See the Mythosia.AI.Rag README for full usage details.
using Mythosia.AI.Rag;
var service = new AnthropicService(apiKey, httpClient)
.WithRag(rag => rag
.AddDocument("manual.txt")
.AddDocument("policy.txt")
);
var response = await service.GetCompletionAsync("What is the refund policy?");
// OpenAI GPT
var gptService = new OpenAIService(apiKey, httpClient);
var response = await gptService.GetCompletionAsync("Hello!");
// Anthropic Claude
var claudeService = new AnthropicService(apiKey, httpClient);
var response = await claudeService.GetCompletionAsync("Hello!");
// Google Gemini
var geminiService = new GoogleAIService(apiKey, httpClient);
geminiService.ChangeModel(AIModels.Google.Gemini3FlashPreview);
var response = await geminiService.GetCompletionAsync("Hello!");
AIModels Catalog
Model selection is now documented around provider-grouped string constants via AIModels.
service.ChangeModel(AIModels.OpenAI.Gpt5_4);
service.ChangeModel(AIModels.Anthropic.ClaudeSonnet4_6);
service.ChangeModel(AIModels.Google.Gemini3FlashPreview);
For simple stateless usage, use AIService static helpers.
var answer = await AIService.QuickAskAsync(apiKey, "Summarize this text.");
var vision = await AIService.QuickAskWithImageAsync(apiKey, "Describe this image.", imagePath);
GPT-5 family models (GPT-5 / 5.1 / 5.2 / 5.3 / 5.4 / 5.5) support type-safe reasoning configuration with per-model enums.
Each GPT-5 variant has its own enum to ensure only valid options are available at compile time.
var gptService = (OpenAIService)service;
// GPT-5: Gpt5Reasoning (Auto/Minimal/Low/Medium/High)
gptService.WithGpt5Parameters(
reasoningEffort: Gpt5Reasoning.High,
reasoningSummary: ReasoningSummary.Concise);
// GPT-5.1: Gpt5_1Reasoning (Auto/None/Low/Medium/High) + Verbosity
gptService.WithGpt5_1Parameters(
reasoningEffort: Gpt5_1Reasoning.Medium,
verbosity: Verbosity.Low,
reasoningSummary: ReasoningSummary.Concise);
// GPT-5.2: Gpt5_2Reasoning (Auto/None/Low/Medium/High/XHigh) + Verbosity
gptService.WithGpt5_2Parameters(
reasoningEffort: Gpt5_2Reasoning.XHigh,
verbosity: Verbosity.High);
// GPT-5.3 Codex: Gpt5_3Reasoning (Auto/None/Low/Medium/High/XHigh) + Verbosity
gptService.WithGpt5_3Parameters(
reasoningEffort: Gpt5_3Reasoning.Medium,
verbosity: Verbosity.Medium,
reasoningSummary: ReasoningSummary.Concise);
// GPT-5.4 / 5.4 Pro: Gpt5_4Reasoning (Auto/None/Low/Medium/High/XHigh) + Verbosity
gptService.WithGpt5_4Parameters(
reasoningEffort: Gpt5_4Reasoning.Auto,
verbosity: Verbosity.High,
reasoningSummary: ReasoningSummary.Auto);
// GPT-5.5 / 5.5 Pro: Gpt5_5Reasoning (Auto/None/Low/Medium/High/XHigh) + Verbosity
gptService.WithGpt5_5Parameters(
reasoningEffort: Gpt5_5Reasoning.High,
verbosity: Verbosity.Medium,
reasoningSummary: ReasoningSummary.Concise);
Auto uses the model-appropriate default (e.g., Medium for GPT-5, None for GPT-5.1/5.2, Medium for GPT-5.2 Pro/Codex, Medium for GPT-5.3 Codex, None for GPT-5.4, Medium for GPT-5.4 Pro, None for GPT-5.5, Medium for GPT-5.5 Pro). The -pro variants reject None/Low and are clamped up to Medium.
All GPT-5 family models support ReasoningSummary enum (Auto / Concise / Detailed). Set to null to disable.
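The Auto default and the clamping rule for -pro variants can be pictured with a small standalone sketch. The `Effort` enum and `ReasoningDefaults.Resolve` helper below are hypothetical stand-ins written for illustration; they are not the library's types, and the library may implement the rule differently.

```csharp
using System;

// Auto (null) falls back to the model default.
Console.WriteLine(ReasoningDefaults.Resolve(null, Effort.None, isProVariant: false));        // None
// A -pro variant given Low is clamped up to Medium.
Console.WriteLine(ReasoningDefaults.Resolve(Effort.Low, Effort.Medium, isProVariant: true)); // Medium
// Values at or above Medium pass through unchanged.
Console.WriteLine(ReasoningDefaults.Resolve(Effort.XHigh, Effort.Medium, isProVariant: true)); // XHigh

// Hypothetical stand-in for the per-model reasoning enums (assumption, not the library's type).
enum Effort { None, Low, Medium, High, XHigh }

static class ReasoningDefaults
{
    // requested == null models "Auto"; modelDefault is the per-model default described above.
    public static Effort Resolve(Effort? requested, Effort modelDefault, bool isProVariant)
    {
        var effort = requested ?? modelDefault;
        // Assumed clamping rule: -pro variants never go below Medium.
        if (isProVariant && effort < Effort.Medium)
            effort = Effort.Medium;
        return effort;
    }
}
```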
var geminiService = new GoogleAIService(apiKey, httpClient);
geminiService.ChangeModel(AIModels.Google.Gemini3FlashPreview);
// GeminiThinkingLevel enum: Auto / Minimal / Low / Medium / High
geminiService.ThinkingLevel = GeminiThinkingLevel.Low; // Auto = model default (High)
geminiService.ChangeModel(AIModels.Google.Gemini2_5Pro);
geminiService.ThinkingBudget = 8192; // -1 = dynamic (default), 0 = disable
includeThoughts)
When streaming with StreamOptions.WithReasoning(), Mythosia.AI now requests Gemini thought chunks (includeThoughts: true) and emits them as StreamingContentType.Reasoning.
await foreach (var content in geminiService.StreamAsync(message, new StreamOptions().WithReasoning()))
{
if (content.Type == StreamingContentType.Reasoning)
Console.Write($"[Gemini Thinking] {content.Content}");
else if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
var grokService = new XAIService(apiKey, httpClient);
grokService.ChangeModel(AIModels.xAI.Grok3Mini);
// GrokReasoning enum: Off / Low / High
grokService.WithGrokParameters(reasoningEffort: GrokReasoning.High);
Note: Only grok-3-mini supports the reasoning_effort API parameter. Other Grok models ignore it.
Grok reasoning models (grok-3-mini, grok-4.3) stream reasoning_content when reasoning is enabled:
await foreach (var content in grokService.StreamAsync(message, new StreamOptions().WithReasoning()))
{
if (content.Type == StreamingContentType.Reasoning)
Console.Write($"[Think] {content.Content}");
else if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
AIRequestProfile
Apply one-shot runtime overrides per request without mutating long-lived service configuration.
var response = await service.GetCompletionAsync(
"Rewrite this query for retrieval.",
RequestProfiles.QueryRewrite);
AIRequestContext
Use request-scoped prompt injection when you need to pass derived prompt data only for the current call without polluting the real conversation history or the service's base system message.
Available fields:
| Field | Purpose |
|---|---|
| SystemMessagePrefix | Text prepended to the system message for this request only |
| SystemMessageSuffix | Text appended to the system message for this request only |
| AdditionalMessages | Extra messages injected into the conversation for this request only (reference docs, few-shot examples) |
| RequestMessageOverride | Completely replaces the user message sent to the model while the original prompt stays in chat history |
Example — a query rewriter flow where the original user question should remain in chat history, but a retrieval-friendly rewrite is what actually gets sent to the model:
var rewrittenQuery = await service.GetCompletionAsync(
"Rewrite this question for retrieval.",
RequestProfiles.QueryRewrite);
var response = await service.GetCompletionAsync(
originalUserQuestion,
new AIRequestContext
{
RequestMessageOverride = new Message(ActorRole.User, rewrittenQuery)
});
Example — injecting retrieved RAG context as a suffix on the system message, without leaking it into conversation history:
var answer = await service.GetCompletionAsync(userQuestion,
new AIRequestContext
{
SystemMessageSuffix = $"\n\nUse the following context to answer:\n{retrievedDocs}"
});
For the full flow and before/after comparisons, see docs/request-contexts.md.
SystemMessageProvider: Automatic Baseline Injection
When the same dynamic data (today's date, active folder, session info) must be injected on every LLM call, passing an AIRequestContext at every entry point gets tedious and error-prone. AIService.SystemMessageProvider lets you register a callback once, and every outbound call (GetCompletionAsync, StreamAsync, RunAgentAsync, RunAgentStreamAsync) automatically invokes it to build a baseline context.
// Register once — typically at service construction / DI setup
service.WithSystemMessageProvider(() => new AIRequestContext
{
SystemMessageSuffix =
$"Today is {DateTime.UtcNow:yyyy-MM-dd}.\n" +
$"Current folder: {_uiContext.CurrentFolder}"
});
// Every call below automatically receives the baseline context
var answer = await service.GetCompletionAsync(userQuery);
await foreach (var chunk in service.StreamAsync(msg, options)) { /* ... */ }
var agentResult = await service.RunAgentAsync(goal);
When the baseline comes from a database, cache, or HTTP call, use the async overload so the provider does not have to block on .Result. Overload resolution picks the right one by lambda arity — no arg for sync, one CancellationToken for async:
service.WithSystemMessageProvider(async ct =>
{
var prefs = await _db.UserPreferences.FirstOrDefaultAsync(ct);
return new AIRequestContext
{
SystemMessageSuffix = $"User language: {prefs?.Language ?? "en"}"
};
});
Streaming paths (StreamAsync, RunAgentStreamAsync) forward the caller's CancellationToken through to the async provider. Non-streaming paths (GetCompletionAsync, RunAgentAsync) do not support cancellation — use the streaming counterparts if your provider needs to be cancellable.
When a call also passes an explicit AIRequestContext, the two merge field-by-field: explicit values win on scalar fields (SystemMessagePrefix, SystemMessageSuffix, RequestMessageOverride); AdditionalMessages concatenates (provider first, then explicit).
Available in Mythosia.AI v6.3.0+. Full details in docs/request-contexts.md.
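The merge rule described above (explicit values win on scalar fields, AdditionalMessages concatenates provider first) can be sketched as follows. `Ctx` is a hypothetical simplified stand-in, not the library's AIRequestContext: messages are plain strings here, and falling back to the provider value when the explicit scalar is null is an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var baseline = new Ctx
{
    SystemMessagePrefix = "You are concise.",
    SystemMessageSuffix = "Today is 2026-01-01.",
    AdditionalMessages = { "provider note" }
};
var explicitCtx = new Ctx
{
    SystemMessageSuffix = "Use the retrieved context.",
    AdditionalMessages = { "explicit note" }
};

var merged = Ctx.Merge(baseline, explicitCtx);
Console.WriteLine(merged.SystemMessagePrefix);                    // You are concise.
Console.WriteLine(merged.SystemMessageSuffix);                    // Use the retrieved context.
Console.WriteLine(string.Join(", ", merged.AdditionalMessages));  // provider note, explicit note

// Hypothetical stand-in for a request context (assumption, not the library's type).
sealed class Ctx
{
    public string? SystemMessagePrefix { get; init; }
    public string? SystemMessageSuffix { get; init; }
    public string? RequestMessageOverride { get; init; }
    public List<string> AdditionalMessages { get; init; } = new();

    public static Ctx Merge(Ctx provider, Ctx explicitCtx) => new()
    {
        // Scalar fields: the explicit value wins; the provider value is the assumed fallback.
        SystemMessagePrefix = explicitCtx.SystemMessagePrefix ?? provider.SystemMessagePrefix,
        SystemMessageSuffix = explicitCtx.SystemMessageSuffix ?? provider.SystemMessageSuffix,
        RequestMessageOverride = explicitCtx.RequestMessageOverride ?? provider.RequestMessageOverride,
        // List field: provider messages first, then explicit ones.
        AdditionalMessages = provider.AdditionalMessages.Concat(explicitCtx.AdditionalMessages).ToList()
    };
}
```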
// Define a simple function
var service = new OpenAIService(apiKey, httpClient)
.WithFunction(
"get_weather",
"Gets the current weather for a location",
("location", "The city and country", required: true),
(string location) => $"The weather in {location} is sunny, 22°C"
);
// AI will automatically call the function when needed
var response = await service.GetCompletionAsync("What's the weather in Seoul?");
// Output: "The weather in Seoul is currently sunny with a temperature of 22°C."
public class WeatherService
{
[AiFunction("get_current_weather", "Gets the current weather for a location")]
public string GetWeather(
[AiParameter("The city name", required: true)] string city,
[AiParameter("Temperature unit", required: false)] string unit = "celsius")
{
// Your implementation
return $"Weather in {city}: 22°{unit[0]}";
}
}
// Register all functions from a class
var weatherService = new WeatherService();
var service = new OpenAIService(apiKey, httpClient)
.WithFunctions(weatherService);
var service = new OpenAIService(apiKey, httpClient)
.WithFunction(FunctionBuilder.Create("calculate")
.WithDescription("Performs mathematical calculations")
.AddParameter("expression", "string", "The math expression", required: true)
.AddParameter("precision", "integer", "Decimal places", required: false, defaultValue: 2)
.WithHandler(async (args) =>
{
var expr = args["expression"].ToString();
var precision = Convert.ToInt32(args.GetValueOrDefault("precision", 2));
// Calculate and return result
return await CalculateAsync(expr, precision);
})
.Build());
var service = new OpenAIService(apiKey, httpClient)
// Parameterless function
.WithFunction(
"get_time",
"Gets the current time",
() => DateTime.Now.ToString("HH:mm:ss")
)
// Two-parameter function
.WithFunction(
"add_numbers",
"Adds two numbers",
("a", "First number", true),
("b", "Second number", true),
(double a, double b) => $"The sum is {a + b}"
)
// Async function
.WithFunctionAsync(
"fetch_data",
"Fetches data from API",
("endpoint", "API endpoint", true),
async (string endpoint) => await httpClient.GetStringAsync(endpoint)
);
// The AI will automatically use the appropriate functions
var response = await service.GetCompletionAsync(
"What time is it? Also, what's 15 plus 27?"
);
// Pre-defined policies
service.DefaultPolicy = FunctionCallingPolicy.Fast; // 30s timeout, 10 rounds
service.DefaultPolicy = FunctionCallingPolicy.Complex; // 300s timeout, 50 rounds
service.DefaultPolicy = FunctionCallingPolicy.Vision; // 200s timeout, for image analysis
// Custom policy
service.DefaultPolicy = new FunctionCallingPolicy
{
MaxRounds = 25,
TimeoutSeconds = 120,
MaxConcurrency = 5,
EnableLogging = true // Enable debug output
};
// Per-request policy override
var response = await service
.WithPolicy(FunctionCallingPolicy.Fast)
.GetCompletionAsync("Complex task requiring functions");
// Inline policy configuration
var response = await service
.BeginMessage()
.AddText("Analyze this data")
.WithMaxRounds(5)
.WithTimeout(60)
.SendAsync();
// Stream with function calling support
await foreach (var content in service.StreamAsync(
"What's the weather in Seoul and calculate 15% tip on $85",
StreamOptions.WithFunctions))
{
if (content.Type == StreamingContentType.FunctionCall)
{
Console.WriteLine($"Calling function: {content.Metadata["function_name"]}");
}
else if (content.Type == StreamingContentType.FunctionResult)
{
Console.WriteLine($"Function completed: {content.Metadata["status"]}");
}
else if (content.Type == StreamingContentType.Text)
{
Console.Write(content.Content);
}
}
// Non-streaming agent helper
var answer = await service.RunAgentAsync(
"Find the weather in Seoul and explain what to wear today."
);
// Streaming agent helper
await foreach (var content in service.RunAgentStreamAsync(
"Find the weather in Seoul and explain what to wear today.",
maxSteps: 10))
{
if (content.Type == StreamingContentType.FunctionCall)
{
Console.WriteLine($"Calling: {content.Metadata["function_name"]}");
}
else if (content.Type == StreamingContentType.FunctionResult)
{
Console.WriteLine($"Tool result: {content.Content}");
}
else if (content.Type == StreamingContentType.Text)
{
Console.Write(content.Content);
}
}
RunAgentStreamAsync(...) is the streaming counterpart to RunAgentAsync(...). It keeps function calling enabled for the request and disables TextOnly so agent runs can emit function call, function result, and completion events.
// Disable functions for a single request
var response = await service
.WithoutFunctions()
.GetCompletionAsync("Don't use any functions for this");
// Or use the async helper
var response = await service.AskWithoutFunctionsAsync(
"Process this without calling functions"
);
Deserialize LLM responses directly into C# POCOs with automatic JSON recovery.
// Define your POCO
public class WeatherResponse
{
public string City { get; set; }
public double Temperature { get; set; }
public string Condition { get; set; }
}
// Get typed result — schema is auto-generated and sent to the LLM
var result = await service.GetCompletionAsync<WeatherResponse>(
"What's the weather in Seoul?");
Console.WriteLine($"{result.City}: {result.Temperature}°C, {result.Condition}");
When the LLM returns invalid JSON, a correction prompt is automatically sent asking the model to fix its output. This is not a network retry — it's an output quality/format correction loop.
// Configure service-level retry count (default: 2)
service.StructuredOutputMaxRetries = 3;
// On final failure, StructuredOutputException is thrown with rich diagnostics:
// - FirstRawResponse, LastRawResponse
// - ParseError, AttemptCount, SchemaJson, TargetTypeName
Override retry behavior for a single request without changing service defaults:
// Custom policy — applies only to this call, then auto-cleared
var result = await service
.WithStructuredOutputPolicy(new StructuredOutputPolicy { MaxRepairAttempts = 5 })
.GetCompletionAsync<MyDto>(prompt);
// Preset: no retry (1 attempt only)
var result = await service
.WithNoRetryStructuredOutput()
.GetCompletionAsync<MyDto>(prompt);
// Preset: strict mode (up to 3 retries = 4 total attempts)
var result = await service
.WithStrictStructuredOutput()
.GetCompletionAsync<MyDto>(prompt);
| Preset | MaxRepairAttempts | Description |
|---|---|---|
| Default | null (service default) | Uses StructuredOutputMaxRetries |
| NoRetry | 0 | Single attempt, no retry |
| Strict | 3 | Up to 3 correction retries |
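To make the correction loop concrete, here is a minimal standalone sketch of the idea: parse, and on a JSON failure send a correction prompt, up to a fixed number of repair attempts. The `Repair.GetTyped` helper, the `Weather` record, and the fake model delegate are hypothetical and only illustrate the technique; the library's actual prompts, schema handling, and exception type (StructuredOutputException) are not reproduced here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Fake model for the demo: the first reply is truncated JSON, the second is valid.
var replies = new Queue<string>(new[]
{
    "{\"City\": \"Seoul\", \"Temperature\": ",
    "{\"City\": \"Seoul\", \"Temperature\": 22.0}"
});
string FakeModel(string prompt) => replies.Dequeue();

var result = Repair.GetTyped<Weather>(FakeModel, "What's the weather in Seoul?", maxRepairAttempts: 2);
Console.WriteLine($"{result.City}: {result.Temperature}");

record Weather(string City, double Temperature);

static class Repair
{
    // Total attempts = 1 initial attempt + maxRepairAttempts corrections.
    public static T GetTyped<T>(Func<string, string> model, string prompt, int maxRepairAttempts)
    {
        var raw = model(prompt);
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return JsonSerializer.Deserialize<T>(raw)
                       ?? throw new JsonException("Model returned JSON null.");
            }
            catch (JsonException ex) when (attempt < maxRepairAttempts)
            {
                // Not a network retry: ask the model to fix its own output.
                raw = model($"Your previous output was not valid JSON ({ex.Message}). " +
                            $"Return only corrected JSON.\n{raw}");
            }
            // When the attempt budget is exhausted, the JsonException propagates to the caller.
        }
    }
}
```

With `maxRepairAttempts: 0` the loop makes a single attempt, which mirrors the NoRetry preset in the table above.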
Stream text chunks in real-time to the UI while getting a final deserialized object with auto-repair:
var run = service.BeginStream(prompt)
.WithStructuredOutput(new StructuredOutputPolicy { MaxRepairAttempts = 2 })
.As<MyDto>();
// Optional: observe chunks in real-time
await foreach (var chunk in run.Stream(cancellationToken))
{
Console.Write(chunk); // UI display
}
// Final deserialized result (waits for stream + parse/repair)
MyDto dto = await run.Result;
- Result works without Stream(): awaiting run.Result on its own internally consumes the stream and parses
- Stream() is single-use: a second call throws InvalidOperationException
- Result waits for stream completion: even if awaited mid-stream, it won't resolve early
- GetCompletionAsync() for efficiency
List<T>, T[])
Both GetCompletionAsync<T>() and streaming support collection types, so no wrapper DTO is needed:
// Non-streaming: get a list directly
var items = await service.GetCompletionAsync<List<ItemDto>>(
"Extract all entities from this document...");
// Streaming: observe chunks + get list result
var run = service.BeginStream(prompt).As<List<ItemDto>>();
await foreach (var chunk in run.Stream()) Console.Write(chunk);
List<ItemDto> items = await run.Result;
List<T>, T[], IReadOnlyList<T> are all supported. JSON array schema is auto-generated from the element type.
Automatically summarize old conversation messages when the conversation exceeds a configured threshold. The summary is stored and injected into the system message on each subsequent LLM request.
// Token-based: summarize when total tokens exceed 3000, keep recent ~1000 tokens
service.ConversationPolicy = SummaryConversationPolicy.ByToken(
triggerTokens: 3000,
keepRecentTokens: 1000
);
// Message-count-based: summarize when messages exceed 20, keep last 5
service.ConversationPolicy = SummaryConversationPolicy.ByMessage(
triggerCount: 20,
keepRecentCount: 5
);
// Combined (OR condition): triggers when either threshold is exceeded
service.ConversationPolicy = SummaryConversationPolicy.ByBoth(
triggerTokens: 3000,
triggerCount: 20
);
// Just use as normal — summarization happens automatically
service.ConversationPolicy = SummaryConversationPolicy.ByMessage(triggerCount: 20, keepRecentCount: 5);
var response = await service.GetCompletionAsync("Continue our conversation...");
// When message count exceeds 20, old messages are summarized automatically
// Save summary for later
string saved = service.ConversationPolicy.CurrentSummary;
// Restore in a new session
policy.LoadSummary(saved);
- StatelessMode = true to prevent polluting the main conversation history
- ConversationPolicy defaults to null; existing behavior is unchanged
// Text only - fastest, no overhead
await foreach (var chunk in service.StreamAsync("Hello", StreamOptions.TextOnlyOptions))
{
Console.Write(chunk.Content);
}
// With metadata - includes model info, timestamps, etc.
await foreach (var content in service.StreamAsync("Hello", StreamOptions.FullOptions))
{
if (content.Metadata != null)
{
Console.WriteLine($"Model: {content.Metadata["model"]}");
}
Console.Write(content.Content);
}
// Custom options
var options = new StreamOptions()
.WithMetadata(true)
.WithFunctionCalls(true)
.AsTextOnly(false);
await foreach (var content in service.StreamAsync("Query", options))
{
// Process based on content.Type
switch (content.Type)
{
case StreamingContentType.Text:
Console.Write(content.Content);
break;
case StreamingContentType.FunctionCall:
Console.WriteLine($"Calling: {content.Metadata["function_name"]}");
break;
case StreamingContentType.Completion:
Console.WriteLine($"Total length: {content.Metadata["total_length"]}");
break;
}
}
When an SSE stream dies mid-flight against a self-hosted backend (vLLM, ollama, internal proxy), you usually need to know exactly where it died. Register diagnostic hooks once on the service — every subsequent StreamAsync call picks them up automatically. Same fluent builder pattern as WithRag.
using Mythosia.AI.Extensions;
service.WithStreamDiagnostics(d => d
.OnRawLine(line => logger.LogDebug("SSE: {Line}", line))
.OnComplete(diag => logger.LogInformation("Stream finished: {Diag}", diag)));
await foreach (var chunk in service.StreamAsync(message))
Console.Write(chunk.Content);
Each On* method is independent — register only what you need:
// Raw line trace only
service.WithStreamDiagnostics(d => d.OnRawLine(line => logger.LogDebug("SSE: {Line}", line)));
// Clear all hooks
service.WithStreamDiagnostics(_ => { });
When SSE reading throws, the library wraps the exception in StreamReadException with a StreamDiagnostics snapshot taken at the moment of failure. This works regardless of whether WithStreamDiagnostics was registered:
try
{
await foreach (var chunk in service.StreamAsync(message))
Console.Write(chunk.Content);
}
catch (StreamReadException ex)
{
logger.LogError(ex,
"Stream died after {Lines} lines, {Chars} chars. Last raw line: {Line}",
ex.Diagnostics.LinesRead,
ex.Diagnostics.AccumulatedTextLength,
ex.Diagnostics.LastRawLine);
// ex.InnerException carries the original exception (IOException, etc.)
}
StreamDiagnostics exposes LinesRead, DataLinesProcessed, ParseFailures, AccumulatedTextLength, LastRawLine, and Elapsed. Hooks are propagated through CopyFrom, so cross-provider switches in a multi-provider chat UI keep the registered diagnostics without re-registration.
Available in Mythosia.AI v6.4.0+. Full guide: docs/streaming.md.
Streaming exposes token usage in two different places, with different meanings:
- StreamingContentType.RoundUsage: usage for one LLM round only.
- StreamingContentType.Completion: cumulative usage for the whole streaming run.

For a single LLM call, the final RoundUsage.Usage and Completion.Usage should describe
the same one-round request. For an agent or function-calling run, each LLM round emits its own
RoundUsage, while the final Completion.Usage remains the sum of all rounds.
This distinction is important for UI context meters. If you want to show "how many tokens the
current conversation state used when it entered the latest LLM call", use the latest
RoundUsage.Usage.InputTokens. If you want cost or diagnostics for the full agent run, use
Completion.Usage.TotalTokens.
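The difference between the two numbers is easiest to see with made-up values. The sketch below uses a hypothetical `Usage` record (not the library's TokenUsage) and invented token counts for a two-round agent run: the context meter tracks the latest round's input tokens, while the run total is the sum over all rounds.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-round usage values, in the order the rounds completed.
var rounds = new List<Usage>
{
    new(InputTokens: 1200, OutputTokens: 80),   // round 1: the model requests a tool call
    new(InputTokens: 1350, OutputTokens: 140),  // round 2: final answer after the tool result
};

var contextMeter = 0;
var cumulative = new Usage(0, 0);
foreach (var round in rounds)
{
    contextMeter = round.InputTokens;  // latest round only: what the conversation "weighed" on entry
    cumulative += round;               // running sum: what the whole run consumed
}

Console.WriteLine($"Context meter: {contextMeter}");        // Context meter: 1350
Console.WriteLine($"Run total: {cumulative.TotalTokens}");  // Run total: 2770

// Hypothetical stand-in for a usage record (assumption, not the library's type).
record Usage(int InputTokens, int OutputTokens)
{
    public int TotalTokens => InputTokens + OutputTokens;

    public static Usage operator +(Usage a, Usage b) =>
        new(a.InputTokens + b.InputTokens, a.OutputTokens + b.OutputTokens);
}
```

Summing the per-round input tokens for a context meter would overstate the conversation size, because each later round already re-sends the earlier context.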
RoundUsage events also include:
- RoundIndex: 1-based LLM round number.
- IsFinalRound: true when this is the last LLM round in the stream.
await foreach (var content in service.StreamAsync(message, StreamOptions.FullOptions))
{
if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
if (content.Type == StreamingContentType.RoundUsage && content.Usage != null)
{
Console.WriteLine($"Round: {content.RoundIndex}");
Console.WriteLine($"Round total: {content.Usage.TotalTokens}");
Console.WriteLine($"Final round: {content.IsFinalRound}");
}
if (content.Type == StreamingContentType.Completion && content.Usage != null)
{
Console.WriteLine($"Input tokens: {content.Usage.InputTokens}");
Console.WriteLine($"Output tokens: {content.Usage.OutputTokens}");
Console.WriteLine($"Cached tokens: {content.Usage.CachedInputTokens}");
Console.WriteLine($"Reasoning tokens: {content.Usage.ReasoningTokens}");
Console.WriteLine($"Cache hit ratio: {content.Usage.CacheHitRatio:P1}");
}
}
int? contextTokenMeter = null;
TokenUsage? cumulativeRunUsage = null;
await foreach (var content in service.RunAgentStreamAsync(
"Find the weather in Seoul and answer briefly.",
maxSteps: 10))
{
if (content.Type == StreamingContentType.RoundUsage && content.Usage != null)
{
// Best value for a UI context/token meter.
contextTokenMeter = content.Usage.InputTokens;
Console.WriteLine(
$"Round {content.RoundIndex}: input={content.Usage.InputTokens}, total={content.Usage.TotalTokens} tokens");
if (content.IsFinalRound)
{
Console.WriteLine($"Final context meter value: {contextTokenMeter}");
}
continue;
}
if (content.Type == StreamingContentType.Completion)
{
// Cumulative usage across the whole agent run.
cumulativeRunUsage = content.Usage;
continue;
}
if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
- RoundUsage.Usage is never an accumulated run total. It represents that one LLM round.
- RoundUsage.Usage.TotalTokens is normalized to InputTokens + OutputTokens.
- Completion.Usage keeps the existing cumulative meaning for the full stream or agent run.
- IsFinalRound = false; the last round has IsFinalRound = true.
- IncludeMetadata. Usage can still be emitted when metadata is disabled.
- RoundUsage and Completion events rather than provider-specific chunk metadata.
- usageMetadata chunks can still become RoundUsage.

The Token test category contains provider-level tests for this contract. If those tests pass
for a provider/model, Mythosia.AI considers round-level usage and final cumulative usage supported
for that provider/model. If a provider/model does not return official usage, these tests should fail
or be treated as unsupported for token usage.
TokenUsage fields:
| Field | Description | Providers |
|---|---|---|
| InputTokens | Input/prompt tokens | All |
| OutputTokens | Output/completion tokens | All |
| TotalTokens | Total tokens used | All |
| CachedInputTokens | Tokens served from cache | OpenAI, Claude, DeepSeek, Gemini |
| CacheCreationTokens | Tokens written to cache | Claude |
| ReasoningTokens | Internal reasoning tokens | OpenAI, Gemini |
Computed properties: NonCachedInputTokens, CacheHitRatio, HasCacheActivity, VisibleOutputTokens.
GPT-5, Gemini 3, and Grok reasoning models support streaming reasoning (thinking) content.
await foreach (var content in service.StreamAsync(message, new StreamOptions().WithReasoning()))
{
if (content.Type == StreamingContentType.Reasoning)
Console.WriteLine($"[Thinking] {content.Content}");
else if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
| Service | Function Calling | Streaming | Reasoning | Notes |
|---|---|---|---|---|
| OpenAI GPT-5.5 / 5.5 Pro / 5 Pro | ✅ | ✅ | ✅ | Per-model reasoning enums + verbosity |
| OpenAI GPT-5.4 / 5.4 Pro | ✅ | ✅ | ✅ | Per-model reasoning enums + verbosity |
| OpenAI GPT-5.3 Codex | ✅ | ✅ | ✅ | Per-model reasoning enums + verbosity |
| OpenAI GPT-5.2 / 5.2 Pro / 5.2 Codex | ✅ | ✅ | ✅ | Per-model reasoning enums + verbosity |
| OpenAI GPT-5.1 | ✅ | ✅ | ✅ | Reasoning + verbosity control |
| OpenAI GPT-5 / Mini / Nano | ✅ | ✅ | ✅ | Reasoning streaming + summary |
| OpenAI GPT-4.1 / GPT-4o | ✅ | ✅ | — | Full function support |
| OpenAI o3 / o3-pro | ✅ | ✅ | ✅ | Advanced reasoning |
| Claude Fable 5 | ✅ | ✅ | ✅ | Adaptive thinking + tool use |
| Claude Opus 4.8 / 4.7 / 4.6 / 4.5 / 4.1 / 4 | ✅ | ✅ | ✅ | Extended thinking + tool use |
| Claude Sonnet 4.6 / 4.5 | ✅ | ✅ | ✅ | Extended thinking + tool use |
| Claude Haiku 4.5 | ✅ | ✅ | ✅ | Extended thinking + tool use |
| Gemini 3.1 Pro / 3.5 Flash / 3 Flash / 3.1 Flash-Lite | ✅ | ✅ | ✅ | ThinkingLevel + thought signatures |
| Gemini 2.5 Pro/Flash | ✅ | ✅ | ✅ | ThinkingBudget control |
| xAI Grok 4.3 / 4.20 / Build 0.1 / 3 Mini | ✅ | ✅ | ✅ | GrokReasoning effort + reasoning streaming |
| DeepSeek | ❌ | ✅ | ✅ | Reasoner model streaming |
| Perplexity | ❌ | ✅ | — | Web search + citations |
public class WeatherAssistant
{
private readonly OpenAIService _service;
private readonly HttpClient _httpClient;
public WeatherAssistant(string apiKey)
{
_httpClient = new HttpClient();
_service = new OpenAIService(apiKey, _httpClient)
.WithSystemMessage("You are a helpful weather assistant.")
.WithFunction(
"get_weather",
"Gets current weather for a city",
("city", "City name", true),
GetWeatherData
)
.WithFunction(
"get_forecast",
"Gets weather forecast",
("city", "City name", true),
("days", "Number of days", false),
GetForecast
);
// Configure function calling behavior
_service.DefaultPolicy = new FunctionCallingPolicy
{
MaxRounds = 10,
TimeoutSeconds = 30,
EnableLogging = true
};
}
private string GetWeatherData(string city)
{
// In real implementation, call weather API
return $"{{\"city\":\"{city}\",\"temp\":22,\"condition\":\"sunny\"}}";
}
private string GetForecast(string city, int days = 3)
{
// In real implementation, call forecast API
return $"{{\"city\":\"{city}\",\"forecast\":\"{days} days of sun\"}}";
}
public async Task<string> AskAsync(string question)
{
return await _service.GetCompletionAsync(question);
}
public async IAsyncEnumerable<string> StreamAsync(string question)
{
await foreach (var content in _service.StreamAsync(question))
{
if (content.Type == StreamingContentType.Text && content.Content != null)
{
yield return content.Content;
}
}
}
}
// Usage
var assistant = new WeatherAssistant(apiKey);
// Functions are called automatically
var response = await assistant.AskAsync("What's the weather in Tokyo?");
// AI calls get_weather("Tokyo") and responds naturally
// Streaming also supports functions
await foreach (var chunk in assistant.StreamAsync(
"Compare weather in Seoul and Tokyo for the next 5 days"))
{
Console.Write(chunk);
}
var mathTutor = new OpenAIService(apiKey, httpClient)
.WithSystemMessage("You are a math tutor. Always explain your reasoning.")
.WithFunction(
"calculate",
"Performs calculations",
("expression", "Math expression", true),
(string expr) => {
// Using a math expression evaluator
var result = EvaluateExpression(expr);
return $"Result: {result}";
}
)
.WithFunction(
"solve_equation",
"Solves equations step by step",
("equation", "Equation to solve", true),
(string equation) => {
var steps = SolveWithSteps(equation);
return JsonSerializer.Serialize(steps);
}
);
// The AI will use functions and explain the process
var response = await mathTutor.GetCompletionAsync(
"Solve the equation 2x + 5 = 13 and verify the answer"
);
// Output includes step-by-step solution with verification
Function Design: Keep functions focused and simple. Complex logic should be broken into multiple functions.
Error Handling: Functions should return meaningful error messages that the AI can understand.
Performance: Use appropriate policies for your use case (Fast for simple tasks, Complex for detailed analysis).
Streaming: Use TextOnlyOptions for best performance when metadata isn't needed.
Testing: Test function calling with various prompts to ensure robust behavior.
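As an illustration of the error-handling advice, the sketch below shows a handler that returns a descriptive error string instead of throwing, so the model has something it can act on. `Handlers.Divide` is a hypothetical example written for this note; it is plain C# and does not depend on the library.

```csharp
using System;
using System.Globalization;

Console.WriteLine(Handlers.Divide("10", "4"));   // 2.5
Console.WriteLine(Handlers.Divide("10", "0"));   // Error: division by zero. ...
Console.WriteLine(Handlers.Divide("ten", "4"));  // Error: both arguments must be numbers. ...

static class Handlers
{
    // Hypothetical tool handler: every failure path returns text the model can read and react to.
    public static string Divide(string a, string b)
    {
        if (!double.TryParse(a, NumberStyles.Float, CultureInfo.InvariantCulture, out var x) ||
            !double.TryParse(b, NumberStyles.Float, CultureInfo.InvariantCulture, out var y))
        {
            return "Error: both arguments must be numbers. Example: a=\"10\", b=\"4\".";
        }

        if (y == 0)
        {
            return "Error: division by zero. Ask the user for a non-zero divisor.";
        }

        return (x / y).ToString(CultureInfo.InvariantCulture);
    }
}
```

A handler like this could be passed as the delegate argument in the WithFunction examples shown earlier; how the library surfaces thrown exceptions to the model is not covered here.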
Q: Functions aren't being called when expected?
- EnableFunctions is true on the service
Q: Function calling is too slow?
- service.DefaultPolicy.TimeoutSeconds = 30
- FunctionCallingPolicy.Fast for simple operations
Q: How to debug function execution?
- service.DefaultPolicy.EnableLogging = true
- StreamOptions.FullOptions to see function call metadata
Q: Can I use functions with streaming?
- StreamOptions.WithFunctions to see function execution in real-time
The following OpenAI models are not yet supported due to significant API differences:
| Model | API Name | Status | Notes |
|---|---|---|---|
| GPT-5.2 Instant | gpt-5.2-chat-latest | ⏳ Planned | ChatGPT-optimized model; uses a different routing/parameter set than standard Responses API models |
| GPT-5.3 Instant | gpt-5.3-chat-latest | ⏳ Planned | ChatGPT-optimized model; same API constraints as GPT-5.2 Instant |
| GPT-5.3 Codex Spark | gpt-5.3-codex-spark | ⏳ Planned | Research preview; completely different infrastructure (Cerebras-powered, WebSocket-based, text-only) |
chat-latest models (Instant)
- gpt-5.2, gpt-5.3-codex) for API usage instead.
- reasoning.effort, text.verbosity, and other model-specific configurations.
gpt-5.3-codex-spark
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net5.0 to net10.0, plus platform-specific variants (-windows from net5.0; -android, -ios, -maccatalyst, -macos, -tvos from net6.0; -browser from net8.0), were computed. |
| .NET Core | netcoreapp3.0 and netcoreapp3.1 were computed. |
| .NET Standard | netstandard2.1 is compatible. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
Showing the top 2 NuGet packages that depend on Mythosia.AI:
| Package | Downloads |
|---|---|
| Mythosia.AI.Providers.Alibaba: Alibaba Cloud Qwen provider package for Mythosia.AI. Includes QwenService with expanded Qwen 3 / 3.5 model constants, platform-specific thinking request handling across DashScope, vLLM, and Ollama, token usage streaming support, and Mythosia.AI v6.4.0 compatibility. Documentation - GitHub: https://github.com/AJ-comp/Mythosia.AI - Release Notes: core/Mythosia.AI.Providers.Alibaba/RELEASE_NOTES.md | |
| Mythosia.AI.Mcp: MCP (Model Context Protocol) client integration for Mythosia.AI. Connect to any MCP server (stdio or SSE) and automatically register its tools as FunctionDefinitions usable by all AI providers. | |
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 6.6.0 | 91 | 6/10/2026 |
| 6.5.0 | 147 | 5/30/2026 |
| 6.4.0 | 423 | 4/28/2026 |
| 6.4.0-preview1 | 173 | 4/25/2026 |
| 6.3.0 | 208 | 4/20/2026 |
| 6.2.0 | 172 | 4/16/2026 |
| 6.1.0 | 191 | 4/10/2026 |
| 6.0.0 | 208 | 4/3/2026 |
| 5.3.0 | 143 | 4/2/2026 |
| 5.2.0 | 161 | 3/29/2026 |
| 5.1.0 | 194 | 3/28/2026 |
| 5.0.1 | 187 | 3/24/2026 |
| 5.0.0 | 323 | 3/15/2026 |
| 4.7.1 | 139 | 3/11/2026 |
| 4.7.0 | 129 | 3/7/2026 |
| 4.6.2 | 269 | 2/27/2026 |
| 4.6.1 | 118 | 2/27/2026 |
| 4.6.0 | 119 | 2/26/2026 |
| 4.5.0 | 119 | 2/26/2026 |
| 4.4.0 | 116 | 2/25/2026 |
v6.6.0: Claude Fable 5 support. Adds Anthropic claude-fable-5 — Anthropic's new top model tier above Opus (1M context window, 128K max output) — with its API contract handled automatically: unsupported temperature omitted, extended thinking via adaptive mode (thinking.type=adaptive + output_config.effort), and the thinking parameter omitted entirely when disabled (Fable 5 rejects an explicit thinking.type=disabled). Fixes Opus 4.7/4.8 max output tokens to 128K (previously capped at the generic opus-4 32K bucket), QuickAskAsync provider routing (case-insensitive model-id matching; the requested model is now actually applied instead of the provider default), and the vision gate silently swapping sonnet-4-x/haiku-4-5 models to Sonnet 4.6. Requires Mythosia.AI.Abstractions v2.4.0.