dotnet add package WACS.WASI.NN.OnnxRuntime --version 0.3.2
NuGet\Install-Package WACS.WASI.NN.OnnxRuntime -Version 0.3.2
<PackageReference Include="WACS.WASI.NN.OnnxRuntime" Version="0.3.2" />
Directory.Packages.props:
<PackageVersion Include="WACS.WASI.NN.OnnxRuntime" Version="0.3.2" />
Project file:
<PackageReference Include="WACS.WASI.NN.OnnxRuntime" />
paket add WACS.WASI.NN.OnnxRuntime --version 0.3.2
#r "nuget: WACS.WASI.NN.OnnxRuntime, 0.3.2"
#:package WACS.WASI.NN.OnnxRuntime@0.3.2
Install as a Cake Addin:
#addin nuget:?package=WACS.WASI.NN.OnnxRuntime&version=0.3.2
Install as a Cake Tool:
#tool nuget:?package=WACS.WASI.NN.OnnxRuntime&version=0.3.2
ONNX Runtime backend for WACS.WASI.NN.
Implements IBackend for graph-encoding.onnx directly against Microsoft.ML.OnnxRuntime: no ML.NET wrapper, just ORT.
This is the default wasi-nn backend for the WACS CLI: running wacs run --wasip2 --wasi-nn
auto-loads it. Embedders who don't want the ~50 MB of ORT native binaries can use one of
the other backends instead.
dotnet add package WACS.WASI.NN.OnnxRuntime
# Bundled with WACS.Cli — works out of the box.
wacs run my.component.wasm --wasip2 --wasi-nn -d ./models::/models
The Gemma 3 270M ONNX SLM is the canonical end-to-end test target; docs/COMPONENT_CHAINING.md walks through it.
Interpreter / one-line:
using Wacs.Core.Runtime;
using Wacs.WASI.NN;
using Wacs.WASI.NN.OnnxRuntime;
using Wacs.WASI.NN.Types;
var runtime = new WasmRuntime();
runtime.UseWasiNN(b => b.AddBackend(GraphEncoding.ONNX, new OnnxBackend()));
Transpiler-direct-link / DI:
services
.AddWasiPreview2()
.AddWasiNN(b => b.AddBackend(GraphEncoding.ONNX, new OnnxBackend()))
.AddWasiPreview2NNBundle();
(WasiPreview2RuntimeScope auto-wires OnnxBackend when this assembly is on the load
path — no explicit AddBackend needed.)
- OnnxBackend : IBackend. Implements LoadGraph(builders, target) for byte-loaded ONNX models. Suitable for the SLM / inference workflow where the guest reads model bytes and passes them through wasi:nn/graph.load.
- OnnxBackendOptions / OnnxExecutionProvider. Typed config for execution-provider selection (CoreML / CUDA / DirectML / ROCm, with auto-detect + CPU fallback).
- WasiNNOnnxBindable : IBindable. Parameterless adapter for --bind. Auto-pulled by the CLI's --wasi-nn shorthand.
- [assembly: WasiHostPackage]. Picked up by runtime.AutoDiscoverHostPackages().

Default is CPU; hardware acceleration is opt-in via the
WACS_WASINN_ONNX_EP env var or OnnxBackendOptions.ExecutionProvider. The CPU
default avoids the silent op-coverage issues seen with the CoreML / DirectML
EPs against generative-LLM ops (e.g., GroupQueryAttention in Gemma 3):
partition-and-fallback inside ORT can produce numerically wrong results without raising
an error, which manifests as "the LLM doesn't respond" against the Gemma 3 270M
SLM workflow. Pin the EP per-model after you've verified your model works with it.
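One way to act on the "verify before pinning" advice is to run the same workload once on CPU and once on the candidate EP, then compare what comes out. The helper below is a minimal sketch under stated assumptions: `same_output_as_cpu` is a hypothetical name, it only looks at stdout, and a byte-for-byte comparison only makes sense when the workload is deterministic (for example greedy decoding or a classification label).

```shell
# same_output_as_cpu <ep> <command...>
# Hypothetical helper, not part of WACS. Runs <command...> twice:
# once with WACS_WASINN_ONNX_EP=cpu and once with the candidate EP,
# then compares stdout byte-for-byte.
# Returns 0 if identical, 1 if different, 2 if either run failed.
same_output_as_cpu() {
  candidate="$1"; shift
  # Reference run on CPU.
  cpu_out=$(env WACS_WASINN_ONNX_EP=cpu "$@") || return 2
  # Same command on the EP under test.
  ep_out=$(env WACS_WASINN_ONNX_EP="$candidate" "$@") || return 2
  [ "$cpu_out" = "$ep_out" ]
}
```

A call might look like `same_output_as_cpu coreml wacs run my.component.wasm --wasip2 --wasi-nn -d ./models::/models`. Matching output is necessary but not sufficient: if the EP failed to attach and the backend fell back to CPU, both runs execute on CPU and will trivially agree.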
| OS | WACS_WASINN_ONNX_EP=auto resolves to | Notes |
|---|---|---|
| macOS (arm64/x64) | CoreML (CPU + GPU) | EP symbol ships in stock Microsoft.ML.OnnxRuntime; no NuGet swap |
| Windows | DirectML | Add Microsoft.ML.OnnxRuntime.DirectML for full DML coverage |
| Linux | CUDA then ROCm | Requires CUDA toolkit / ROCm runtime on host |
| Other | CPU | |
EP-append failure silently falls back to CPU regardless — out-of-box behavior favors "inference still works" over "EP misconfiguration is loud".
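Because a failed EP append is silent, it can help to know up front what `auto` is documented to pick on the current host. The function below is a sketch that simply restates the table as a lookup keyed on `uname -s`; it is not the package's detection logic. The name `expected_auto_ep` is hypothetical, and the mapping from `uname -s` strings to "Windows" is an assumption about common POSIX-on-Windows shells.

```shell
# expected_auto_ep [uname-string]
# Hypothetical helper: prints the provider(s) the table above lists
# for WACS_WASINN_ONNX_EP=auto, using the env-var spellings from this README.
# With no argument it inspects the current host via `uname -s`.
expected_auto_ep() {
  os="${1:-$(uname -s)}"
  case "$os" in
    Darwin)                            echo "coreml" ;;
    MINGW*|MSYS*|CYGWIN*|Windows_NT)   echo "dml" ;;
    Linux)                             echo "cuda rocm" ;;  # tried in this order
    *)                                 echo "cpu" ;;
  esac
}
```

Calling `expected_auto_ep` with no argument reports the expectation for the machine it runs on; passing a string such as `Darwin` lets a script reason about another platform.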
Enable via environment:
# Platform-best pick (CoreML on macOS, DirectML on Windows, CUDA then ROCm on Linux)
WACS_WASINN_ONNX_EP=auto wacs run my.wasm --wasip2 --wasi-nn
# Force a specific provider
WACS_WASINN_ONNX_EP=coreml wacs run my.wasm --wasip2 --wasi-nn
WACS_WASINN_ONNX_EP=cuda WACS_WASINN_ONNX_CUDA_DEVICE=1 wacs run my.wasm --wasip2 --wasi-nn
WACS_WASINN_ONNX_EP=dml wacs run my.wasm --wasip2 --wasi-nn
WACS_WASINN_ONNX_EP=rocm wacs run my.wasm --wasip2 --wasi-nn
# Explicitly stay on CPU (default — no env var also gets you CPU)
WACS_WASINN_ONNX_EP=cpu wacs run my.wasm --wasip2 --wasi-nn
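A launcher script can guard against typos in the variable before handing off to the CLI. The sketch below checks the value against the spellings used in the examples above and then runs the given command with the variable set. Both function names are hypothetical, and the accepted list is an assumption drawn only from this README; the package itself may accept other spellings or ignore case.

```shell
# is_known_ep <value>
# Hypothetical helper: succeeds only for the values shown in this README.
is_known_ep() {
  case "$1" in
    auto|coreml|cuda|dml|rocm|cpu) return 0 ;;
    *) return 1 ;;
  esac
}

# run_with_ep <ep> <command...>
# Validates <ep>, then runs <command...> with WACS_WASINN_ONNX_EP set to it.
# Returns 2 without running anything if the value is not recognised.
run_with_ep() {
  ep="$1"; shift
  if ! is_known_ep "$ep"; then
    echo "unknown execution provider: $ep" >&2
    return 2
  fi
  env WACS_WASINN_ONNX_EP="$ep" "$@"
}
```

For example, `run_with_ep cuda wacs run my.wasm --wasip2 --wasi-nn` behaves like the `cuda` line above, while `run_with_ep directml ...` stops with an error instead of launching.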
Override via typed config (library embedders):
using Wacs.WASI.NN.OnnxRuntime;
using Microsoft.ML.OnnxRuntime;
var backend = new OnnxBackend(new OnnxBackendOptions
{
ExecutionProvider = OnnxExecutionProvider.CoreML,
CoreMLFlags = CoreMLFlags.COREML_FLAG_USE_CPU_AND_GPU
| CoreMLFlags.COREML_FLAG_ONLY_ENABLE_DEVICE_WITH_ANE,
FallbackToCpu = true,
});
Full escape hatch (custom SessionOptions factory):
var backend = new OnnxBackend(() =>
{
var opts = new SessionOptions();
opts.AppendExecutionProvider_CUDA(deviceId: 0);
opts.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
return opts; // factory wins over OnnxBackendOptions
});
OnnxBackendOptions.FallbackToCpu = false propagates EP-append failures as
ErrorCode.RuntimeError at graph.load time — useful for environments where silent CPU
fallback would mask a misconfiguration.
| Use case | Package |
|---|---|
| Standard ONNX inference (image classification, embeddings, encoder-only LLMs) | WACS.WASI.NN.OnnxRuntime (this) |
| ONNX with ML.NET pipeline integration (preprocessing transformers, custom predictors) | WACS.WASI.NN.MLNet |
| GGUF / llama.cpp generative LLMs (load-by-name flow) | WACS.WASI.NN.LlamaSharp |
- docs/WASI_NN_USAGE.md: unified usage guide (CLI flags, env vars, programmatic embedding, worked examples)
- docs/COMPONENT_CHAINING.md
- Wacs.WASI/Wacs.WASI.NN/README.md: backend matrix + package layout

Apache-2.0
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | Compatible: net8.0. Computed: net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows. |