NeuralCodecs 0.4.0

.NET 8.0

NeuralCodecs

NeuralCodecs is a .NET library for neural audio codec implementations and TTS models written purely in C#. It includes implementations of SNAC, DAC, Encodec, and Dia, along with advanced audio processing tools.

Features

SNAC: Multi-Scale Neural Audio Codec
- Support for multiple sampling rates: 24kHz, 32kHz, and 44.1kHz
- Attention mechanisms with adjustable window sizes for improved quality
- Automatic resampling for input flexibility
DAC: Descript Audio Codec
- Supports multiple sampling rates: 16kHz, 24kHz, and 44.1kHz
- Configurable encoder/decoder architecture with variable rates
- Flexible bitrate configurations from 8kbps to 16kbps
Encodec: Meta's Encodec neural audio compression
- Supports stereo audio at 24kHz and 48kHz sample rates
- Variable bitrate compression (1.5-24 kbps)
- Neural language model for enhanced compression quality
- Direct file compression to .ecdc format
Dia: Nari Labs' Dia text-to-speech model
- 1.6B parameter text-to-speech model for highly realistic dialogue generation
- Direct transcript-to-speech generation with emotion and tone control
- Audio-conditioned generation for voice cloning and style transfer
- Support for non-verbal communications (laughter, coughing, throat clearing, etc.)
- Speaker-aware dialogue generation with [S1] and [S2] tags
- Custom dynamic speed control to handle Dia's issue with automatic speed-up on long inputs
AudioTools: Advanced audio processing utilities
- Based on Descript's audiotools Python package
- Extended with .NET-specific optimizations and additional features
- Audio filtering, transformation, and effects processing
- Works with Descript's AudioSignal or Tensors
Audio Visualization: Example project includes spectrogram generation and comparison tools

Requirements

.NET 8.0 or later
TorchSharp or libTorch compatible with your platform
NAudio (for audio processing)
SkiaSharp (for visualization features)

Installation

Install the main package from NuGet:

dotnet add package NeuralCodecs

Or from the Package Manager Console:

Install-Package NeuralCodecs

Model Downloads

Models are downloaded automatically when you pass a Hugging Face user/model identifier, or you can download them separately:

SNAC Models - Available from hubertsiuzdak's HuggingFace

DAC Models - Available from Descript's HuggingFace

Encodec Models - Available from Meta's HuggingFace

Dia Model - Available from Nari Labs' HuggingFace

Requires both Dia model weights and DAC codec for full audio generation

Quick Start

Here's a simple example to get you started:

using NeuralCodecs;

// Load a SNAC model
var model = await NeuralCodecs.CreateSNACAsync("path/to/model.pt");

// Process audio (LoadAudioFile and SaveAudioFile stand in for your own audio I/O helpers)
float[] audioData = LoadAudioFile("input.wav");
var compressed = model.ProcessAudio(audioData, sampleRate: 24000);

// Save the result
SaveAudioFile("output.wav", compressed);

For more detailed examples, see the examples section below.

Usage

Creating/loading the model

There are several ways to load a model:

Using static factory method:

// Load SNAC model with static method provided for built-in models
var model = await NeuralCodecs.CreateSNACAsync("model.pt");

Using premade config:
SNACConfig provides premade configurations for 24kHz, 32kHz, and 44.1kHz sampling rates.

var model = await NeuralCodecs.CreateSNACAsync(modelPath, SNACConfig.SNAC24Khz);

Using IModelLoader instance with default config:
Allows the use of custom loader implementations

// Load model with default config from IModelLoader instance
var torchLoader = NeuralCodecs.CreateTorchLoader();
var model = await torchLoader.LoadModelAsync<SNAC, SNACConfig>("model.pt");

Using IModelLoader instance with custom config:

// For Encodec with custom bandwidth and settings
var encodecConfig = new EncodecConfig { 
 SampleRate = 48000,
 Bandwidth = 12.0f,
 Channels = 2, // Stereo audio
 Normalize = true
};
var encodecModel = await torchLoader.LoadModelAsync<Encodec, EncodecConfig>("encodec_model.pt", encodecConfig);

Using factory method for custom models:
Allows the use of custom model implementations with built-in or custom loaders

// Load custom model with factory method
var model = await torchLoader.LoadModelAsync<CustomModel, CustomConfig>(
 "model.pt",
 config => new CustomModel(config, ...),
 config);

Models can be loaded in PyTorch or Safetensors format.

AudioTools Features

The AudioTools namespace provides extensive audio processing capabilities:

var audio = new Tensor(...); // Load or create audio tensor

// Apply effects
var processedAudio = AudioEffects.ApplyCompressor(
 audio, 
 sampleRate: 48000,
 threshold: -20f,
 ratio: 4.0f);

// Compute spectrograms and transforms
var spectrogram = DSP.MelSpectrogram(audio, sampleRate);
var stft = DSP.STFT(audio, windowSize: 1024, hopSize: 512, windowType: "hann");
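When sizing buffers or plots for a spectrogram, it helps to know how many frames a given window and hop size produce. The sketch below works through the arithmetic for the windowSize: 1024, hopSize: 512 values used above under two common framing conventions. Which convention DSP.STFT follows is not stated here, so treat both helper functions (FramesUnpadded, FramesCentered) as hypothetical illustrations rather than part of the library.

```csharp
using System;

// Frames when the signal is not padded: only windows that fit entirely are counted.
static int FramesUnpadded(int sampleCount, int windowSize, int hopSize) =>
    sampleCount < windowSize ? 0 : 1 + (sampleCount - windowSize) / hopSize;

// Frames when the signal is padded by windowSize / 2 on each side ("centered" framing).
static int FramesCentered(int sampleCount, int hopSize) => 1 + sampleCount / hopSize;

int oneSecondAt48k = 48000;
Console.WriteLine(FramesUnpadded(oneSecondAt48k, 1024, 512)); // 92
Console.WriteLine(FramesCentered(oneSecondAt48k, 512));       // 94
```

With a hop of 512 samples at 48kHz, each frame advances about 10.7 ms, so one second of audio yields roughly 92 to 94 frames depending on padding.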

Encoding and Decoding Audio

There are two main ways to process audio:

Using the simplified ProcessAudio method:

// Compress audio in one step
var processedAudio = model.ProcessAudio(audioData, sampleRate);

Using separate encode and decode steps:

// Encode audio to compressed format
var codes = model.Encode(buffer);

// Decode back to audio
var processedAudio = model.Decode(codes);

Saving the processed audio

Use your preferred method to save WAV files

// using NAudio
await using var writer = new WaveFileWriter(
 outputPath,
 new WaveFormat(model.Config.SamplingRate, channels: model.Channels)
);
writer.WriteSamples(processedAudio, 0, processedAudio.Length);

Encodec-Specific Features

Encodec provides additional capabilities:

// Set target bandwidth for compression (supported values depend on model)
encodecModel.SetTargetBandwidth(12.0f); // 12 kbps

// Get available bandwidth options
var availableBandwidths = encodecModel.TargetBandwidths; // e.g. [1.5, 3, 6, 12, 24]

// Use language model for enhanced compression quality
var lm = await encodecModel.GetLanguageModel();
// Apply LM during encoding/decoding for better quality

// Direct file compression
await EncodecCompressor.CompressToFileAsync(encodecModel, audioTensor, "audio.ecdc", useLm: true);

// Decompress from file
var (decompressedAudio, sampleRate) = await EncodecCompressor.DecompressFromFileAsync("audio.ecdc");
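Since the supported bandwidths depend on the model, one option is to snap a requested bitrate to the nearest supported value before calling SetTargetBandwidth, and to estimate the resulting payload size up front. The following is a minimal sketch in plain C#; NearestBandwidth and EstimatePayloadBytes are hypothetical helpers, not part of NeuralCodecs, and the bandwidth list is the example from the comment above.

```csharp
using System;
using System.Linq;

// Pick the supported bandwidth closest to the requested one (ties go to the lower value).
static float NearestBandwidth(float requestedKbps, float[] supportedKbps) =>
    supportedKbps.OrderBy(b => Math.Abs(b - requestedKbps)).ThenBy(b => b).First();

// Payload size at a constant bitrate: kbps * 1000 bits per second / 8 bits per byte * seconds.
static long EstimatePayloadBytes(float kbps, double seconds) =>
    (long)Math.Round(kbps * 1000.0 / 8.0 * seconds);

float[] supportedBandwidths = { 1.5f, 3f, 6f, 12f, 24f }; // example values only
float chosenBandwidth = NearestBandwidth(10f, supportedBandwidths);
Console.WriteLine(chosenBandwidth);                              // 12
Console.WriteLine(EstimatePayloadBytes(chosenBandwidth, 60.0));  // 90000
```

The estimate covers only the raw code stream at the target bitrate. Files written with the language model enabled can come out smaller, and any container header adds a small overhead.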

Dia Text-to-Speech Features

Dia is a 1.6B parameter text-to-speech model that generates highly realistic dialogue directly from transcripts:

// Load Dia model with optional DAC codec
var diaConfig = new DiaConfig 
{ 
 LoadDACModel = true,
 SampleRate = 44100 
};
var diaModel = await NeuralCodecs.CreateDiaAsync("model.pt", diaConfig);

// or use LoadDACModel = false in config and manually load DAC:
diaModel.LoadDacModel("dac_model.pt");

// Basic text-to-speech generation
var text = "[S1] Hello, how are you today? [S2] I'm doing great, thanks for asking!";
var audioOutput = diaModel.Generate(
 text: text,
 maxTokens: 1000,
 cfgScale: 3.0f,
 temperature: 1.2f,
 topP: 0.95f);

// Voice cloning with audio prompt
var audioPromptPath = "reference_voice.wav";
var clonedAudio = diaModel.Generate(
 text: "[S1] This is my cloned voice speaking new words.",
 audioPromptPath: audioPromptPath,
 maxTokens: 1000);

// Batch generation for multiple texts
var texts = new List<string>
{
 "[S1] First dialogue line.",
 "[S2] Second dialogue line with (laughs) non-verbal."
};
var batchResults = diaModel.Generate(texts, maxTokens: 800);

// Save generated audio
Dia.SaveAudio("output.wav", audioOutput);

Advanced Dia Configuration

Audio Speed Correction: Dia includes built-in speed correction to handle the automatic speed-up issue on longer inputs:

var diaConfig = new DiaConfig 
{ 
 LoadDACModel = true,
 SampleRate = 44100,
 // Configure speed correction method
 SpeedCorrectionMethod = AudioSpeedCorrectionMethod.Hybrid, // Default: best quality
 // Configure slowdown mode
 SlowdownMode = AudioSlowdownMode.Dynamic // Default: adapts to text length
};

Available speed correction methods:

None: No speed correction applied
TorchSharp: TorchSharp-based linear interpolation
Hybrid: Combines TorchSharp and NAudio methods (recommended)
NAudioResampling: Uses NAudio resampling for speed correction
All: Creates separate outputs using all methods (for testing/comparison)
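To give a feel for what linear-interpolation speed correction does, here is a small standalone sketch in plain C#. It is not the library's implementation (which operates on TorchSharp tensors); ChangeSpeedLinear is a hypothetical helper that resamples a float buffer so it plays back slower or faster. In this naive form the pitch shifts together with the speed.

```csharp
using System;

// Resample by linear interpolation. speedFactor < 1 slows playback (more samples),
// speedFactor > 1 speeds it up.
static float[] ChangeSpeedLinear(float[] input, double speedFactor)
{
    if (speedFactor <= 0) throw new ArgumentOutOfRangeException(nameof(speedFactor));
    if (input.Length == 0) return Array.Empty<float>();

    int outputLength = Math.Max(1, (int)Math.Round(input.Length / speedFactor));
    var output = new float[outputLength];
    int last = input.Length - 1;
    for (int i = 0; i < outputLength; i++)
    {
        double position = i * speedFactor;            // fractional index into the input
        int left = Math.Min((int)position, last);     // sample at or before the position
        int right = Math.Min(left + 1, last);         // next sample, clamped at the end
        float fraction = (float)(position - left);
        output[i] = input[left] + (input[right] - input[left]) * fraction;
    }
    return output;
}

// Slowing a 4-sample ramp to half speed yields 8 samples.
float[] slowedRamp = ChangeSpeedLinear(new float[] { 0f, 1f, 2f, 3f }, 0.5);
Console.WriteLine(slowedRamp.Length); // 8
```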

Available slowdown modes:

Static: Uses a fixed slowdown factor
Dynamic: Adjusts slowdown based on text length (recommended)

Speed Correction Examples:

// For highest quality output (default)
var highQualityConfig = new DiaConfig 
{ 
 SpeedCorrectionMethod = AudioSpeedCorrectionMethod.Hybrid,
 SlowdownMode = AudioSlowdownMode.Dynamic
};

// For testing multiple correction methods
var testConfig = new DiaConfig 
{ 
 SpeedCorrectionMethod = AudioSpeedCorrectionMethod.All // Generates multiple output variants
};

// For no speed correction (fastest processing)
var fastConfig = new DiaConfig 
{ 
 SpeedCorrectionMethod = AudioSpeedCorrectionMethod.None
};

Dia Generation Guidelines

Memory Usage: Similar to the Python implementation, ~10-11GB GPU memory is required for the Dia model with DAC codec.

Text Format Requirements:

Always begin input text with [S1] speaker tag
Alternate between [S1] and [S2] for dialogue (repeating the same speaker tag consecutively may impact generation)
Keep input text moderate length (10-20 seconds of corresponding audio)
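The first two rules above are easy to check before calling Generate. The sketch below is one way to do that in plain C#; CheckDialogueText is a hypothetical helper, not part of NeuralCodecs, and it only looks at the [S1] and [S2] tags.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Returns a list of human-readable problems; an empty list means the text passed both checks.
static List<string> CheckDialogueText(string text)
{
    var problems = new List<string>();
    if (!text.TrimStart().StartsWith("[S1]", StringComparison.Ordinal))
        problems.Add("Text should begin with the [S1] speaker tag.");

    string previousTag = "";
    foreach (Match match in Regex.Matches(text, @"\[S[12]\]"))
    {
        if (match.Value == previousTag)
            problems.Add($"Speaker tag {match.Value} repeats consecutively.");
        previousTag = match.Value;
    }
    return problems;
}

// Flags both the missing leading [S1] and the repeated [S2].
foreach (string problem in CheckDialogueText("[S2] Hi. [S2] Again."))
    Console.WriteLine(problem);
```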

Non-Verbal Communications: Dia supports various non-verbal tags. Some work more consistently than others (laughs, chuckles), but be prepared for occasional unexpected output from some tags (sneezes, applause, coughs ...)

var textWithNonVerbals = "[S1] I can't believe it! (gasps) [S2] That's amazing! (laughs)";

Supported non-verbals: (laughs), (clears throat), (sighs), (gasps), (coughs), (singing), (sings), (mumbles), (beep), (groans), (sniffs), (claps), (screams), (inhales), (exhales), (applause), (burps), (humming), (sneezes), (chuckle), (whistles)
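Because tags outside this list may be read aloud or ignored, it can be worth scanning input text for unsupported ones first. A minimal sketch, assuming tags are always written in parentheses as listed above; FindUnsupportedNonVerbals is a hypothetical helper, and it will also flag ordinary parenthetical remarks.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Returns every parenthesized tag in the text that is not in the supported list above.
static List<string> FindUnsupportedNonVerbals(string text)
{
    var supported = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
    {
        "laughs", "clears throat", "sighs", "gasps", "coughs", "singing", "sings",
        "mumbles", "beep", "groans", "sniffs", "claps", "screams", "inhales",
        "exhales", "applause", "burps", "humming", "sneezes", "chuckle", "whistles"
    };

    var unknown = new List<string>();
    foreach (Match match in Regex.Matches(text, @"\(([^()]+)\)"))
    {
        string tag = match.Groups[1].Value.Trim();
        if (!supported.Contains(tag)) unknown.Add(tag);
    }
    return unknown;
}

// The list contains "chuckle", so the plural form is reported.
Console.WriteLine(string.Join(", ", FindUnsupportedNonVerbals(
    "[S1] I can't believe it! (gasps) [S2] That's amazing! (chuckles)"))); // chuckles
```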

Voice Cloning Best Practices:

Provide 5-10 seconds of reference audio for optimal results
Include the transcript of the reference audio before your generation text
Use correct speaker tags in the reference transcript
For duration estimation, assume approximately 86 tokens per second of audio

// Voice cloning example with transcript
var referenceTranscript = "[S1] This is the reference voice speaking clearly.";
var newText = "[S1] Now I will say something completely different.";
var clonedOutput = diaModel.Generate(
 text: referenceTranscript + " " + newText,
 audioPromptPath: "reference.wav");
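The 86-tokens-per-second guideline can be turned into a quick calculation when choosing maxTokens. The helpers below are a hypothetical sketch in plain C#, not library functions, and the rate is only the approximate figure quoted above.

```csharp
using System;

// Tokens needed for a target duration, rounded up.
static int TokensForSeconds(double seconds, double tokensPerSecond = 86.0) =>
    (int)Math.Ceiling(seconds * tokensPerSecond);

// Approximate audio duration produced by a given token budget.
static double SecondsForTokens(int tokens, double tokensPerSecond = 86.0) =>
    tokens / tokensPerSecond;

Console.WriteLine(TokensForSeconds(10)); // 860
Console.WriteLine(TokensForSeconds(20)); // 1720
Console.WriteLine(SecondsForTokens(1000) > 11.0 && SecondsForTokens(1000) < 12.0); // True
```

By this estimate, the maxTokens: 1000 used in the earlier examples corresponds to a little under 12 seconds of audio, which sits inside the recommended 10-20 second range.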

Example

Check out the Example project for a complete implementation, including:

Model loading and configuration
Audio processing workflows
Command-line interface implementation
Audio Visualization

The example includes tools for visualizing and comparing audio spectrograms:

Audio before and after compression with DAC Codec 24kHz (image: Docs/Images/spectrogram_DAC_24k.png)

Acknowledgments

SNAC - hubertsiuzdak's original Python implementation
Descript Audio Codec - Descript's original Python implementation
Encodec - Meta's original Python implementation
Dia - Nari Labs' original Python implementation

Contributing

Suggestions and contributions are welcome! Here's how you can help:

Ways to Contribute

Bug Reports: Submit issues with reproduction steps
Feature Requests: Propose new codec implementations or features
Code Contributions: Submit pull requests with improvements
Documentation: Help improve examples and documentation
Testing: Test with different models and platforms

License

This project is licensed under the Apache-2.0 License; see the LICENSE file for more information.
This project uses libraries under several different licenses; see THIRD-PARTY-NOTICES for more information.

Product	Versions Compatible and additional computed target framework versions.
.NET	Compatible: net8.0. Computed: net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows.

net8.0
- NAudio (>= 2.2.1)
- TorchAudio (>= 0.105.0)
- TorchSharp (>= 0.105.0)
- TorchSharp.PyBridge (>= 1.4.3)
- TorchSharp-cuda-windows (>= 0.105.0)

Version	Downloads	Last Updated
0.4.0	517	6/10/2025
0.3.1	330	4/9/2025
0.2.0	214	2/12/2025
0.1.5	202	2/1/2025
0.1.4	201	2/1/2025
0.1.3	209	11/27/2024
0.1.1	238	11/23/2024
0.1.0	193	11/21/2024

URL: https://www.nuget.org/packages/NeuralCodecs/
