
URL: https://deepwiki.com/SciSharp/LLamaSharp/6-configuration-reference

Configuration Reference | SciSharp/LLamaSharp | DeepWiki


Last indexed: 18 May 2026 (ecd184)

Configuration Reference

This document provides a comprehensive reference for the configuration system in LLamaSharp. Configuration parameters control model loading, context initialization, and inference behavior. For detailed parameter listings, see child pages: Model Parameters (IModelParams), Context Parameters (IContextParams), and Inference Parameters.

Configuration Architecture

LLamaSharp's configuration system is organized into three distinct parameter categories, each controlling a different aspect of the inference pipeline:

| Parameter Type | Interface | Primary Implementation | Controls |
|---|---|---|---|
| Model Parameters | IModelParams | ModelParams | Model loading, GPU offloading, memory management |
| Context Parameters | IContextParams | Inherited by ILLamaParams | Context initialization, batch sizes, RoPE configuration |
| Inference Parameters | IInferenceParams | InferenceParams | Token selection strategy, max tokens, antiprompts, overflow strategy |

The configuration flow proceeds through distinct stages, bridging managed C# objects to native C++ structures:

[Diagram: Configuration Flow: Managed to Native]


Sources: LLama/Common/ModelParams.cs:13-158, LLama/Abstractions/IModelParams.cs:15-91, LLama/Abstractions/IContextParams.cs:9-146, LLama/Extensions/IContextParamsExtensions.cs:21-71

Parameter Inheritance and Usage

The ModelParams class implements ILLamaParams, which acts as a unified container for both model-loading and context-initialization settings (LLama/Common/ModelParams.cs:13-15). High-level APIs such as LLama.Web frequently use this interface to keep configuration consistent across the lifecycle of a model.
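As a minimal sketch of the unified-container idea, the snippet below passes one ModelParams instance to both the model-loading and the context-creation step. The model path is a hypothetical placeholder, and the calls (LLamaWeights.LoadFromFile, CreateContext) reflect the LLamaSharp API as commonly documented; signatures may differ between versions.

```csharp
using System;
using System.IO;
using LLama;
using LLama.Common;

// Hypothetical path; point this at a real GGUF file to run the loading step.
const string modelPath = "models/example.gguf";

// One object carries both model-loading and context-initialization settings.
var parameters = new ModelParams(modelPath);

if (File.Exists(modelPath))
{
    // LoadFromFile reads the IModelParams members of the object,
    // CreateContext reads its IContextParams members.
    using var weights = LLamaWeights.LoadFromFile(parameters);
    using var context = weights.CreateContext(parameters);
    Console.WriteLine($"Context size: {context.ContextSize}");
}
else
{
    Console.WriteLine("Model file not found; skipping the load step.");
}
```

Reusing the same instance for both calls is what keeps model-side and context-side settings from drifting apart.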

[Diagram: Code Entity Relationship Diagram]


Sources: LLama/Common/ModelParams.cs:13-14, LLama.Web/Common/ModelOptions.cs:7-9, LLama/Abstractions/IModelParams.cs:15-16, LLama/Abstractions/IContextParams.cs:9-10, LLama/Common/InferenceParams.cs:12-13, LLama/Abstractions/IInferenceParams.cs:10-11

Model Parameters Overview

Model parameters control how GGUF model files are loaded into memory and distributed across hardware.
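A short sketch of the model-side settings, viewed through the IModelParams interface. The path and the values are illustrative assumptions, and the property names (GpuLayerCount, MainGpu, UseMemorymap, UseMemoryLock) are taken from the LLamaSharp API as commonly documented, so check them against the version in use.

```csharp
using LLama.Abstractions;
using LLama.Common;

// Hypothetical path and illustrative values; tune these for the target hardware.
IModelParams modelSide = new ModelParams("models/example.gguf")
{
    GpuLayerCount = 20,    // number of layers to offload to the GPU
    MainGpu = 0,           // index of the primary GPU
    UseMemorymap = true,   // memory-map the model file where possible
    UseMemoryLock = false  // do not lock the model in RAM
};
```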

For details, see Model Parameters (IModelParams).

Sources: LLama/Common/ModelParams.cs:1-158, LLama/Abstractions/IModelParams.cs:15-91

Context Parameters Overview

Context parameters configure the runtime environment for a loaded model, specifically the KV cache and processing limits.
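The context-side settings can be sketched the same way through IContextParams. The values are illustrative assumptions, and the property names (ContextSize, BatchSize, Threads) follow the LLamaSharp API as commonly documented; they may vary by version.

```csharp
using LLama.Abstractions;
using LLama.Common;

// Hypothetical path and illustrative values.
IContextParams contextSide = new ModelParams("models/example.gguf")
{
    ContextSize = 4096,  // token capacity of the context, which sizes the KV cache
    BatchSize = 512,     // maximum tokens submitted per processing batch
    Threads = 8          // CPU threads used during generation
};
```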

For details, see Context Parameters (IContextParams).

Sources: LLama/Abstractions/IContextParams.cs:1-146, LLama/Extensions/IContextParamsExtensions.cs:21-71

Inference Parameters Overview

Inference parameters are passed during generation to control token selection, stopping criteria, and context window management.
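The sketch below builds a per-call InferenceParams object and hands it to an executor. The values are illustrative, PrintReplyAsync is a hypothetical helper, and the member names (MaxTokens, AntiPrompts, SamplingPipeline, DefaultSamplingPipeline.Temperature, ILLamaExecutor.InferAsync) follow the LLamaSharp API as commonly documented. The overflow strategy is left out because its property name is not given here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using LLama.Abstractions;
using LLama.Common;
using LLama.Sampling;

// Per-call generation settings; values are illustrative only.
var inferenceParams = new InferenceParams
{
    MaxTokens = 128,                             // stopping criterion: cap on new tokens
    AntiPrompts = new List<string> { "User:" },  // stopping criterion: halt on this text
    SamplingPipeline = new DefaultSamplingPipeline
    {
        Temperature = 0.7f                       // token selection setting
    }
};

// Hypothetical helper: streams a reply from any executor using the given settings.
static async Task PrintReplyAsync(ILLamaExecutor executor, string prompt, IInferenceParams settings)
{
    await foreach (var piece in executor.InferAsync(prompt, settings))
    {
        Console.Write(piece);
    }
}
```

Because the settings are supplied per call, the same loaded model and context can serve requests with different limits or sampling behavior.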

For details, see Inference Parameters.

Sources: LLama/Common/InferenceParams.cs:12-51, LLama/Abstractions/IInferenceParams.cs:10-54

Configuration Serialization

LLamaSharp supports JSON serialization for ModelParams to allow persistence of configuration states.

  • Encoding: Because System.Text.Encoding cannot be serialized directly, LLamaSharp stores the EncodingName string and uses it to restore the Encoding object (LLama/Common/ModelParams.cs:128-141).
  • Complex Types: TensorSplitsCollection and MetadataOverride use custom JSON converters to handle data structures that map to native llama.cpp requirements (LLama/Abstractions/IModelParams.cs:96-192).
  • Round-tripping: Unit tests verify that parameters such as BatchSize, ContextSize, and MetadataOverrides are preserved through System.Text.Json serialization cycles (LLama.Unittest/ModelsParamsTests.cs:10-55).
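The encoding-name pattern from the first bullet can be sketched with System.Text.Json alone. SketchParams is a hypothetical stand-in written for this illustration, not LLamaSharp's actual ModelParams, whose attributes and member visibility may differ.

```csharp
using System;
using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;

var original = new SketchParams { ContextSize = 2048, Encoding = Encoding.ASCII };

// Only the encoding's name is written to JSON, never the Encoding object itself.
string json = JsonSerializer.Serialize(original);
var restored = JsonSerializer.Deserialize<SketchParams>(json)!;

Console.WriteLine(json);
Console.WriteLine(restored.Encoding.WebName);

// Hypothetical parameter class used only for this sketch.
public sealed class SketchParams
{
    public uint? ContextSize { get; set; }

    // Serialized in place of the Encoding object.
    public string EncodingName { get; set; } = Encoding.UTF8.WebName;

    // Rebuilt on demand from the stored name, so the serializer skips it.
    [JsonIgnore]
    public Encoding Encoding
    {
        get => Encoding.GetEncoding(EncodingName);
        set => EncodingName = value.WebName;
    }
}
```

Storing the name keeps the JSON small and readable, at the cost of requiring that the named encoding be available on the machine that deserializes it.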

Sources: LLama/Common/ModelParams.cs:128-141, LLama/Abstractions/IModelParams.cs:170-185, LLama.Unittest/ModelsParamsTests.cs:10-55