URL: https://deepwiki.com/SciSharp/LLamaSharp/1.2-installation-and-setup

Last indexed: 18 May 2026 (ecd184)

Installation and Setup

This page guides you through installing LLamaSharp and its dependencies, selecting the appropriate backend for your hardware, and configuring your project for first use. For information about using LLamaSharp APIs after installation, see the Quick Start Guide (1.3). For details on the package architecture, see Package Architecture (1.1).


Prerequisites

LLamaSharp requires one of the following .NET target frameworks:

Hardware Considerations:

  • Minimum: x64 or ARM64 CPU (LLama/LLamaSharp.csproj:7)
  • Recommended: GPU with CUDA 11/12 or Vulkan support for accelerated inference (README.md:101-103)
  • RAM: Varies by model size (at least 4 GB for quantized 7B models, 16 GB or more for larger models).
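
As a rough sanity check on the RAM figures above, the size of the model weights can be approximated as parameter count times bits per weight. The sketch below is back-of-the-envelope arithmetic only: the helper name is hypothetical, and the bits-per-weight value you pass in is an assumption about the quantization. Actual memory use is higher because the KV cache and runtime buffers are not counted.

```csharp
using System;

static class ModelMemory
{
    // Approximate size of the model weights in GiB.
    // Ignores the KV cache, activations and any runtime overhead.
    public static double EstimateWeightsGiB(double parameterCount, double bitsPerWeight)
    {
        double bytes = parameterCount * bitsPerWeight / 8.0;
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }
}
```

For example, ModelMemory.EstimateWeightsGiB(7e9, 4.5) evaluates to roughly 3.7 GiB, assuming an average of 4.5 bits per weight for a 4-bit quantized 7B model. That is in line with the 4 GB minimum listed above.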

Sources: LLama/LLamaSharp.csproj:3-7, README.md:86-106


Package Installation Overview

LLamaSharp uses a modular package distribution strategy that separates the managed C# API from platform-specific native binaries. This design minimizes package size while supporting diverse hardware configurations.

Package Ecosystem Structure


Figure 1: LLamaSharp Package Dependencies

Sources: README.md:5-10, LLama/LLamaSharp.csproj:30, LLama.SemanticKernel/LLamaSharp.SemanticKernel.csproj:31, LLama.KernelMemory/LLamaSharp.KernelMemory.csproj:25


Step 1: Install Core Package

Install the LLamaSharp NuGet package, which contains the managed C# API:

Package Manager Console:

PM> Install-Package LLamaSharp

dotnet CLI:

dotnet add package LLamaSharp

.csproj Reference:

<ItemGroup>
  <PackageReference Include="LLamaSharp" Version="0.27.0" />
</ItemGroup>

This package provides all managed types, including LLamaWeights, LLamaContext, the executors, and the sampling APIs. The current version is 0.27.0, based on llama.cpp commit 3f7c29d318e317b63f54c558bc69803963d7d88c (LLama/LLamaSharp.csproj:10-25).

Sources: LLama/LLamaSharp.csproj:10-30, README.md:92-96


Step 2: Select and Install Backend Package

Backend packages contain compiled native libraries (.dll, .so, .dylib) for specific hardware configurations. Install at least one backend package that matches your target platform and acceleration requirements (README.md:88-90).

Backend Selection Matrix

Backend Package           | Platforms                                           | Hardware Acceleration                       | Use Case
--------------------------|-----------------------------------------------------|---------------------------------------------|---------------------------------
LLamaSharp.Backend.Cpu    | Windows x64/ARM64, Linux x64/ARM64, macOS x64/ARM64 | CPU (AVX/AVX2/AVX512), Metal (macOS ARM64)  | General CPU inference, macOS GPU
LLamaSharp.Backend.Cuda11 | Windows x64, Linux x64                              | NVIDIA CUDA 11.x                            | NVIDIA GPUs with CUDA 11
LLamaSharp.Backend.Cuda12 | Windows x64, Linux x64                              | NVIDIA CUDA 12.x                            | NVIDIA GPUs with CUDA 12
LLamaSharp.Backend.Vulkan | Windows x64, Linux x64                              | Vulkan                                      | Cross-vendor GPU support

Sources: README.md:98-103


Native Library Configuration

Automatic Selection and Loading

LLamaSharp manages native library loading through NativeLibraryConfig. By default, it tries to select the best available backend based on system capabilities such as the supported AVX level and GPU availability (LLama/Native/Load/NativeLibraryConfig.cs:10-15).
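
To make "AVX level" concrete, the following sketch reports the highest AVX tier the current CPU supports, using the hardware intrinsics classes that ship with .NET 8. It is an illustration of the kind of capability check involved, not LLamaSharp's own detection code, and the CpuFeatures helper is a hypothetical name.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

static class CpuFeatures
{
    // Returns the highest AVX tier supported by the current CPU.
    // On non-x86 hardware (for example ARM64) every check is false,
    // so the method returns "None".
    public static string DescribeAvxLevel()
    {
        if (Avx512F.IsSupported) return "AVX512";
        if (Avx2.IsSupported) return "AVX2";
        if (Avx.IsSupported) return "AVX";
        return "None";
    }
}
```

Calling Console.WriteLine(CpuFeatures.DescribeAvxLevel()) on the target machine can help when deciding whether a CPU-only deployment will get the faster AVX2 or AVX512 builds.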


Figure 2: Native Library Selection Logic

Sources: LLama/Native/Load/NativeLibraryUtils.cs:15-36, LLama/Native/Load/NativeLibraryConfig.cs:150-161

Manual Configuration

You can override the automatic selection by calling NativeLibraryConfig methods before any model is loaded (LLama/Native/Load/NativeLibraryConfig.cs:10-15).
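
A minimal sketch of such an override is shown below. It assumes the fluent NativeLibraryConfig.All API and the NativeApi.llama_empty_call helper as found in recent LLamaSharp releases; method names and signatures may differ in your installed version, so treat this as a starting point and check it against the cited source file.

```csharp
using System;
using LLama.Native;

// Configure native loading first, before any other LLamaSharp call.
NativeLibraryConfig.All
    // Assumption: skips CUDA builds even if a CUDA backend package is installed.
    .WithCuda(false)
    // Print native log messages, which show which library was actually loaded.
    .WithLogCallback((level, message) => Console.Write($"{level}: {message}"));

// Optional: trigger native library loading now so that
// configuration problems surface early rather than at model load.
NativeApi.llama_empty_call();
```

The log output is usually the quickest way to confirm that the backend you expected was picked.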


Sources: LLama/Native/Load/NativeLibraryConfig.cs:38-101

Manual Dependency Loading

Because llama.cpp split its native binaries into several libraries, LLamaSharp loads dependencies such as ggml-base, ggml-cpu, and ggml-cuda manually to ensure compatibility across different runtime directories (LLama/Native/Load/NativeLibraryUtils.cs:48-52).

The loading sequence for dependencies is managed within NativeLibraryUtils.TryLoadLibrary (LLama/Native/Load/NativeLibraryUtils.cs:15):

  1. ggml-base: Always loaded from the current runtime directory (LLama/Native/Load/NativeLibraryUtils.cs:67).
  2. Platform backends: The backend-specific libraries, such as ggml-cpu and ggml-cuda.
  3. ggml: The main GGML entry point (LLama/Native/Load/NativeLibraryUtils.cs:117).
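
The ordering above can be illustrated with the standard .NET NativeLibrary API. The sketch below is not LLamaSharp's implementation: NativeDependencySketch is a hypothetical helper, and the file names you pass in are assumptions that depend on the platform (for example a lib prefix and .so extension on Linux).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

static class NativeDependencySketch
{
    // Tries to load each library found in `directory`, in the given order,
    // and returns the names of the libraries that loaded successfully.
    public static List<string> LoadInOrder(string directory, IEnumerable<string> fileNames)
    {
        var loaded = new List<string>();
        foreach (var fileName in fileNames)
        {
            string path = Path.Combine(directory, fileName);
            if (!File.Exists(path))
                continue; // e.g. no CUDA backend shipped in this directory

            if (NativeLibrary.TryLoad(path, out IntPtr _))
                loaded.Add(fileName);
        }
        return loaded;
    }
}
```

On Windows, a call might look like NativeDependencySketch.LoadInOrder(runtimeDir, new[] { "ggml-base.dll", "ggml-cpu.dll", "ggml.dll" }), where runtimeDir is whichever directory holds the backend binaries. Loading the base library first matters because the later libraries depend on its symbols.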

Figure 3: Native Dependency Loading Order

Sources: LLama/Native/Load/NativeLibraryUtils.cs:63-123


Step 3: (Optional) Install Integration Packages

Semantic Kernel Integration

For applications using Microsoft Semantic Kernel, install the integration package:

dotnet add package LLamaSharp.semantic-kernel

This package targets netstandard2.0 and net8.0 (LLama.SemanticKernel/LLamaSharp.SemanticKernel.csproj:4).

Sources: LLama.SemanticKernel/LLamaSharp.SemanticKernel.csproj:1-31

Kernel Memory Integration

For RAG (Retrieval Augmented Generation) scenarios using Microsoft Kernel Memory, install the integration package:

dotnet add package LLamaSharp.kernel-memory

This package targets net8.0 (LLama.KernelMemory/LLamaSharp.KernelMemory.csproj:4).

Sources: LLama.KernelMemory/LLamaSharp.KernelMemory.csproj:1-25


Model Preparation

LLamaSharp requires models in GGUF format (README.md:110).

Obtaining GGUF Models

Search Hugging Face for {model-name} gguf. Popular models like Llama-3.2 and Mistral are widely available in this format.
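
Downloads are sometimes truncated or turn out to be in another format, so a quick check before loading can save time. GGUF files begin with the four ASCII bytes "GGUF". The sketch below tests only that magic value and does not validate the rest of the file; GgufCheck is a hypothetical helper, not part of LLamaSharp.

```csharp
using System;
using System.IO;

static class GgufCheck
{
    // GGUF files begin with the four ASCII bytes 'G', 'G', 'U', 'F'.
    private static readonly byte[] Magic = { 0x47, 0x47, 0x55, 0x46 };

    public static bool LooksLikeGguf(string path)
    {
        var header = new byte[Magic.Length];
        using var stream = File.OpenRead(path);

        // Read until the header buffer is full or the file ends.
        int total = 0;
        while (total < header.Length)
        {
            int read = stream.Read(header, total, header.Length - total);
            if (read == 0) return false; // file is shorter than the magic
            total += read;
        }

        for (int i = 0; i < Magic.Length; i++)
        {
            if (header[i] != Magic[i]) return false;
        }
        return true;
    }
}
```

A false result from GgufCheck.LooksLikeGguf(modelPath) means the file is not GGUF at all. A true result only says the header looks right, so loading can still fail for other reasons.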

Basic Project Configuration

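LLamaSharp itself is compiled with AllowUnsafeBlocks enabled for its native interop code (LLama/LLamaSharp.csproj:8). Enable the same setting in your project if your own code uses unsafe blocks, for example when working directly with native pointers.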

Example Configuration in .csproj:

<PropertyGroup>
  <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
</PropertyGroup>

Sources: LLama/LLamaSharp.csproj:1-33