URL: https://deepwiki.com/SciSharp/LLamaSharp/3.4-chat-sessions

Last indexed: 18 May 2026 (ecd184)
Chat Sessions

This page documents the ChatSession high-level API, which composes an executor, conversation history, and text transform pipeline into a unified conversational interface. It covers how history is managed, how prompts are formatted for specific models using templates, how input and output are transformed, and how session state is serialized and restored.

For the underlying stateful executors that ChatSession wraps, see Stateful Executors. For inference parameters (temperature, anti-prompts, etc.) that are passed through to the executor, see Inference Parameters.


Overview

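ChatSession sits at the top of the LLamaSharp executor stack. It does not perform inference itself; instead, it orchestrates a stateful executor, the conversation history, and the input, output, and history transforms.

[Figure: Component Interaction Diagram]

Sources: LLama/ChatSession.cs:19-72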


ChatSession Construction

ChatSession requires an executor derived from StatefulExecutorBase. Although the constructors accept any ILLamaExecutor, passing an implementation that is not stateful (such as StatelessExecutor) throws an ArgumentException (LLama/ChatSession.cs:105-108).

Constructor / Factory | Description
--- | ---
ChatSession(ILLamaExecutor executor) | Basic construction with empty history (LLama/ChatSession.cs:102-111)
ChatSession(ILLamaExecutor executor, ChatHistory history) | Supplies a pre-built history (LLama/ChatSession.cs:118-122)
ChatSession.InitializeSessionFromHistoryAsync(...) | Async factory that also prefills the KV cache via PrefillPromptAsync (LLama/ChatSession.cs:81-96)

The static factory InitializeSessionFromHistoryAsync is useful when the model should have already "seen" a conversation before the first user turn. The history is evaluated into the KV cache up front, so the first call to ChatAsync does not have to reprocess it (LLama/ChatSession.cs:94).
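The two construction paths can be sketched as follows. This is a minimal sketch, assuming a local GGUF model at a hypothetical path (models/example.gguf) and InteractiveExecutor as the stateful executor; the parameter values are illustrative only.

```csharp
using LLama;
using LLama.Common;

// Hypothetical path: point this at any local GGUF model file.
var parameters = new ModelParams("models/example.gguf") { ContextSize = 2048 };

using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);

// ChatSession needs a stateful executor such as InteractiveExecutor.
var executor = new InteractiveExecutor(context);

var history = new ChatHistory();
history.AddMessage(AuthorRole.System, "You are a concise assistant.");
history.AddMessage(AuthorRole.User, "Hello.");
history.AddMessage(AuthorRole.Assistant, "Hi! How can I help?");

// Option 1: plain constructor. The history is stored in the session,
// but nothing is evaluated by the model at construction time.
// var session = new ChatSession(executor, history);

// Option 2: async factory. The history is stored and also prefilled
// into the KV cache before the first user turn.
ChatSession session = await ChatSession.InitializeSessionFromHistoryAsync(executor, history);
```

Only one of the two options should be used per executor, since each session drives the executor's state.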

Fluent configuration methods modify the session in-place and return this:

Method | Effect
--- | ---
WithHistoryTransform(IHistoryTransform) | Replaces the history-to-prompt serializer (LLama/ChatSession.cs:129-133)
AddInputTransform(ITextTransform) | Appends a transform to the input pipeline (LLama/ChatSession.cs:140-144)
WithOutputTransform(ITextStreamTransform) | Replaces the output stream transform (LLama/ChatSession.cs:150-155)
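Because each fluent method returns the session, the calls can be chained. The sketch below assumes an already-constructed InteractiveExecutor; SessionSetup and Configure are hypothetical names used only for illustration, and the keyword list and redundancyLength value are example settings.

```csharp
using LLama;
using LLama.Common;

public static class SessionSetup
{
    // Hypothetical helper: builds a session and configures it fluently.
    public static ChatSession Configure(InteractiveExecutor executor)
    {
        return new ChatSession(executor)
            // Replace the history-to-prompt serializer.
            .WithHistoryTransform(new LLamaTransforms.DefaultHistoryTransform())
            // Replace the output transform: strip role markers from streamed text.
            .WithOutputTransform(new LLamaTransforms.KeywordTextOutputStreamTransform(
                new[] { "User:", "Assistant:" },
                redundancyLength: 8));
    }
}
```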

Sources: LLama/ChatSession.cs:74-155


ChatHistory and AuthorRole

ChatHistory holds an ordered list of ChatHistory.Message objects, each with an AuthorRole and a Content string (LLama/Common/ChatHistory.cs:38-76).

[Figure: Data Structure Space to Code Entity Space]

Sources: LLama/Common/ChatHistory.cs:9-76

AuthorRole identifies the source of a message. Its values include System (0), User (1), and Assistant (2) (LLama/Common/ChatHistory.cs:11-32).
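As a small illustration, a history can be built and inspected without loading a model. The message contents here are invented for the example.

```csharp
using System;
using LLama.Common;

var history = new ChatHistory();
history.AddMessage(AuthorRole.System, "You are a helpful assistant.");
history.AddMessage(AuthorRole.User, "What is a KV cache?");
history.AddMessage(AuthorRole.Assistant, "It stores attention keys and values for tokens already processed.");

foreach (ChatHistory.Message message in history.Messages)
{
    // The cast exposes the numeric value of the role, e.g. 1 for User.
    Console.WriteLine($"{message.AuthorRole} ({(int)message.AuthorRole}): {message.Content}");
}
```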


Prompt Templating and History Transformation

Modern instruction-tuned models require model-specific formatting to distinguish roles. ChatML-style models, for example, wrap each turn as <|im_start|>user\n...<|im_end|>, while Llama 3 and Mistral each use their own markers. LLamaSharp provides history transforms to produce the format a given model expects.

[Figure: Prompt Generation Pipeline]

Sources: LLama/LLamaTransforms.cs:66-89, LLama/ChatSession.cs:94, LLama/Abstractions/IHistoryTransform.cs:17
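To make the idea concrete, here is a minimal sketch of a custom history transform that emits ChatML-style markup. ChatMlHistoryTransform is a hypothetical class, not part of LLamaSharp, and the sketch assumes that IHistoryTransform declares HistoryToText, TextToHistory, and Clone; check the interface in your LLamaSharp version before relying on it.

```csharp
using System.Text;
using LLama.Abstractions;
using LLama.Common;

// Hypothetical ChatML-style formatter, for illustration only.
public sealed class ChatMlHistoryTransform : IHistoryTransform
{
    public string HistoryToText(ChatHistory history)
    {
        var sb = new StringBuilder();
        foreach (var message in history.Messages)
        {
            // "System" -> "system", "User" -> "user", "Assistant" -> "assistant".
            var role = message.AuthorRole.ToString().ToLowerInvariant();
            sb.Append("<|im_start|>").Append(role).Append('\n')
              .Append(message.Content).Append("<|im_end|>\n");
        }

        // Leave an open assistant turn for the model to complete.
        sb.Append("<|im_start|>assistant\n");
        return sb.ToString();
    }

    public ChatHistory TextToHistory(AuthorRole role, string text)
    {
        // Wrap the generated text back into a single-message history.
        var history = new ChatHistory();
        history.AddMessage(role, text.Trim());
        return history;
    }

    public IHistoryTransform Clone() => new ChatMlHistoryTransform();
}
```

A transform like this would be attached with session.WithHistoryTransform(new ChatMlHistoryTransform()). It is only appropriate for models that were actually trained on ChatML-style markup.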


Performing Inference

ChatAsync

ChatAsync is the primary entry point for a single conversational turn. It takes a new ChatHistory.Message, appends it to the history, and returns an IAsyncEnumerable<string> that streams the generated text as it is produced (LLama/ChatSession.cs:256-271).
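A single turn might look like the sketch below. ChatTurn and RunAsync are hypothetical names, the session is assumed to be already constructed, and the MaxTokens and AntiPrompts values are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using LLama;
using LLama.Common;

public static class ChatTurn
{
    // Hypothetical helper: runs one user turn and echoes the streamed reply.
    public static async Task RunAsync(ChatSession session, string userInput)
    {
        var inferenceParams = new InferenceParams
        {
            MaxTokens = 256,                             // cap on generated tokens
            AntiPrompts = new List<string> { "User:" }   // stop when the model starts the next user turn
        };

        var message = new ChatHistory.Message(AuthorRole.User, userInput);

        // Each item is a piece of generated text, written as soon as it arrives.
        await foreach (string piece in session.ChatAsync(message, inferenceParams))
        {
            Console.Write(piece);
        }
    }
}
```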

Regenerating the Last Response

RegenerateAssistantMessageAsync lets the model attempt a different response to the same prompt. It removes the last assistant message from the history and re-runs inference (LLama/ChatSession.cs:280-292).
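A hedged sketch of a retry, assuming the method accepts an InferenceParams instance and streams text in the same way as ChatAsync. RetryTurn and RetryAsync are hypothetical names, and the session is assumed to have just completed a turn, so the last message in its history is an assistant message.

```csharp
using System;
using System.Threading.Tasks;
using LLama;
using LLama.Common;

public static class RetryTurn
{
    // Hypothetical helper: discards the last assistant reply and streams a new one.
    public static async Task RetryAsync(ChatSession session, InferenceParams inferenceParams)
    {
        await foreach (string piece in session.RegenerateAssistantMessageAsync(inferenceParams))
        {
            Console.Write(piece);
        }
    }
}
```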

Manual Context Seeding

AddAndProcessUserMessage and AddAndProcessAssistantMessage add a message to the session and process it into the KV cache without triggering a generation cycle (LLama/ChatSession.cs:338-356).
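Seeding a session could be sketched as follows, assuming both methods take the message content as a string and return an awaitable task. ContextSeeding and SeedAsync are hypothetical names, and the message contents are invented.

```csharp
using System.Threading.Tasks;
using LLama;

public static class ContextSeeding
{
    // Hypothetical helper: injects an exchange the model never actually generated.
    public static async Task SeedAsync(ChatSession session)
    {
        // Each call adds the message to the history and evaluates it into the KV cache.
        await session.AddAndProcessUserMessage("My name is Ada.");
        await session.AddAndProcessAssistantMessage("Nice to meet you, Ada.");
    }
}
```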

Sources: LLama/ChatSession.cs:256-356


Transform Interfaces

IHistoryTransform

Converts between ChatHistory and a raw prompt string.

ITextTransform

Synchronously transforms a single input string. Used in InputTransformPipeline (LLama/ChatSession.cs:66).
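A custom input transform can be very small. The sketch below assumes ITextTransform declares a Transform method and a Clone method; TrimInputTransform is a hypothetical class, not part of LLamaSharp.

```csharp
using LLama.Abstractions;

// Hypothetical input transform: strips surrounding whitespace from user input.
public sealed class TrimInputTransform : ITextTransform
{
    public string Transform(string text) => text.Trim();

    // The transform holds no state, so a fresh instance is an adequate copy.
    public ITextTransform Clone() => new TrimInputTransform();
}
```

It would be registered with session.AddInputTransform(new TrimInputTransform()), after which it runs on each input message before the prompt is built.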

ITextStreamTransform

Asynchronously transforms the stream of strings produced during inference.

  • KeywordTextOutputStreamTransform: Removes specified keywords (like "User:" or "Assistant:") from the output, so that role markers emitted when the model starts to hallucinate the next turn do not appear in the response (LLama/LLamaTransforms.cs:164-188).
  • EmptyTextOutputStreamTransform: A no-op transform that returns the stream unchanged (LLama/LLamaTransforms.cs:145-159).

Sources: LLama/LLamaTransforms.cs:20-188, LLama/Abstractions/ITextTransform.cs:1-31, LLama/Abstractions/ITextStreamTransform.cs:1-26


Session State Serialization

ChatSession supports full persistence of the conversation, including the model's KV cache and executor state.

SessionState Files

SaveSession(string path) creates the following files in the target directory (LLama/ChatSession.cs:26-46):

Constant | Filename | Description
--- | --- | ---
MODEL_STATE_FILENAME | ModelState.st | Binary KV cache / context state (LLama/ChatSession.cs:26)
EXECUTOR_STATE_FILENAME | ExecutorState.json | Internal executor counters and state (LLama/ChatSession.cs:30)
HISTORY_STATE_FILENAME | ChatHistory.json | The JSON-serialized ChatHistory (LLama/ChatSession.cs:34)
INPUT_TRANSFORM_FILENAME | InputTransform.json | Serialized input pipeline (LLama/ChatSession.cs:38)
OUTPUT_TRANSFORM_FILENAME | OutputTransform.json | Serialized output transform (LLama/ChatSession.cs:42)
HISTORY_TRANSFORM_FILENAME | HistoryTransform.json | Serialized history formatter (LLama/ChatSession.cs:46)

State Management Methods
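Saving and restoring can be sketched as below. SessionPersistence and its methods are hypothetical names; the sketch assumes that LoadSession accepts either a directory path or an in-memory SessionState obtained from GetSessionState, in line with the cited example programs.

```csharp
using LLama;

public static class SessionPersistence
{
    // Hypothetical helper: persist the session to disk, then restore it later.
    public static void SaveThenReload(ChatSession session, string directory)
    {
        // Writes the state files into the given directory.
        session.SaveSession(directory);

        // ... the conversation may continue here ...

        // Rolls the session back to what was saved.
        session.LoadSession(directory);
    }

    // Hypothetical helper: take an in-memory snapshot and restore it, without touching disk.
    public static void SnapshotAndReset(ChatSession session)
    {
        SessionState snapshot = session.GetSessionState();

        // ... run some turns that should later be discarded ...

        session.LoadSession(snapshot);
    }
}
```

The in-memory variant suits a "restart conversation" feature, while the directory variant suits resuming a conversation across process restarts.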

Sources: LLama/ChatSession.cs:163-224, LLama.Examples/Examples/ChatSessionWithHistory.cs:65-77, LLama.Examples/Examples/ChatSessionWithRestart.cs:27-30