
URL: https://deepwiki.com/inclusionAI/AReaL/11.1-request-and-response-types

Last indexed: 7 May 2026 (2e12c1)

Request and Response Types

This page documents the core data structures used for communication between inference engines and workflows in AReaL. These types define the contract for model inference requests and responses, supporting both text-only and vision-language models (VLM).

Scope: This page focuses on the request/response structures themselves: their fields, methods, and data formats. For information about how these structures are used in workflows, see Workflow and Rollout System. For HTTP-based communication protocols with remote inference servers, see SGLang Backend and vLLM Backend.


Overview

AReaL's inference system uses strongly-typed dataclasses to represent inference requests and responses. The primary types are:

| Type | Purpose | Key Fields |
| --- | --- | --- |
| ModelRequest | Inference request sent to engines | input_ids, gconfig, image_data |
| ModelResponse | Inference response from engines | output_tokens, output_logprobs, stop_reason |
| WeightUpdateMeta | Metadata for weight synchronization | type, path, use_lora, version |
| HttpRequest | HTTP request wrapper for remote servers | endpoint, payload, method |
| HttpGenerationResult | Parsed HTTP generation response | output_tokens, output_logprobs |

These types are defined in areal/api/io_struct.py:27-376 and serve as the foundation for all inference operations in rollout workflows.

Sources: areal/api/io_struct.py:27-376


ModelRequest Structure

Request Flow Diagram


Sources: areal/api/io_struct.py:28-59, areal/engine/sglang_remote.py:40-41, areal/engine/vllm_remote.py:41-42, areal/api/workflow_api.py:14-18

Field Reference

The ModelRequest dataclass is defined at areal/api/io_struct.py:28-59:

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| rid | str | UUID | Unique request identifier for tracking (areal/api/io_struct.py:29) |
| input_ids | list[int] | [] | Tokenized input sequence (areal/api/io_struct.py:30) |
| gconfig | GenerationHyperparameters | default instance | Sampling parameters (areal/api/io_struct.py:31-33) |
| metadata | dict[str, Any] | {} | Arbitrary metadata passed through workflow (areal/api/io_struct.py:34) |
| tokenizer | PreTrainedTokenizerFast | None | Tokenizer for encode/decode operations (areal/api/io_struct.py:36) |
| image_data | list[str] | [] | Base64-encoded images, VLM only (areal/api/io_struct.py:39) |
| processor | AutoProcessor | None | Processor for multi-modal inputs (areal/api/io_struct.py:40) |
| vision_msg_vllm | list | None | vLLM-specific vision message format (areal/api/io_struct.py:43) |
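
The field table can be read as a dataclass with per-instance defaults. The following is a minimal sketch, not the definition in areal/api/io_struct.py: GenerationConfigSketch and its default values are hypothetical placeholders for GenerationHyperparameters, and the tokenizer and processor types are loosened to Any so the snippet runs without external packages.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class GenerationConfigSketch:
    """Hypothetical stand-in for GenerationHyperparameters; values are illustrative."""
    temperature: float = 1.0
    top_p: float = 1.0
    n_samples: int = 1
    max_new_tokens: int = 16


@dataclass
class ModelRequestSketch:
    """Simplified mirror of the ModelRequest field table (not the real class)."""
    # default_factory gives every request its own UUID and its own containers.
    rid: str = field(default_factory=lambda: str(uuid.uuid4()))
    input_ids: list[int] = field(default_factory=list)
    gconfig: GenerationConfigSketch = field(default_factory=GenerationConfigSketch)
    metadata: dict[str, Any] = field(default_factory=dict)
    tokenizer: Any = None
    image_data: list[str] = field(default_factory=list)
    processor: Any = None
    vision_msg_vllm: Optional[list] = None


req = ModelRequestSketch(
    input_ids=[101, 2023, 102],
    gconfig=GenerationConfigSketch(max_new_tokens=32),
)
other = ModelRequestSketch()
```

Using default_factory rather than a shared literal keeps two requests from accidentally sharing the same list or dict.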

Generation Configuration: The gconfig field references GenerationHyperparameters, which includes sampling parameters such as temperature, top_p, n_samples, and max_new_tokens (areal/api/cli_args.py:17). These are typically configured in YAML files under the gconfig block (examples/math/gsm8k_grpo_lora.yaml:36-42).

Vision-Language Model Support: The image_data and vision_msg_vllm fields enable multi-modal inference. SGLang uses image_data directly in its payload (areal/engine/sglang_remote.py:71), while vLLM constructs chat messages from vision_msg_vllm and image_data, automatically detecting MIME types (areal/engine/vllm_remote.py:74-91).
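
To illustrate what MIME detection for base64 images can look like, here is a rough sketch. It is not the code in areal/engine/vllm_remote.py: the helper names, the magic-byte checks, the JPEG fallback, and the OpenAI-style image_url content part are all assumptions made for this example.

```python
import base64


def guess_image_mime(b64_image: str) -> str:
    """Guess a MIME type from the magic bytes of a base64-encoded image."""
    header = base64.b64decode(b64_image)[:12]
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if header.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    return "image/jpeg"  # assumed fallback, chosen only for illustration


def build_image_parts(image_data: list[str]) -> list[dict]:
    """Wrap each base64 image in a data URL inside an image_url content part."""
    return [
        {
            "type": "image_url",
            "image_url": {"url": f"data:{guess_image_mime(img)};base64,{img}"},
        }
        for img in image_data
    ]


# A fake PNG: the 8-byte signature followed by padding bytes.
png_b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8).decode()
parts = build_image_parts([png_b64])
```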

Sources: areal/api/io_struct.py:28-59, areal/engine/sglang_remote.py:71, areal/engine/vllm_remote.py:74-91, examples/math/gsm8k_grpo_lora.yaml:36-42

Request Methods

copy()

Creates a deep copy of the request with independent field values (areal/api/io_struct.py:45-59).


This method is used when workflows need to create multiple similar requests (e.g., for sampling multiple responses with n_samples > 1).
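
A minimal sketch of such a copy method follows, using a reduced, hypothetical stand-in class rather than the real ModelRequest. Sharing the tokenizer by reference instead of duplicating it is an assumption of this sketch, not something the source states.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RequestSketch:
    """Hypothetical, reduced stand-in for ModelRequest."""
    rid: str = "req-0"
    input_ids: list[int] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    tokenizer: Any = None

    def copy(self) -> "RequestSketch":
        # Mutable containers are duplicated so edits to the copy do not
        # leak back into the original. The tokenizer is shared by reference
        # (an assumption of this sketch).
        return RequestSketch(
            rid=self.rid,
            input_ids=list(self.input_ids),
            metadata=copy.deepcopy(self.metadata),
            tokenizer=self.tokenizer,
        )


base = RequestSketch(input_ids=[1, 2, 3], metadata={"tags": ["a"]})
clone = base.copy()
clone.input_ids.append(4)
clone.metadata["tags"].append("b")
```

After the two mutations, base still holds its original input_ids and metadata, which is the property a workflow relies on when fanning one prompt out into several requests.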

Sources: areal/api/io_struct.py:45-59


ModelResponse Structure

Response Flow Diagram


Sources: areal/api/io_struct.py:63-131, areal/engine/sglang_remote.py:91-127, areal/engine/vllm_remote.py:98-127, areal/api/workflow_api.py:14-18

Field Reference

The ModelResponse dataclass is defined at areal/api/io_struct.py:63-131:

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| input_tokens | list[int] | [] | Echo of input token IDs (areal/api/io_struct.py:65) |
| output_tokens | list[int] | [] | Generated output token IDs (areal/api/io_struct.py:66) |
| output_logprobs | list[float] | [] | Log probabilities for each token (areal/api/io_struct.py:67) |
| output_versions | list[int] | [] | Model version used for generation (areal/api/io_struct.py:68) |
| stop_reason | Literal | "stop" | Termination reason (areal/api/io_struct.py:69) |
| tokenizer | PreTrainedTokenizerFast | None | Tokenizer for decode operations (areal/api/io_struct.py:71) |
| input_images | list[ImageObject or str] | [] | Input images, VLM only (areal/api/io_struct.py:74) |
| latency | float | inf | Total request latency (areal/api/io_struct.py:78) |
| ttft | float | inf | Time to first token (areal/api/io_struct.py:79) |
| itl | list[float] | [] | Inter-token latencies (areal/api/io_struct.py:80) |
| routed_experts | np.ndarray | None | MoE routing information (areal/api/io_struct.py:83) |

Version Tracking: The output_versions field tracks model weight versions, enabling analysis of weight staleness in asynchronous training, where rollouts and training happen concurrently.
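
As a small illustration of staleness analysis, the sketch below compares recorded versions against the trainer's current version. It assumes output_versions holds one entry per generated token; the function name and the sample values are invented for this example.

```python
def version_staleness(output_versions: list[int], current_version: int) -> list[int]:
    """Per-token lag: how many weight versions behind the trainer each token was sampled."""
    return [current_version - v for v in output_versions]


# Tokens sampled across a weight update from version 3 to version 4,
# evaluated while the trainer is at version 5.
lags = version_staleness([3, 3, 4, 4], current_version=5)
max_lag = max(lags)
```

A workflow could, for instance, compare max_lag against a staleness budget before admitting the sample into a training batch.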

Stop Reason Types:

Sources: areal/api/io_struct.py:63-83

Response Properties

input_len and output_len

Simple length accessors (areal/api/io_struct.py:85-91).
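
Accessors of this kind are typically written as read-only properties. A minimal sketch on a reduced, hypothetical stand-in for ModelResponse:

```python
from dataclasses import dataclass, field


@dataclass
class ResponseSketch:
    """Hypothetical stand-in carrying only the two token lists."""
    input_tokens: list[int] = field(default_factory=list)
    output_tokens: list[int] = field(default_factory=list)

    @property
    def input_len(self) -> int:
        # Number of prompt tokens echoed back by the engine.
        return len(self.input_tokens)

    @property
    def output_len(self) -> int:
        # Number of generated tokens.
        return len(self.output_tokens)


resp = ResponseSketch(input_tokens=[5, 6, 7], output_tokens=[8, 9])
```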


end_with_stop

Checks whether the output ends with an EOS or PAD token (areal/api/io_struct.py:93-104).


output_tokens_without_stop

Returns output tokens with trailing EOS/PAD tokens removed (areal/api/io_struct.py:107-130). This is critical for preparing training data, as stop tokens should typically not receive gradients during the RL update.
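
The behavior of end_with_stop and output_tokens_without_stop can be sketched as two free functions. This is an illustration under assumptions: the token IDs are passed explicitly here, whereas the real properties presumably read them from the attached tokenizer, and the function names are chosen for this example.

```python
def ends_with_stop(output_tokens: list[int], eos_token_id: int, pad_token_id: int) -> bool:
    """True when the last generated token is an EOS or PAD token."""
    return bool(output_tokens) and output_tokens[-1] in (eos_token_id, pad_token_id)


def strip_trailing_stop(output_tokens: list[int], eos_token_id: int, pad_token_id: int) -> list[int]:
    """Drop EOS/PAD tokens from the end only; interior occurrences are kept."""
    stop_ids = {eos_token_id, pad_token_id}
    end = len(output_tokens)
    while end > 0 and output_tokens[end - 1] in stop_ids:
        end -= 1
    return output_tokens[:end]


EOS, PAD = 2, 0  # illustrative token IDs
tokens = [11, 12, 2, 0]
has_stop = ends_with_stop(tokens, EOS, PAD)
trimmed = strip_trailing_stop(tokens, EOS, PAD)
```

Only the trailing run is removed, so a stop token that appears mid-sequence stays in place and the remaining positions still line up with their log probabilities.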

Sources: areal/api/io_struct.py:85-131


Weight Update and Checkpoint Structures

WeightUpdateMeta

The WeightUpdateMeta structure defines how model weights are synchronized from training to inference engines (areal/api/io_struct.py:183-244).

| Field | Type | Purpose |
| --- | --- | --- |
| type | "disk", "xccl", or "awex" | Transfer mechanism (areal/api/io_struct.py:184) |
| path | str or None | Path for disk-based updates (areal/api/io_struct.py:185) |
| use_lora | bool | Whether this is a LoRA update (areal/api/io_struct.py:193) |
| lora_name | str | Adapter name (areal/api/io_struct.py:194) |
| version | int or None | Weight version index (areal/api/io_struct.py:201) |

The with_version method creates a copy of the metadata with a versioned path, e.g., weight_update_v1 (areal/api/io_struct.py:203-215).
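
A sketch of how such a method might look, using a hypothetical stand-in class. Appending the versioned name as a subdirectory of the base path is an assumption of this sketch; the source only shows the weight_update_v1 naming.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class WeightUpdateMetaSketch:
    """Hypothetical stand-in for WeightUpdateMeta with the documented fields."""
    type: str
    path: Optional[str] = None
    use_lora: bool = False
    lora_name: str = ""
    version: Optional[int] = None

    def with_version(self, version: int) -> "WeightUpdateMetaSketch":
        # Build a new object; the original metadata is left untouched.
        path = self.path
        if path is not None:
            # Assumed layout: <base path>/weight_update_v<version>
            path = f"{path.rstrip('/')}/weight_update_v{version}"
        return replace(self, path=path, version=version)


meta = WeightUpdateMetaSketch(type="disk", path="/tmp/ckpt")
v1 = meta.with_version(1)
```

Returning a copy means one base configuration can be reused across training steps without earlier versions being overwritten in place.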

Sources: areal/api/io_struct.py:183-215

ParamSpec

Describes a single model parameter for distributed weight updates (areal/api/io_struct.py:150-159).

| Field | Type | Purpose |
| --- | --- | --- |
| name | str | Parameter name (areal/api/io_struct.py:151) |
| shape | tuple | Tensor dimensions (areal/api/io_struct.py:152) |
| dtype | str | Data type string (areal/api/io_struct.py:153) |

The size property calculates the byte size of the parameter from its shape and data type (areal/api/io_struct.py:155-158).
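
The calculation amounts to the element count multiplied by the bytes per element. A minimal sketch, where the dtype-to-bytes table is an illustrative subset and the class and parameter names are hypothetical:

```python
import math
from dataclasses import dataclass

# Bytes per element for a few common dtype strings (illustrative subset).
DTYPE_BYTES = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


@dataclass
class ParamSpecSketch:
    """Hypothetical stand-in for ParamSpec."""
    name: str
    shape: tuple
    dtype: str

    @property
    def size(self) -> int:
        # Element count times bytes per element.
        return math.prod(self.shape) * DTYPE_BYTES[self.dtype]


spec = ParamSpecSketch("model.embed_tokens.weight", (1024, 512), "bfloat16")
total_bytes = spec.size  # 1024 * 512 * 2 = 1048576
```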

Sources: areal/api/io_struct.py:150-159


HTTP Request Structures

HttpRequest and HttpGenerationResult

For remote inference servers accessed via HTTP, AReaL uses wrapper structures defined in areal/api/io_struct.py.


Sources: areal/api/io_struct.py:278-294, areal/engine/sglang_remote.py:89, areal/engine/vllm_remote.py:93-96

HttpRequest

Defined at areal/api/io_struct.py:278-284:

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| endpoint | str | (required) | API endpoint path (areal/api/io_struct.py:279) |
| payload | dict[str, Any] | (required) | Request payload (areal/api/io_struct.py:280) |
| method | str | "POST" | HTTP method (areal/api/io_struct.py:281) |

HttpGenerationResult

Defined at areal/api/io_struct.py:287-294:

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| output_tokens | list[int] | (required) | Generated token IDs (areal/api/io_struct.py:288) |
| output_logprobs | list[float] | (required) | Log probabilities (areal/api/io_struct.py:289) |
| stop_reason | str | (required) | Termination reason (areal/api/io_struct.py:290) |
| routed_experts | np.ndarray | None | MoE routing, optional (areal/api/io_struct.py:291) |
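
To show how the two wrappers might fit together, the sketch below builds a request and parses a reply. The stand-in classes follow the field tables, but the /generate endpoint, the payload keys, and the layout of the raw reply dictionary are invented for illustration and do not describe either backend's actual protocol.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class HttpRequestSketch:
    endpoint: str
    payload: dict[str, Any]
    method: str = "POST"


@dataclass
class HttpGenerationResultSketch:
    output_tokens: list[int]
    output_logprobs: list[float]
    stop_reason: str
    routed_experts: Any = None


def build_generate_request(input_ids: list[int], max_new_tokens: int) -> HttpRequestSketch:
    # Endpoint path and payload keys are hypothetical.
    return HttpRequestSketch(
        endpoint="/generate",
        payload={"input_ids": input_ids, "max_new_tokens": max_new_tokens},
    )


def parse_generation(raw: dict[str, Any]) -> HttpGenerationResultSketch:
    # 'raw' uses an invented reply layout; real backends differ.
    return HttpGenerationResultSketch(
        output_tokens=list(raw["tokens"]),
        output_logprobs=list(raw["logprobs"]),
        stop_reason=raw["finish_reason"],
    )


request = build_generate_request([1, 2, 3], max_new_tokens=8)
result = parse_generation({"tokens": [4, 5], "logprobs": [-0.1, -0.7], "finish_reason": "stop"})
```

Keeping transport details in HttpRequest and the parsed output in HttpGenerationResult lets backend-specific code stay confined to building the payload and parsing the reply.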

Sources: areal/api/io_struct.py:278-294


MoE Routing Information

For Mixture-of-Experts (MoE) models, ModelResponse.routed_experts contains routing decisions (areal/api/io_struct.py:83).

In the SGLang backend, routing information is extracted from the meta_info["routed_experts"] field of the engine response, decoded from base64, and reshaped to (num_tokens, num_layers * expert_top_k) (areal/engine/sglang_remote.py:101-109).
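
The decode-and-reshape step can be sketched with the standard library alone. The element type (a native C int, 4 bytes on common platforms) and the function name are assumptions of this sketch, and the rows are returned as nested lists rather than the np.ndarray the real code produces.

```python
import base64
from array import array


def decode_routed_experts(b64_payload: str, num_layers: int, expert_top_k: int) -> list[list[int]]:
    """Decode base64 routing data into rows of num_layers * expert_top_k expert IDs."""
    flat = array("i")  # assumed element type: native C int
    flat.frombytes(base64.b64decode(b64_payload))
    row_width = num_layers * expert_top_k
    if len(flat) % row_width != 0:
        raise ValueError("payload length is not a multiple of num_layers * expert_top_k")
    # One row per token, giving a (num_tokens, num_layers * expert_top_k) layout.
    return [list(flat[i:i + row_width]) for i in range(0, len(flat), row_width)]


# Two tokens, two layers, top-2 routing: expected layout is 2 rows of 4 IDs.
payload = base64.b64encode(array("i", [0, 3, 1, 2, 4, 5, 6, 7]).tobytes()).decode()
rows = decode_routed_experts(payload, num_layers=2, expert_top_k=2)
```

With NumPy available, the same two steps would usually be written as np.frombuffer on the decoded bytes followed by a reshape to (num_tokens, num_layers * expert_top_k).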

Sources: areal/api/io_struct.py:83, areal/engine/sglang_remote.py:101-109