
URL: https://deepwiki.com/inclusionAI/AReaL/11.2-weight-update-metadata

Weight Update Metadata | inclusionAI/AReaL | DeepWiki


Last indexed: 7 May 2026 (2e12c1)

Weight Update Metadata

WeightUpdateMeta is a core data structure that encapsulates the metadata for synchronizing model weights from training engines to inference engines during online RL training. It selects one of three update mechanisms (disk-based file transfer, distributed NCCL/XCCL communication, or the experimental AWEX protocol) and carries the configuration parameters that mechanism needs, including versioning and LoRA support. areal/api/io_struct.py183-202


Purpose and Architecture

In AReaL's asynchronous RL training paradigm, model weights are periodically synchronized from training workers to inference servers. WeightUpdateMeta serves as the contract between these two subsystems, specifying:

  • Update mechanism type: The supported modes are disk, xccl (NCCL/XCCL-based), and awex (Adaptive Weight Exchange). areal/api/io_struct.py184
  • Communication parameters: Defines NCCL group names, master addresses, and ports for direct GPU-to-GPU transfer. areal/api/io_struct.py188-190
  • Resource allocation: Uses ModelAllocation (via gen_allocation) to determine parallel strategies and rank offsets. areal/api/io_struct.py186
  • LoRA configuration: Carries adapter names, integer IDs for backend indexing, and the full peft_config dictionary. areal/api/io_struct.py193-197
  • Versioning: Tracks weight versions to ensure inference engines use the correct parameters corresponding to specific rollout steps. areal/api/io_struct.py201

Entity Mapping: Metadata to Code

The following diagram maps the logical requirements of weight synchronization to the specific fields in the WeightUpdateMeta class.

[Diagram: Metadata Requirement Mapping]


Sources: areal/api/io_struct.py183-202


Weight Update Types

Disk-Based Updates

Disk-based updates serialize model weights to a shared filesystem path. The training engine saves checkpoints, and inference servers reload them. Disk mode is the fallback, and in some cases the only option, for LoRA configurations or backends such as SGLang where NCCL updates are not supported for the adapter type in use. areal/engine/sglang_remote.py133-159

Characteristics:

XCCL/NCCL-Based Updates

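NCCL-based updates (referred to as xccl in the metadata) transfer weights directly between GPUs through collective communication, without intermediate disk I/O. Where the interconnect supports it, NCCL can use GPUDirect RDMA for this transfer. areal/engine/vllm_remote.py150-186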

Characteristics:

AWEX (Adaptive Weight Exchange)

AWEX is an experimental weight update mechanism that uses shard-direct NCCL P2P updates. It bypasses full parameter gathering by matching sharding plans between training and inference engines. areal/experimental/weight_update/awex/fsdp_adapter.py38-40

Characteristics:

Sources: areal/api/io_struct.py183-217 areal/engine/sglang_remote.py133-159 areal/experimental/weight_update/awex/fsdp_adapter.py38-40 areal/experimental/weight_update/training_adapter.py10-12
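
Each update type relies on a different subset of the metadata, so a pre-flight check can be sketched as follows. This is a minimal illustration, not AReaL's code: the `Meta` stand-in, the `missing_fields` helper, and the per-type requirements are assumptions inferred from the field descriptions on this page. The source does not say what AWEX requires, so that entry is left empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Meta:
    """Hypothetical stand-in carrying only the fields this check needs."""
    type: str
    path: Optional[str] = None
    nccl_group_name: Optional[str] = None


def missing_fields(meta: Meta) -> List[str]:
    """Return the names of fields the chosen update type needs but lacks."""
    # Assumed requirements: disk needs a path, xccl needs a process group name.
    required = {"disk": ["path"], "xccl": ["nccl_group_name"], "awex": []}
    if meta.type not in required:
        raise ValueError(f"unknown update type: {meta.type!r}")
    return [name for name in required[meta.type] if getattr(meta, name) is None]


disk_problems = missing_fields(Meta(type="disk"))
xccl_problems = missing_fields(Meta(type="xccl", nccl_group_name="update_group"))
```

Returning the list of missing names, rather than raising on the first one, lets a caller report every configuration gap at once.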


Metadata Structure

Core Fields

Field            Type                             Description
type             Literal["disk", "xccl", "awex"]  The primary synchronization mechanism. areal/api/io_struct.py184
path             str | None                       Base filesystem path for disk updates. areal/api/io_struct.py185
gen_allocation   ModelAllocation | None           Allocation info used to compute rank offsets in distributed groups. areal/api/io_struct.py186
nccl_group_name  str | None                       Unique identifier for the NCCL process group. areal/api/io_struct.py190
version          int | None                       Monotonically increasing version for tracking. areal/api/io_struct.py201

LoRA-Specific Fields

Field        Type  Description
use_lora     bool  Enables LoRA-specific synchronization logic. areal/api/io_struct.py193
lora_name    str   Human-readable name for the adapter. areal/api/io_struct.py194
lora_int_id  int   Integer ID used by vLLM for multi-LoRA indexing. areal/api/io_struct.py195
peft_config  dict  Serialized PEFT configuration (rank, alpha, target modules). areal/api/io_struct.py197

Sources: areal/api/io_struct.py183-202
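
The fields listed above can be pictured as a single dataclass. The sketch below is an approximation for illustration only: the class name `WeightUpdateMetaSketch`, every default value, the use of `Any` in place of ModelAllocation, and the example PEFT keys are assumptions. The master address and port fields mentioned earlier are omitted because their names are not given on this page.

```python
from dataclasses import dataclass, field
from typing import Any, Literal, Optional


@dataclass
class WeightUpdateMetaSketch:
    # Core fields (names from the table; defaults are assumed).
    type: Literal["disk", "xccl", "awex"]
    path: Optional[str] = None
    gen_allocation: Optional[Any] = None  # stands in for ModelAllocation
    nccl_group_name: Optional[str] = None
    version: Optional[int] = None
    # LoRA-specific fields (names from the table; defaults are assumed).
    use_lora: bool = False
    lora_name: str = ""
    lora_int_id: int = 0
    peft_config: dict = field(default_factory=dict)


# Disk update of full weights.
disk_meta = WeightUpdateMetaSketch(type="disk", path="/tmp/weights", version=0)

# XCCL update of a LoRA adapter; the PEFT keys shown are illustrative.
lora_meta = WeightUpdateMetaSketch(
    type="xccl",
    nccl_group_name="update_group",
    use_lora=True,
    lora_name="gsm8k-lora",
    lora_int_id=1,
    peft_config={"r": 16, "lora_alpha": 32, "target_modules": ["q_proj", "v_proj"]},
)
```

Using `field(default_factory=dict)` gives each instance its own `peft_config` dictionary instead of one shared mutable default.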


Versioning and Path Management

WeightUpdateMeta includes a with_version method to facilitate asynchronous weight updates. This allows the system to maintain multiple versions of weights on disk, preventing race conditions where an inference engine might read a partially written checkpoint while the trainer is writing a new one. areal/api/io_struct.py203-217
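
One way such a method could work is sketched below. The actual body of with_version is not reproduced on this page, so the `VersionedMeta` stand-in, the copy-on-version behavior, and the `weight_update_v{version}` directory layout are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VersionedMeta:
    """Hypothetical stand-in with only the fields versioning touches."""
    type: str
    path: Optional[str] = None
    version: Optional[int] = None

    def with_version(self, version: int) -> "VersionedMeta":
        # Return a copy so metadata already handed to a reader never changes.
        new_path = self.path
        if self.type == "disk" and self.path is not None:
            # Assumed layout: one subdirectory per version under the base path.
            new_path = os.path.join(self.path, f"weight_update_v{version}")
        return replace(self, path=new_path, version=version)


base = VersionedMeta(type="disk", path="/shared/ckpt")
v3 = base.with_version(3)
v4 = base.with_version(4)
```

Because each version resolves to its own directory, a server still loading version 3 is unaffected while the trainer writes version 4.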

[Diagram: Versioned Update Sequence]


Sources: areal/api/io_struct.py161-163 areal/api/io_struct.py203-217


Backend-Specific Implementations

Different inference backends consume WeightUpdateMeta to construct their specific HTTP requests for weight loading.

SGLang Backend

SGLang uses build_disk_weight_update_requests and build_distributed_weight_update_requests to translate the metadata into HTTP requests against the /load_lora_adapter, /update_weights_from_disk, or /update_weights_from_distributed endpoints. areal/engine/sglang_remote.py129-187
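
The endpoint choice can be sketched as a small dispatch on the metadata. This is an illustration, not the code in sglang_remote.py: the `Meta` stand-in and the `sglang_endpoint` helper are hypothetical, and the mapping (LoRA disk updates go to /load_lora_adapter, full-weight disk updates to /update_weights_from_disk, xccl updates to /update_weights_from_distributed) is an assumption inferred from the endpoint names.

```python
from dataclasses import dataclass


@dataclass
class Meta:
    """Hypothetical stand-in with only the fields the dispatch reads."""
    type: str
    use_lora: bool = False


def sglang_endpoint(meta: Meta) -> str:
    """Pick the SGLang endpoint for a weight update (assumed mapping)."""
    if meta.type == "disk":
        # Assumption: adapters are loaded through the LoRA endpoint,
        # full weights through the disk-reload endpoint.
        return "/load_lora_adapter" if meta.use_lora else "/update_weights_from_disk"
    if meta.type == "xccl":
        return "/update_weights_from_distributed"
    raise ValueError(f"no SGLang endpoint sketched for type {meta.type!r}")


full_disk = sglang_endpoint(Meta(type="disk"))
lora_disk = sglang_endpoint(Meta(type="disk", use_lora=True))
distributed = sglang_endpoint(Meta(type="xccl"))
```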

vLLM Backend

vLLM follows a similar pattern but includes a two-step distributed update process: setting metadata via /areal_set_update_weight_meta followed by the actual transfer via /areal_update_weights_xccl. For LoRA, it uses /areal_set_update_weight_meta_lora and /areal_update_weights_lora_xccl. areal/engine/vllm_remote.py150-196
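
The two-step ordering can be sketched as follows. The endpoint names come from the paragraph above; the `vllm_xccl_endpoints` and `run_xccl_update` helpers and the `post` callback are hypothetical, and request payloads are omitted because they are not described here.

```python
from typing import Callable, List, Tuple


def vllm_xccl_endpoints(use_lora: bool) -> Tuple[str, str]:
    """Return (set-metadata endpoint, transfer endpoint) for an xccl update."""
    if use_lora:
        return ("/areal_set_update_weight_meta_lora", "/areal_update_weights_lora_xccl")
    return ("/areal_set_update_weight_meta", "/areal_update_weights_xccl")


def run_xccl_update(post: Callable[[str], None], use_lora: bool) -> None:
    """Issue the two requests in order: metadata first, then the transfer."""
    set_meta, transfer = vllm_xccl_endpoints(use_lora)
    post(set_meta)
    post(transfer)


# Record the call order with a list instead of a real HTTP client.
full_calls: List[str] = []
run_xccl_update(full_calls.append, use_lora=False)

lora_calls: List[str] = []
run_xccl_update(lora_calls.append, use_lora=True)
```

Passing the HTTP call in as `post` keeps the ordering logic testable without a running server.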

AWEX Adapter Flow

The following diagram illustrates how the WeightUpdateMeta fields are consumed by AWEX adapters to establish communication.

[Diagram: AWEX Metadata Consumption]


Sources: areal/api/io_struct.py183-202 areal/engine/sglang_remote.py129-187 areal/engine/vllm_remote.py150-196 areal/experimental/weight_update/awex/fsdp_adapter.py118-161


Usage in Configuration

The weight update strategy is defined in the actor configuration. For LoRA-based training (e.g., GRPO on GSM8K), the weight_update_mode can be set to disk or xccl depending on backend support. examples/math/gsm8k_grpo_lora.yaml81-87


When use_lora is enabled, gconfig (the generation hyperparameters) also references lora_name, which is used to construct the versioned metadata. examples/math/gsm8k_grpo_lora.yaml36-42

Sources: examples/math/gsm8k_grpo_lora.yaml36-42 examples/math/gsm8k_grpo_lora.yaml81-91