
URL: https://deepwiki.com/inclusionAI/AReaL/1.4-architecture-overview



Last indexed: 7 May 2026 (2e12c1)

Architecture Overview

AReaL is a distributed asynchronous reinforcement learning system for training large language models. The system is organized into distinct layers that enable flexible configuration, multiple backend support, and scalable distributed execution.

Core Design Principles:

System Architecture Layers:

Title: System-wide Component Interaction


Sources: README.md:15-34, areal/api/engine_api.py:32-140, areal/api/workflow_api.py:14-39, areal/api/scheduler_api.py:14-32, areal/api/scheduler_api.py:43-49

Core Components and APIs

The system is built around abstract interfaces that enable pluggable implementations for different backends and deployment environments.

API Layer: Abstract Interfaces and Implementations

Title: Code Entity Relationship Diagram


Key API Contracts:

| API | Abstract Class | Key Methods | Implementations |
|---|---|---|---|
| Training | TrainEngine | train_batch(), update_weights() | FSDPEngine, ArchonEngine, MegatronEngine (areal/api/engine_api.py:32-140) |
| Inference | InferenceEngine | agenerate(), update_weights_from_distributed() | RemoteInfEngine (areal/api/engine_api.py:255-320) |
| Workflow | RolloutWorkflow | arun_episode() | RLVRWorkflow, VisionRLVRWorkflow, TIRWorkflow (areal/api/workflow_api.py:14-39) |
| Scheduling | Scheduler | create_workers(), create_engine() | LocalScheduler, RayScheduler, SlurmScheduler (areal/api/scheduler_api.py:43-49) |

Sources: areal/api/engine_api.py:32-320, areal/api/workflow_api.py:14-39, areal/api/scheduler_api.py:43-49
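To make the pluggable-interface idea concrete, here is a minimal sketch of two of the contracts and a toy implementation of each. Only the class and method names InferenceEngine, agenerate(), RolloutWorkflow, and arun_episode() come from the table above; the signatures, the EchoInferenceEngine and LengthRewardWorkflow classes, and the dictionary keys are assumptions made for illustration and are not the definitions in areal/api.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class InferenceEngine(ABC):
    """Inference-side contract. Method name from the table; signature assumed."""

    @abstractmethod
    async def agenerate(self, prompt_ids: List[int]) -> List[int]:
        ...


class RolloutWorkflow(ABC):
    """Workflow contract. Method name from the table; signature assumed."""

    @abstractmethod
    async def arun_episode(
        self, engine: InferenceEngine, data: Dict[str, Any]
    ) -> Dict[str, Any]:
        ...


class EchoInferenceEngine(InferenceEngine):
    """Hypothetical stand-in backend: returns the prompt tokens reversed."""

    async def agenerate(self, prompt_ids: List[int]) -> List[int]:
        return list(reversed(prompt_ids))


class LengthRewardWorkflow(RolloutWorkflow):
    """Hypothetical workflow: one generation, reward equals completion length."""

    async def arun_episode(
        self, engine: InferenceEngine, data: Dict[str, Any]
    ) -> Dict[str, Any]:
        completion = await engine.agenerate(data["prompt_ids"])
        return {
            "prompt_ids": data["prompt_ids"],
            "completion_ids": completion,
            "reward": float(len(completion)),
        }


# The workflow depends only on the abstract InferenceEngine, so any
# backend that implements agenerate() can be swapped in here.
episode = asyncio.run(
    LengthRewardWorkflow().arun_episode(EchoInferenceEngine(), {"prompt_ids": [1, 2, 3]})
)
```

Because the workflow only sees the abstract type, replacing the echo engine with a remote-server client would not require touching the workflow code, which is the property the table describes.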

Training Pipeline Data Flow

The training pipeline alternates between rollout collection (inference) and policy updates (training). To optimize throughput, AReaL splits each batch into micro-batches and packs variable-length sequences together.
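As a rough illustration of sequence packing, the sketch below groups variable-length sequences into micro-batches under a token budget using first-fit decreasing bin packing. This is a generic technique chosen for the example; the source does not state which packing algorithm AReaL uses, and pack_sequences and max_tokens are hypothetical names.

```python
from typing import List


def pack_sequences(lengths: List[int], max_tokens: int) -> List[List[int]]:
    """Group sequence indices into micro-batches whose total token count
    stays within max_tokens, using first-fit decreasing."""
    if any(n > max_tokens for n in lengths):
        raise ValueError("a sequence exceeds the micro-batch token budget")

    # Visit the longest sequences first so large items claim bins early.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins: List[List[int]] = []   # sequence indices per micro-batch
    loads: List[int] = []        # token count per micro-batch

    for idx in order:
        for b, load in enumerate(loads):
            if load + lengths[idx] <= max_tokens:
                bins[b].append(idx)
                loads[b] += lengths[idx]
                break
        else:
            # No existing micro-batch has room, so open a new one.
            bins.append([idx])
            loads.append(lengths[idx])
    return bins


lengths = [700, 300, 512, 200, 90]
micro_batches = pack_sequences(lengths, max_tokens=1024)
# micro_batches == [[0, 1], [2, 3, 4]]
```

Packing this way keeps padding low: the five sequences fit in two micro-batches of 1000 and 802 tokens instead of five padded rows.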

Training Loop Sequence

Title: Sequence of Distributed Method Invocations


Data Handling Structures: The system relies on specialized I/O structures to communicate between components:

Sources: areal/api/io_struct.py:28-201, areal/api/engine_api.py:232-253

Distributed Worker Management

The scheduler system manages worker creation, engine initialization, and communication group setup across distributed environments.

Worker Creation and Engine Initialization

Title: Resource Allocation and Worker Lifecycle


The system uses ModelAllocation to define backend types and parallel strategies (areal/api/io_struct.py:16-17). The Scheduler (Local, Ray, or Slurm) creates Worker objects representing remote processes (areal/api/scheduler_api.py:14-32). Engines are then instantiated on these workers via create_engine() (areal/api/scheduler_api.py:182-209).

Sources: areal/api/scheduler_api.py:14-209, areal/api/io_struct.py:16-17, areal/api/engine_api.py:34-42
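The two-step lifecycle described above (create workers first, then instantiate engines on them) can be sketched with an in-process toy. The method names create_workers() and create_engine() come from the text; their parameters, the Worker fields, and the ToyLocalScheduler class are assumptions for illustration and do not mirror the real LocalScheduler.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Worker:
    """Hypothetical record standing in for one remote process."""
    worker_id: str
    engines: Dict[str, Any] = field(default_factory=dict)


class ToyLocalScheduler:
    """In-process stand-in for a Scheduler; not AReaL's LocalScheduler."""

    def __init__(self) -> None:
        self._workers: Dict[str, Worker] = {}

    def create_workers(self, role: str, replicas: int) -> List[str]:
        # Step 1: allocate one worker record per replica and return their ids.
        ids = [f"{role}/{i}" for i in range(replicas)]
        for worker_id in ids:
            self._workers[worker_id] = Worker(worker_id)
        return ids

    def create_engine(
        self, worker_id: str, name: str, factory: Callable[[], Any]
    ) -> Any:
        # Step 2: build the engine "on" the worker and remember it by name.
        worker = self._workers[worker_id]
        worker.engines[name] = factory()
        return worker.engines[name]


scheduler = ToyLocalScheduler()
actor_ids = scheduler.create_workers("actor", replicas=2)
engine = scheduler.create_engine(actor_ids[0], "train", factory=dict)
```

Separating the two calls means the same worker pool could host different engine types, which matches the description of engines being instantiated on already-created workers.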

Multi-Backend Engine Architecture

AReaL supports multiple training and inference backends through unified interfaces.

Inference Backend Comparison:

| Feature | SGLangBackend | VLLMBackend |
|---|---|---|
| Implementation | areal/engine/sglang_remote.py | areal/engine/vllm_remote.py |
| Primary Method | build_generation_request() (areal/engine/sglang_remote.py:43) | build_generation_request() (areal/engine/vllm_remote.py:44) |
| Weight Sync | Disk & Distributed (NCCL) (areal/engine/sglang_remote.py:129-160) | Disk & Distributed (NCCL/XCCL) (areal/engine/vllm_remote.py:129-154) |
| LoRA Support | Yes (Disk-based) (areal/engine/sglang_remote.py:133-146) | Yes (Disk & XCCL) (areal/engine/vllm_remote.py:133-141) |
| Specialty | High throughput via SGLang runtime (areal/engine/sglang_remote.py:40) | Broad model support and OpenAI API compatibility (areal/engine/vllm_remote.py:41) |

Inference Request Pattern: The RemoteInfEngine uses these backends to construct HTTP payloads for remote servers. SGLang uses a nested sampling_params structure (areal/engine/sglang_remote.py:56), while vLLM uses a flat payload for its completions API (areal/engine/vllm_remote.py:52).

Sources: areal/engine/sglang_remote.py:40-187, areal/engine/vllm_remote.py:41-183
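The nested-versus-flat difference is easiest to see side by side. The sketch below builds one payload of each shape. The field names follow the publicly documented SGLang native generate request and the OpenAI-style completions request as commonly used with vLLM; they are assumptions here, since the source does not list the exact fields AReaL sends, and both helper functions are hypothetical.

```python
from typing import Any, Dict, List


def build_sglang_payload(
    input_ids: List[int], max_new_tokens: int, temperature: float
) -> Dict[str, Any]:
    """Nested shape: sampling options live under a sampling_params key."""
    return {
        "input_ids": input_ids,
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def build_vllm_payload(
    input_ids: List[int], max_new_tokens: int, temperature: float, model: str
) -> Dict[str, Any]:
    """Flat shape: sampling options sit at the top level of the request."""
    return {
        "model": model,
        "prompt": input_ids,
        "max_tokens": max_new_tokens,
        "temperature": temperature,
    }


sglang_req = build_sglang_payload([1, 2, 3], max_new_tokens=16, temperature=0.7)
vllm_req = build_vllm_payload([1, 2, 3], 16, 0.7, model="placeholder-model")
```

Keeping payload construction in a backend-specific function lets the calling engine stay identical for both servers; only the request shape changes.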

Asynchronous Training and Weight Versioning

AReaL's asynchronous training mode enables rollout generation to run concurrently with policy updates. The system uses weight versioning to manage synchronization.
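A minimal asyncio sketch of the idea: a rollout task tags every sample with the weight version that produced it, while a trainer task consumes samples concurrently and bumps the version after each update. The staleness bound (max_staleness), the queue-based hand-off, and all function names are assumptions added for illustration; the source only states that weight versioning is used for synchronization.

```python
import asyncio
from typing import Dict, List, Tuple


async def rollout_worker(queue: asyncio.Queue, state: Dict[str, int], n: int) -> None:
    """Produce n samples, each tagged with the current weight version."""
    for i in range(n):
        await asyncio.sleep(0)  # yield control so the trainer can interleave
        await queue.put({"sample": i, "version": state["version"]})
    await queue.put(None)  # sentinel: no more samples


async def trainer(
    queue: asyncio.Queue, state: Dict[str, int], batch_size: int, max_staleness: int
) -> List[int]:
    """Consume samples, skip overly stale ones, bump the version per update."""
    used_versions: List[int] = []
    batch: List[dict] = []
    while True:
        item = await queue.get()
        if item is None:
            break
        if state["version"] - item["version"] > max_staleness:
            continue  # sample was generated by weights that are too old
        batch.append(item)
        if len(batch) == batch_size:
            used_versions.extend(x["version"] for x in batch)
            batch.clear()
            state["version"] += 1  # one policy update publishes a new version
    return used_versions


async def main() -> Tuple[int, List[int]]:
    state = {"version": 0}
    queue: asyncio.Queue = asyncio.Queue(maxsize=4)
    _, used = await asyncio.gather(
        rollout_worker(queue, state, n=8),
        trainer(queue, state, batch_size=2, max_staleness=1),
    )
    return state["version"], used


final_version, used_versions = asyncio.run(main())
```

The version tag is what lets the trainer reason about how off-policy a sample is without pausing generation.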

Weight Synchronization Flow:

  1. The TrainEngine performs optimization steps and maintains a versioned state.
  2. WeightUpdateMeta communicates version numbers and storage paths to inference engines (areal/api/io_struct.py:183-201).
  3. The RemoteInfEngine triggers weight updates on the remote server via specific endpoints:
  4. Versioning allows the system to load specific LoRA adapter versions using get_versioned_lora_name (areal/api/io_struct.py:161-163).

Sources: areal/api/io_struct.py:161-201, areal/engine/sglang_remote.py:129-187, areal/engine/vllm_remote.py:129-183
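The flow above can be sketched as a small version-carrying message plus a receiver that only moves forward. Everything here is hypothetical: WeightUpdateMetaSketch is a simplified stand-in rather than the real WeightUpdateMeta, and the name format produced by versioned_lora_name_sketch is invented, since the source does not show what get_versioned_lora_name returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WeightUpdateMetaSketch:
    """Simplified stand-in carrying a version and an optional storage path."""
    version: int
    path: Optional[str] = None  # set for disk-based updates


def versioned_lora_name_sketch(base: str, version: int) -> str:
    """Assumed naming scheme; the real format is not given in the source."""
    return f"{base}-v{version}"


class ToyInferenceServer:
    """Toy receiver that accepts only strictly newer weight versions."""

    def __init__(self) -> None:
        self.version = -1
        self.loaded_path: Optional[str] = None

    def apply(self, meta: WeightUpdateMetaSketch) -> bool:
        if meta.version <= self.version:
            return False  # stale or duplicate update, ignore it
        self.version = meta.version
        self.loaded_path = meta.path
        return True


server = ToyInferenceServer()
first = server.apply(WeightUpdateMetaSketch(version=1, path="/tmp/ckpt/v1"))
stale = server.apply(WeightUpdateMetaSketch(version=0, path="/tmp/ckpt/v0"))
adapter = versioned_lora_name_sketch("policy-lora", server.version)
```

Rejecting non-increasing versions makes repeated or out-of-order update messages harmless, which is one reason to carry the version in the message itself.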