Last indexed: 7 May 2026 (2e12c1)

Architecture Overview

AReaL is a distributed asynchronous reinforcement learning system for training large language models. The system is organized into distinct layers that enable flexible configuration, multiple backend support, and scalable distributed execution.

Core Design Principles:

Asynchronous RL: Decouples trajectory generation (rollout) from policy updates (training) by placing them on separate GPU pools so the two stages overlap; the README reports significant speedups from this design README.md15-34
Multi-Backend Support: Training backends (FSDPEngine, MegatronEngine, ArchonEngine) and inference backends (SGLangBackend, VLLMBackend) sit behind shared engine interfaces areal/api/engine_api.py32-140 areal/engine/sglang_remote.py40-41 areal/engine/vllm_remote.py41-42
Distributed Execution: Scheduler abstraction (LocalScheduler, RayScheduler, SlurmScheduler) for multi-node clusters areal/api/scheduler_api.py43-49
Workflow-Based: RolloutWorkflow interface for defining custom episode logic, supporting standard RL, agentic scenarios, and multi-turn interactions areal/api/workflow_api.py14-18
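
The decoupling described in the first principle can be illustrated with a toy producer/consumer loop. This is a minimal sketch using only asyncio, not AReaL code: `rollout_worker`, `trainer`, and the shared `policy_version` dictionary are hypothetical names, and the sleep and counter stand in for real generation and optimizer steps.

```python
import asyncio


async def rollout_worker(queue, policy_version, n_episodes):
    """Produce trajectories, each tagged with the policy version that generated it."""
    for i in range(n_episodes):
        await asyncio.sleep(0)  # stand-in for a generation request; yields control
        await queue.put({"episode": i, "version": policy_version["v"]})


async def trainer(queue, policy_version, n_steps, batch_size):
    """Consume batches as they arrive and bump the policy version after each step."""
    versions_seen = []
    for _ in range(n_steps):
        batch = [await queue.get() for _ in range(batch_size)]
        versions_seen.append([t["version"] for t in batch])
        policy_version["v"] += 1  # stand-in for an optimizer step and weight push
    return versions_seen


async def main():
    queue = asyncio.Queue()
    policy_version = {"v": 0}
    # The producer runs concurrently with the trainer instead of in lockstep.
    producer = asyncio.create_task(rollout_worker(queue, policy_version, 8))
    versions = await trainer(queue, policy_version, n_steps=4, batch_size=2)
    await producer
    return versions


versions_per_step = asyncio.run(main())
```

Because generation never waits for a training step to finish, a batch may contain trajectories produced by an older policy version than the one being updated. Tagging each trajectory with its version is what lets a trainer reason about that staleness.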

System Architecture Layers:

[Diagram: System-wide Component Interaction]

Sources: README.md15-34 areal/api/engine_api.py32-140 areal/api/workflow_api.py14-39 areal/api/scheduler_api.py14-32 areal/api/scheduler_api.py43-49

Core Components and APIs

The system is built around abstract interfaces that enable pluggable implementations for different backends and deployment environments.

API Layer: Abstract Interfaces and Implementations

[Diagram: Code Entity Relationship Diagram]

Key API Contracts:

API	Abstract Class	Key Methods	Implementations
Training	`TrainEngine`	`train_batch()`, `update_weights()`	`FSDPEngine`, `ArchonEngine`, `MegatronEngine` areal/api/engine_api.py32-140
Inference	`InferenceEngine`	`agenerate()`, `update_weights_from_distributed()`	`RemoteInfEngine` areal/api/engine_api.py255-320
Workflow	`RolloutWorkflow`	`arun_episode()`	`RLVRWorkflow`, `VisionRLVRWorkflow`, `TIRWorkflow` areal/api/workflow_api.py14-39
Scheduling	`Scheduler`	`create_workers()`, `create_engine()`	`LocalScheduler`, `RayScheduler`, `SlurmScheduler` areal/api/scheduler_api.py43-49

Sources: areal/api/engine_api.py32-320 areal/api/workflow_api.py14-39 areal/api/scheduler_api.py43-49
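
To make the contract idea concrete, the following sketch shows how a workflow can depend only on an abstract inference interface. It is an illustration under stated assumptions: the `...Sketch` classes are hypothetical stand-ins that borrow only the method names `agenerate()` and `arun_episode()` from the table above. The argument and return shapes are invented here and are not taken from `areal/api`.

```python
import asyncio
from abc import ABC, abstractmethod


class InferenceEngineSketch(ABC):
    """Hypothetical stand-in for an inference contract (signature is assumed)."""

    @abstractmethod
    async def agenerate(self, request: dict) -> dict:
        ...


class RolloutWorkflowSketch(ABC):
    """Hypothetical stand-in for a workflow contract (signature is assumed)."""

    @abstractmethod
    async def arun_episode(self, engine: InferenceEngineSketch, data: dict) -> dict:
        ...


class ReverseEngine(InferenceEngineSketch):
    """Toy backend: 'generates' by reversing the prompt tokens."""

    async def agenerate(self, request: dict) -> dict:
        return {"output_tokens": list(reversed(request["input_ids"]))}


class SingleTurnWorkflow(RolloutWorkflowSketch):
    """One request per episode; it knows nothing about the concrete backend."""

    async def arun_episode(self, engine: InferenceEngineSketch, data: dict) -> dict:
        response = await engine.agenerate({"input_ids": data["input_ids"]})
        return {
            "input_ids": data["input_ids"],
            "output_tokens": response["output_tokens"],
        }


trajectory = asyncio.run(
    SingleTurnWorkflow().arun_episode(ReverseEngine(), {"input_ids": [1, 2, 3]})
)
```

Swapping `ReverseEngine` for another subclass requires no change to the workflow, which is the property the abstract interfaces are meant to provide.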

Training Pipeline Data Flow

The training pipeline alternates between rollout collection (inference) and policy updates (training). To optimize throughput, AReaL splits each batch into micro-batches and packs variable-length sequences together so that less compute is spent on padding.
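
Sequence packing under a token budget can be sketched as a bin-packing problem. The source does not describe AReaL's actual packing algorithm, so the example below substitutes a generic first-fit-decreasing heuristic; `pack_sequences` and `max_tokens` are hypothetical names.

```python
def pack_sequences(seq_lens, max_tokens):
    """Group sequence indices into micro-batches of at most max_tokens tokens.

    Uses first-fit decreasing: visit sequences from longest to shortest and
    place each one into the first micro-batch that still has room for it.
    """
    order = sorted(range(len(seq_lens)), key=lambda i: seq_lens[i], reverse=True)
    bins = []  # each bin tracks its remaining capacity and member indices
    for i in order:
        n = seq_lens[i]
        if n > max_tokens:
            raise ValueError(f"sequence {i} has {n} tokens, budget is {max_tokens}")
        for b in bins:
            if b["free"] >= n:
                b["indices"].append(i)
                b["free"] -= n
                break
        else:
            # No existing micro-batch fits this sequence; open a new one.
            bins.append({"free": max_tokens - n, "indices": [i]})
    return [b["indices"] for b in bins]


lengths = [900, 300, 700, 100, 500, 200]
micro_batches = pack_sequences(lengths, max_tokens=1024)
# micro_batches == [[0, 3], [2, 1], [4, 5]]
```

With padding to the longest sequence, the same six sequences would occupy 6 x 900 = 5400 token slots; packed, they fit into three micro-batches of at most 1024 tokens each.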

Training Loop Sequence

[Diagram: Sequence of Distributed Method Invocations]

Data Handling Structures: The system relies on specialized I/O structures to communicate between components:

ModelRequest: Encapsulates input IDs, generation config, and metadata for inference areal/api/io_struct.py28-40
ModelResponse: Contains output tokens, logprobs, and MoE routing information areal/api/io_struct.py63-83
WeightUpdateMeta: Defines how weights are transferred (disk vs. NCCL) and versioning info areal/api/io_struct.py183-201

Sources: areal/api/io_struct.py28-201 areal/api/engine_api.py232-253
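
As a rough picture of what these three structures carry, the dataclasses below mirror the descriptions in the list above. This is a sketch only: every class name ends in `Sketch`, and every field name and type is an assumption made for illustration, not the definition in `areal/api/io_struct.py`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ModelRequestSketch:
    """Input to inference: token IDs, generation settings, free-form metadata."""
    input_ids: List[int]
    generation_config: Dict[str, Any] = field(default_factory=dict)
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ModelResponseSketch:
    """Output of inference: generated tokens, their logprobs, optional MoE routing."""
    output_tokens: List[int]
    output_logprobs: List[float]
    moe_routing: Optional[Any] = None


@dataclass
class WeightUpdateMetaSketch:
    """How a weight update is delivered ('disk' or 'nccl') and which version it is."""
    kind: str
    version: int
    path: Optional[str] = None

    def __post_init__(self):
        if self.kind not in ("disk", "nccl"):
            raise ValueError(f"unknown update kind: {self.kind!r}")
        if self.kind == "disk" and self.path is None:
            raise ValueError("disk updates need a checkpoint path")


request = ModelRequestSketch(input_ids=[1, 2, 3], generation_config={"max_new_tokens": 8})
update = WeightUpdateMetaSketch(kind="disk", version=5, path="/tmp/ckpt/v5")
```

Validating the transfer kind at construction time keeps a malformed update from reaching the inference side, where a failure would be harder to trace.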

Distributed Worker Management

The scheduler system manages worker creation, engine initialization, and communication group setup across distributed environments.

Worker Creation and Engine Initialization

[Diagram: Resource Allocation and Worker Lifecycle]

The system uses ModelAllocation to define backend types and parallel strategies areal/api/io_struct.py16-17
The Scheduler (Local, Ray, or Slurm) creates Worker objects representing remote processes areal/api/scheduler_api.py14-32
Engines are then instantiated on these workers via create_engine() areal/api/scheduler_api.py182-209

Sources: areal/api/scheduler_api.py14-209 areal/api/io_struct.py16-17 areal/api/engine_api.py34-42
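
The two-phase lifecycle (create workers first, then place engines on them) can be sketched with an in-process toy scheduler. Only the method names `create_workers()` and `create_engine()` come from the text above. The class names, argument lists, and worker ID format below are hypothetical, and real schedulers would launch remote processes rather than store objects in a dictionary.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class WorkerSketch:
    """Hypothetical handle for a process that can host one engine."""
    id: str
    role: str
    engine: Optional[Any] = None


class InProcessSchedulerSketch:
    """Toy scheduler that keeps workers in a dict instead of launching processes."""

    def __init__(self) -> None:
        self._workers: Dict[str, WorkerSketch] = {}

    def create_workers(self, role: str, count: int) -> List[str]:
        """Phase 1: allocate workers for a role and return their IDs."""
        ids = []
        for rank in range(count):
            worker_id = f"{role}/{rank}"
            self._workers[worker_id] = WorkerSketch(id=worker_id, role=role)
            ids.append(worker_id)
        return ids

    def create_engine(self, worker_id: str, factory: Callable[..., Any], **kwargs: Any) -> Any:
        """Phase 2: instantiate an engine on an already-created worker."""
        worker = self._workers[worker_id]  # KeyError if the worker was never created
        worker.engine = factory(**kwargs)
        return worker.engine


scheduler = InProcessSchedulerSketch()
worker_ids = scheduler.create_workers(role="rollout", count=2)
engines = [scheduler.create_engine(w, dict, backend="sglang") for w in worker_ids]
```

Separating the two phases lets resource allocation fail early, before any model weights are loaded.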

Multi-Backend Engine Architecture

AReaL supports multiple training and inference backends through unified interfaces.

Inference Backend Comparison:

Feature	`SGLangBackend`	`VLLMBackend`
Implementation	`areal/engine/sglang_remote.py`	`areal/engine/vllm_remote.py`
Primary Method	`build_generation_request()` areal/engine/sglang_remote.py43	`build_generation_request()` areal/engine/vllm_remote.py44
Weight Sync	Disk & Distributed (NCCL) areal/engine/sglang_remote.py129-160	Disk & Distributed (NCCL/XCCL) areal/engine/vllm_remote.py129-154
LoRA Support	Yes (Disk-based) areal/engine/sglang_remote.py133-146	Yes (Disk & XCCL) areal/engine/vllm_remote.py133-141
Specialty	High throughput via SGLang runtime areal/engine/sglang_remote.py40	Broad model support and OpenAI API compatibility areal/engine/vllm_remote.py41

Inference Request Pattern: The RemoteInfEngine uses these backends to construct HTTP payloads for remote servers. SGLang uses a nested sampling_params structure areal/engine/sglang_remote.py56, while vLLM uses a flat payload for its completions API areal/engine/vllm_remote.py52.

Sources: areal/engine/sglang_remote.py40-187 areal/engine/vllm_remote.py41-183
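
The nested-versus-flat distinction can be shown with two small payload builders. This is a sketch, not the body of `build_generation_request()`: the function names are hypothetical, and the field names (`sampling_params`, `max_new_tokens`, `prompt`, `max_tokens`) are assumptions based on the public SGLang generate and OpenAI-style completions conventions, so the payloads AReaL actually sends may differ.

```python
from typing import Any, Dict, List


def build_sglang_payload_sketch(input_ids: List[int], max_new_tokens: int,
                                temperature: float) -> Dict[str, Any]:
    """Nested layout: sampling options live under a 'sampling_params' key."""
    return {
        "input_ids": input_ids,
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def build_vllm_payload_sketch(input_ids: List[int], max_new_tokens: int,
                              temperature: float) -> Dict[str, Any]:
    """Flat layout: sampling options sit next to the prompt at the top level."""
    return {
        "prompt": input_ids,
        "max_tokens": max_new_tokens,
        "temperature": temperature,
    }


nested = build_sglang_payload_sketch([1, 2, 3], max_new_tokens=16, temperature=0.7)
flat = build_vllm_payload_sketch([1, 2, 3], max_new_tokens=16, temperature=0.7)
```

Keeping the translation inside one backend-specific function means the caller can pass the same generation settings regardless of which server is behind it.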

Asynchronous Training and Weight Versioning

AReaL's asynchronous training mode enables rollout generation to run concurrently with policy updates. The system uses weight versioning to manage synchronization.

Weight Synchronization Flow:

The TrainEngine performs optimization steps and maintains a versioned state.
WeightUpdateMeta is used to communicate version numbers and storage paths to inference engines areal/api/io_struct.py183-201
The RemoteInfEngine triggers weight updates on the remote server via specific endpoints:
- SGLang: /update_weights_from_disk or /update_weights_from_distributed areal/engine/sglang_remote.py149-177
- vLLM: /areal_update_weights or /areal_update_weights_xccl areal/engine/vllm_remote.py143-183
Versioning allows the system to load specific LoRA adapter versions using get_versioned_lora_name areal/api/io_struct.py161-163

Sources: areal/api/io_struct.py161-201 areal/engine/sglang_remote.py129-187 areal/engine/vllm_remote.py129-183
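
The endpoint selection in the flow above can be sketched as a lookup keyed by backend and transfer mode. The four endpoint paths are the ones listed above. Everything else is an assumption: the pairing of `/areal_update_weights` with disk transfers is inferred from the naming, the payload fields are invented, and `versioned_lora_name_sketch` uses a made-up naming scheme rather than the format produced by `get_versioned_lora_name`.

```python
from typing import Any, Dict, Optional, Tuple

# Endpoint paths are from the text above; the (backend, mode) pairing is assumed.
ENDPOINTS = {
    ("sglang", "disk"): "/update_weights_from_disk",
    ("sglang", "distributed"): "/update_weights_from_distributed",
    ("vllm", "disk"): "/areal_update_weights",
    ("vllm", "distributed"): "/areal_update_weights_xccl",
}


def plan_weight_update(backend: str, mode: str, version: int,
                       path: Optional[str] = None) -> Tuple[str, Dict[str, Any]]:
    """Return the endpoint and a hypothetical payload for one weight update."""
    try:
        endpoint = ENDPOINTS[(backend, mode)]
    except KeyError:
        raise ValueError(f"unsupported combination: {backend}/{mode}") from None
    payload: Dict[str, Any] = {"version": version}
    if mode == "disk":
        if path is None:
            raise ValueError("disk updates need a checkpoint path")
        payload["path"] = path
    return endpoint, payload


def versioned_lora_name_sketch(name: str, version: int) -> str:
    """Hypothetical naming scheme; the real format is not given in the text."""
    return f"{name}-v{version}"


endpoint, payload = plan_weight_update("sglang", "disk", version=3, path="/tmp/ckpt/v3")
adapter = versioned_lora_name_sketch("policy-lora", 3)
```

Carrying the version number in every update request is what allows the inference side to tell which policy produced a given trajectory.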

URL: https://deepwiki.com/inclusionAI/AReaL/1.4-architecture-overview