URL: https://deepwiki.com/inclusionAI/AReaL/1.2-setup-and-installation

Setup and Installation | inclusionAI/AReaL | DeepWiki


Last indexed: 7 May 2026 (2e12c1)

Setup and Installation

This page covers the complete installation process for AReaL, including hardware prerequisites, environment setup using uv, Docker usage, NPU support, and environment validation. For conceptual information about AReaL's purpose and features, see 1.1. What is AReaL. For step-by-step instructions on running your first training job, see 1.3. Quick Start Guide.

Prerequisites

Hardware Requirements

The following configuration is optimized for large-scale reinforcement learning workloads using AReaL's asynchronous rollout architecture.

| Component | Specification (NVIDIA) | Specification (Ascend NPU) |
| --- | --- | --- |
| Accelerator | 8x H800 per node (docs/en/tutorial/installation.md:9) | 16x NPU per node (docs/en/tutorial/installation_npu.md:9) |
| CPU | 64 cores per node (docs/en/tutorial/installation.md:10) | 64 cores per node (docs/en/tutorial/installation_npu.md:10) |
| Memory | 1TB per node (docs/en/tutorial/installation.md:11) | 1TB per node (docs/en/tutorial/installation_npu.md:11) |
| Network | NVSwitch + RoCE 3.2 Tbps (docs/en/tutorial/installation.md:12) | RoCE 3.2 Tbps (docs/en/tutorial/installation_npu.md:12) |
| Storage (Local) | 1TB for single-node (docs/en/tutorial/installation.md:14) | 1TB for single-node (docs/en/tutorial/installation_npu.md:14) |
| Shared Storage | 10TB NAS for distributed (docs/en/tutorial/installation.md:15) | 10TB NAS for distributed (docs/en/tutorial/installation_npu.md:15) |

Sources: docs/en/tutorial/installation.md:5-15, docs/en/tutorial/installation_npu.md:5-15

Software Requirements

| Component | Version | Notes |
| --- | --- | --- |
| Operating System | Ubuntu 22.04 / CentOS 7 | Linux x86_64 is the primary target (pyproject.toml:35) |
| Python | 3.11 - 3.12 | Enforced by requires-python (pyproject.toml:10) |
| NVIDIA Driver | 550.127.08 | Tested version (docs/en/tutorial/installation.md:22) |
| CUDA | 12.8 / 12.9 | Required for training/inference backends (Dockerfile:63) |
| uv | 0.9.18+ | Required for dependency management (pyproject.toml:2) |

Sources: pyproject.toml:2-10, Dockerfile:9-63, docs/en/tutorial/installation.md:19-27
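The Python and uv requirements above can be checked before installing anything. The following is a minimal sketch, not part of AReaL: the function names are hypothetical, the version bounds mirror the requires-python constraint quoted on this page (>=3.11, <3.13), and the uv check only looks for the executable on PATH without verifying its version.

```python
import shutil
import sys

# Bounds taken from the requires-python constraint quoted on this page.
MIN_PYTHON = (3, 11)
MAX_PYTHON_EXCLUSIVE = (3, 13)


def python_version_supported(version=None):
    """Return True if the (major, minor) pair falls inside the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


def uv_on_path():
    """Return the path of the uv executable, or None if it is not on PATH."""
    return shutil.which("uv")


print("Python supported:", python_version_supported())
print("uv executable:", uv_on_path() or "not found")
```

Passing an explicit tuple such as (3, 12) makes the check easy to reuse in CI scripts that test several interpreters.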

Installation Decision Flow

The following diagram maps high-level installation choices to the specific tools and configurations used in the codebase.

[Diagram: installation decision flow]

Sources: Dockerfile:1-13, pyproject.toml:148-182, docs/en/tutorial/installation.md:39-54

Option 1: Docker Runtime (Recommended)

The official Docker image includes pre-compiled CUDA packages and heavy C++ extensions (e.g., flash-attn, apex, TransformerEngine) which are slow to build from source.

Build Variants

AReaL supports two primary inference backends, sglang and vllm, selected via the VARIANT build argument in the Dockerfile (Dockerfile:1-15).

Launching the Container

The container requires high shared memory and GPU access for distributed training.
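As an illustration of those two requirements, the sketch below assembles a docker run command with GPU access (--gpus all) and an enlarged shared-memory segment (--shm-size). It is an assumption-laden example rather than the documented launch command: the image tag, the shared-memory size, and the mount path are placeholders, and the helper name build_docker_run is hypothetical. Take the real image name and recommended values from the official installation guide.

```python
import shlex


def build_docker_run(image, shm_size="64g", mounts=()):
    """Assemble a `docker run` argument list with GPU access and a large /dev/shm."""
    cmd = ["docker", "run", "-it", "--gpus", "all", "--shm-size", shm_size]
    for host_path, container_path in mounts:
        # Bind-mount host directories (datasets, checkpoints) into the container.
        cmd += ["-v", f"{host_path}:{container_path}"]
    cmd += [image, "/bin/bash"]
    return cmd


# Placeholder image tag and mount path; substitute the values from the official docs.
args = build_docker_run("areal-runtime:example", mounts=[("/data", "/data")])
print(shlex.join(args))
```

Building the command as a list keeps quoting correct if it is later passed to subprocess.run instead of being printed.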


Sources: docs/en/tutorial/installation.md:45-50, Dockerfile:1-15

Option 2: Custom Environment with uv

For development or non-containerized deployments, use the uv package manager. uv handles the complex dependency resolution and platform markers defined in pyproject.toml (pyproject.toml:220-228).

Dependency Synchronization

Use uv sync with optional extras to install the correct backend:

  • uv sync --extra cuda: Installs the cuda-train packages (Megatron, TMS) plus sglang and flash-attn (pyproject.toml:176-182).
  • vLLM Setup: Because sglang and vllm pin mutually incompatible torch versions, vLLM requires swapping in the project file pyproject.vllm.toml (pyproject.vllm.toml:3-15).
  • Syncing Lockfiles: The scripts/uv_lock.sh script generates and updates both uv.lock and uv.vllm.lock to keep the two variants consistent (scripts/uv_lock.sh:1-51).
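The two backend paths can be summarized as a short command plan. This is a sketch under stated assumptions: uv sync --extra cuda comes from this page, but the exact swap procedure for vLLM (copying pyproject.vllm.toml and uv.vllm.lock over the default files) is an assumption about how the swap is done, and plan_sync is a hypothetical helper. It only returns the commands and executes nothing.

```python
import shlex


def plan_sync(backend="sglang"):
    """Return the commands for one backend as argument lists (nothing is executed)."""
    if backend == "sglang":
        return [["uv", "sync", "--extra", "cuda"]]
    if backend == "vllm":
        return [
            # Assumed swap: the vLLM project file and lockfile replace the defaults.
            ["cp", "pyproject.vllm.toml", "pyproject.toml"],
            ["cp", "uv.vllm.lock", "uv.lock"],
            ["uv", "sync", "--extra", "cuda"],
        ]
    raise ValueError(f"unknown backend: {backend!r}")


for step in plan_sync("vllm"):
    print(shlex.join(step))
```

Because the swap overwrites tracked files, it is worth committing or stashing local changes before running the printed commands.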

Pre-commit Hooks

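After setting up the environment, install the pre-commit hooks (typically by running pre-commit install in the repository root) so that code-quality checks run on every commit.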


Sources: pyproject.toml:193-210, docs/en/tutorial/installation.md:110-118, scripts/uv_lock.sh:1-51

NPU Installation (Ascend)

NPU support requires specific images and CANN versions. Currently, the fsdp training engine and the vllm rollout engine (via vLLM-Ascend) are supported (docs/en/tutorial/installation_npu.md:144-146).

NPU Docker Setup

Use dedicated images for Ascend hardware (A2 or A3 variants):


Sources: docs/en/tutorial/installation_npu.md:42-109

Package Structure and Build System

The following diagram maps the logical dependency groups to the pyproject.toml structure and the Docker build stages.

[Diagram: dependency groups, pyproject.toml structure, and Docker build stages]

Manual C++ Extension Installation

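If you are not using Docker, several optimized kernels (for example flash-attn, apex, and TransformerEngine, which the official image ships pre-compiled) must be built from source.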

Sources: Dockerfile:61-165, pyproject.toml:148-182, uv.lock:22-37

Installation Validation

Verify your installation using the provided scripts. These check for package imports, version compatibility, and accelerator availability.

Automated Checks

  • python3 areal/tools/validate_installation.py: Basic check for core dependencies.
  • areal/tools/check_pyproject_consistency.py: Ensures consistency between pyproject.toml and its vLLM variant. It identifies "escapable" packages like torch or vllm that are allowed to differ (areal/tools/check_pyproject_consistency.py:1-46).
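To make the idea of "escapable" packages concrete, here is a simplified sketch of such a consistency check. It is not the code in areal/tools/check_pyproject_consistency.py: the function names, the allow-list contents beyond torch and vllm, and the sample version pins are all hypothetical. It compares two lists of requirement strings and reports packages whose pins differ, skipping the allow-listed ones.

```python
import re

# Hypothetical allow-list; this page names torch and vllm as examples.
ESCAPABLE = {"torch", "vllm"}

_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def package_name(requirement):
    """Extract the normalized distribution name from a requirement string."""
    match = _NAME.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def find_mismatches(base, variant, escapable=ESCAPABLE):
    """Return names whose pins differ between two lists, ignoring escapable ones."""
    base_map = {package_name(r): r.strip() for r in base}
    variant_map = {package_name(r): r.strip() for r in variant}
    names = (base_map.keys() | variant_map.keys()) - set(escapable)
    return sorted(n for n in names if base_map.get(n) != variant_map.get(n))


# Made-up pins for illustration only.
default_deps = ["torch==1", "numpy>=1.26", "ray==2.0"]
vllm_deps = ["torch==2", "numpy>=1.26", "ray==2.1"]
print(find_mismatches(default_deps, vllm_deps))
```

In this example torch differs but is ignored because it is allow-listed, while the differing ray pin is reported.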

Hardware Check

Ensure your GPU is visible to PyTorch:
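A minimal sketch of such a check, using only standard torch.cuda calls, is shown below. The helper name describe_accelerators is hypothetical, and the function covers NVIDIA GPUs only; Ascend NPUs are exposed through a separate PyTorch plugin and are not queried here.

```python
def describe_accelerators():
    """Summarize the PyTorch build and any CUDA devices visible to it."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__}: no CUDA device visible"
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}: " + ", ".join(names)


print(describe_accelerators())
```

On a correctly configured node the output should list every GPU; an empty or partial list usually points to a driver or container-runtime issue rather than to AReaL itself.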


Sources: docs/en/tutorial/installation.md:180-195, areal/tools/check_pyproject_consistency.py:1-46

Versioning and Metadata

The system version and project metadata are managed through pyproject.toml.

| Metadata Field | Code Reference | Description |
| --- | --- | --- |
| version | pyproject.toml:11 | Semantic version (e.g., 1.0.4) |
| requires-python | pyproject.toml:10 | Version constraint >=3.11, <3.13 |
| dependencies | pyproject.toml:43-139 | Core project requirements |

Sources: pyproject.toml:5-139