
URL: https://deepwiki.com/inclusionAI/AReaL/12.1-project-structure

Project Structure | inclusionAI/AReaL | DeepWiki


Last indexed: 7 May 2026 (2e12c1)

Project Structure

This page describes the physical organization of the AReaL repository: directory layout, package structure, build system configuration, and development tooling. For the logical system architecture and component interactions, see Architecture Overview.

Repository Overview

The AReaL repository follows a flat layout with the core Python package at the top level, alongside configuration files, documentation, examples, and build artifacts.

High-Level Directory Structure

Title: Repository Root Layout


Top-Level Organization

| Path | Purpose | Key Contents |
|---|---|---|
| areal/ | Core Python package | API definitions, engines, trainers, workflows, utilities |
| examples/ | Training scripts and launcher recipes | GSM8K, VLM, Agent workflows, Scaffolding integration |
| docs/ | Jupyter Book documentation source | English (en/) and Chinese (zh/) tutorials |
| tests/ | Unit and integration test suite | pytest infrastructure and torchrun entry points |
| pyproject.toml | Package metadata and dependencies | Project config, dependency groups, SGLang default |
| pyproject.vllm.toml | vLLM variant configuration | Alternative configuration for vLLM-based stacks |
| uv.lock / uv.vllm.lock | Deterministic dependency resolution | Platform-specific markers for Linux/macOS/NPU |
| Dockerfile | Runtime environment definition | Multi-stage build for distributed GPU/NPU execution |

Sources: AGENTS.md:60-76, CLAUDE.md:10-26, CONTRIBUTING.md:75-78, pyproject.toml:5-139, pyproject.vllm.toml:21-156
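
The layout in the table above can be checked mechanically. The following is a minimal sketch, assuming only the top-level entries listed in the table; the names EXPECTED_ENTRIES and missing_entries are hypothetical helpers written for this illustration and are not part of the AReaL codebase.

```python
import tempfile
from pathlib import Path

# Top-level entries taken from the table above.
# True marks a directory, False marks a regular file.
EXPECTED_ENTRIES = {
    "areal": True,
    "examples": True,
    "docs": True,
    "tests": True,
    "pyproject.toml": False,
    "pyproject.vllm.toml": False,
    "uv.lock": False,
    "uv.vllm.lock": False,
    "Dockerfile": False,
}


def missing_entries(repo_root: Path) -> list[str]:
    """Return expected entries that are absent or of the wrong kind."""
    missing = []
    for name, is_dir in EXPECTED_ENTRIES.items():
        path = repo_root / name
        present = path.is_dir() if is_dir else path.is_file()
        if not present:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Demonstrate against a throwaway directory instead of a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "areal").mkdir()
        (root / "pyproject.toml").write_text("")
        print(missing_entries(root))
```

Pointing missing_entries at a real checkout (for example Path(".")) would list anything from the table that is not found there.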

Core Package Structure (areal/)

The areal/ package contains all runtime code organized into functional subsystems. This modular structure enables independent development of engines, workflows, and datasets while maintaining clear API boundaries.

Title: AReaL Package Subsystems


Subsystem Breakdown

areal/api/ - Contracts and Configs

Defines the configuration dataclasses (e.g., GRPOConfig) and the abstract interfaces (contracts) that engines and workflows implement. The dataclasses are the source of truth for CLI arguments, which are defined in areal/api/cli_args.py. Sources: CLAUDE.md:13, AGENTS.md:49, AGENTS.md:86
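
To make the idea of dataclasses acting as the source of truth for CLI arguments concrete, here is a generic sketch that derives an argparse parser from a dataclass and pairs it with an abstract engine contract. ExampleTrainConfig, ExampleEngine, build_parser, and parse_config are hypothetical names invented for this illustration; they are not AReaL's classes, and the real cli_args.py may be organized quite differently.

```python
import argparse
from abc import ABC, abstractmethod
from dataclasses import dataclass, fields


@dataclass
class ExampleTrainConfig:
    """Hypothetical config; the fields are NOT those of AReaL's GRPOConfig."""

    learning_rate: float = 1e-5
    batch_size: int = 8
    experiment_name: str = "demo"


class ExampleEngine(ABC):
    """Hypothetical contract that a concrete engine would implement."""

    @abstractmethod
    def train_step(self, batch: list[float]) -> float:
        """Run one optimization step and return a loss value."""


def build_parser(config_cls) -> argparse.ArgumentParser:
    """Create one --flag per dataclass field, typed and defaulted from it."""
    parser = argparse.ArgumentParser()
    for f in fields(config_cls):
        parser.add_argument(f"--{f.name}", type=f.type, default=f.default)
    return parser


def parse_config(config_cls, argv: list[str]):
    """Parse argv and return a populated config instance."""
    namespace = build_parser(config_cls).parse_args(argv)
    return config_cls(**vars(namespace))


if __name__ == "__main__":
    print(parse_config(ExampleTrainConfig, ["--batch_size", "16"]))
```

Because the parser is generated from the dataclass, adding a field automatically adds the matching flag, which is the property "source of truth" usually refers to.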

areal/engine/ - Execution Backends

Implements the training and inference adapters. It includes fsdp_utils/ for FSDP2-specific logic (checkpoint, parallel mesh) and megatron_utils/ for Megatron/FP8 utilities. Sources: CLAUDE.md:14-17, AGENTS.md:64

areal/infra/ - Orchestration Infrastructure

Manages the lifecycle of distributed jobs. It includes launcher/ and scheduler/ for local, Ray, and Slurm execution, as well as the RPC system for remote engine invocation. It also houses the data_service/ for distributed data loading. Sources: CLAUDE.md:18-19, AGENTS.md:66

areal/experimental/ - Prototype Systems

Houses experimental features, including the Archon MoE engine and microservice-based architectures for inference_service and agent_service. Sources: CLAUDE.md:103, AGENTS.md:65

Package Entry Points

Title: From Script to Execution Engine


Sources: AGENTS.md:84-89, CLAUDE.md:100-108, CONTRIBUTING.md:128-135

Build System and Configuration

Package Management with uv

AReaL uses uv for deterministic dependency management. Because the SGLang and vLLM backends pin incompatible versions of torch and torchao, the project maintains two lockfiles, one per backend.

Installation Workflows:

  • SGLang (Default): uv sync --extra cuda (AGENTS.md:11, CLAUDE.md:42)
  • vLLM Variant: cp pyproject.vllm.toml pyproject.toml && cp uv.vllm.lock uv.lock && uv sync --extra cuda (AGENTS.md:11, CLAUDE.md:43)
  • NPU Support: Stable support for Ascend NPU devices is maintained in the ascend branch. (README.md:82-87)

Sources: CLAUDE.md:39-44, AGENTS.md:9-15, pyproject.toml:1-4, pyproject.vllm.toml:1-15
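
The two installation workflows differ only in whether the vLLM project file and lockfile are copied over the defaults before syncing. The sketch below encodes that difference as data; the commands themselves are the ones listed above, while install_commands and render are hypothetical helper names used only for this illustration. Nothing is executed, so it is safe to run anywhere.

```python
import shlex

# The sync step shared by both variants, as given in the workflows above.
SYNC = ["uv", "sync", "--extra", "cuda"]


def install_commands(variant: str) -> list[list[str]]:
    """Return the command sequence for the given inference backend."""
    if variant == "sglang":
        # Default: pyproject.toml and uv.lock already target SGLang.
        return [list(SYNC)]
    if variant == "vllm":
        # Overwrite the default project file and lockfile, then sync.
        return [
            ["cp", "pyproject.vllm.toml", "pyproject.toml"],
            ["cp", "uv.vllm.lock", "uv.lock"],
            list(SYNC),
        ]
    raise ValueError(f"unknown variant: {variant!r}")


def render(variant: str) -> str:
    """Join the commands into a single shell line (dry run, no execution)."""
    return " && ".join(shlex.join(cmd) for cmd in install_commands(variant))


if __name__ == "__main__":
    for name in ("sglang", "vllm"):
        print(f"{name}: {render(name)}")
```

Note that the vLLM workflow overwrites pyproject.toml and uv.lock in the working tree, so switching back to SGLang means restoring those two files first.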

Development Workflow and Quality Tools

The project enforces strict code quality through pre-commit hooks and automated CI/CD.

| Tool | Purpose | Files Targeted |
|---|---|---|
| Ruff | Python linting and formatting | areal/, tests/, examples/ |
| Clang-format | C++/CUDA formatting | .cu, .cuh, .cpp |
| Mdformat | Markdown formatting | docs/, *.md |
| Conventional Commits | Commit message validation | Git commit messages |

Sources: AGENTS.md:14-15, CONTRIBUTING.md:32-39, CLAUDE.md:47-50
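
As a rough illustration of how the table maps files to tools, the sketch below routes a path to the formatter or linter that would apply, and checks a commit subject against a simplified Conventional Commits pattern. tools_for and is_conventional are hypothetical helpers, the restriction of Ruff to .py files is an assumption, and the authoritative rules live in the repository's pre-commit configuration, which may differ.

```python
import re
from pathlib import PurePosixPath

# Directories and suffixes taken from the table above.
RUFF_DIRS = {"areal", "tests", "examples"}
CLANG_SUFFIXES = {".cu", ".cuh", ".cpp"}

# Simplified pattern: type, optional (scope), optional "!", then ": subject".
# The real hook may restrict the allowed types; this sketch does not.
CONVENTIONAL = re.compile(r"^[a-z]+(\([^)]+\))?!?: \S.*")


def tools_for(path: str) -> list[str]:
    """Return the tools from the table that would target this path."""
    p = PurePosixPath(path)
    tools = []
    # Assumption: Ruff only looks at .py files inside its target directories.
    if p.suffix == ".py" and p.parts and p.parts[0] in RUFF_DIRS:
        tools.append("ruff")
    if p.suffix in CLANG_SUFFIXES:
        tools.append("clang-format")
    if p.suffix == ".md":
        tools.append("mdformat")
    return tools


def is_conventional(subject: str) -> bool:
    """Check a commit subject line against the simplified pattern."""
    return CONVENTIONAL.match(subject) is not None


if __name__ == "__main__":
    for example in ("areal/api/cli_args.py", "csrc/kernel.cu", "README.md"):
        print(example, tools_for(example))
    print(is_conventional("fix(engine): handle empty batch"))
```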

CI/CD Pipeline

The GitHub Actions workflow automates the validation process:

  1. Pre-commit Checks: Runs formatting and linting on every PR. (CONTRIBUTING.md:106-112)
  2. Test Suite: Triggered by the safe-to-test label; runs on GCP with A100 GPUs. (CONTRIBUTING.md:117-124)
  3. Image Building: Multi-stage Docker builds (defined in Dockerfile) for testing and promotion to :dev tags. Supports VARIANT=sglang and VARIANT=vllm. (Dockerfile:1-15, CONTRIBUTING.md:166-205)

Sources: CONTRIBUTING.md:104-205, Dockerfile:1-15
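
The gating described in the steps above can be sketched as a small decision function. This is an illustrative model only: planned_jobs and image_build_commands are hypothetical names, the job names are made up, and passing VARIANT through docker build --build-arg is an assumption about how the Dockerfile consumes it. The actual GitHub Actions workflow files are the authority.

```python
# Image variants named in step 3 above.
IMAGE_VARIANTS = ("sglang", "vllm")

# Label that gates the GPU test suite, as described in step 2.
TEST_LABEL = "safe-to-test"


def planned_jobs(pr_labels: set[str]) -> list[str]:
    """Return which (hypothetically named) jobs would run for a PR."""
    jobs = ["pre-commit"]  # runs on every PR
    if TEST_LABEL in pr_labels:
        jobs.append("test-suite")  # only after the label is applied
    return jobs


def image_build_commands() -> list[list[str]]:
    """Build one docker command per variant.

    Assumption: VARIANT is a Docker build argument; the source only
    states that VARIANT=sglang and VARIANT=vllm are supported.
    """
    return [
        ["docker", "build", "--build-arg", f"VARIANT={variant}", "."]
        for variant in IMAGE_VARIANTS
    ]


if __name__ == "__main__":
    print(planned_jobs(set()))
    print(planned_jobs({"safe-to-test"}))
    for cmd in image_build_commands():
        print(" ".join(cmd))
```

Gating the GPU suite behind a label keeps expensive A100 runs from starting automatically on unreviewed PRs.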


This page covered the physical repository structure, package organization, and the build system. For the logic of the RL algorithms, see Algorithm Overview. For infrastructure details, see Scheduler API.