
URL: https://deepwiki.com/inclusionAI/AReaL/2.1-configuration-overview

Configuration Overview | inclusionAI/AReaL | DeepWiki


Last indexed: 7 May 2026 (2e12c1)

Configuration Overview

This page explains AReaL's configuration architecture: how configurations are defined, loaded, composed, and distributed across the system. For detailed parameter references, see 2.2 CLI Arguments Reference. For specific configuration types, see 2.4 Training Engine Configurations and 2.5 Inference Engine Configurations.

Configuration Architecture

AReaL uses a dataclass-based configuration system that combines YAML files with command-line overrides through Hydra and OmegaConf. All configuration parameters are defined as Python dataclasses in areal/api/cli_args.py, which provides type safety, auto-documentation, and IDE support (areal/api/cli_args.py:42-1130).

Key Design Principles

  1. Single Source of Truth: All configuration parameters are defined as dataclass fields with metadata including help text, default values, and validation constraints (areal/api/cli_args.py:42-1130).
  2. Hierarchical Composition: Complex configurations are composed from simpler configuration dataclasses. For example, PPOConfig contains PPOActorConfig, InferenceEngineConfig, and GenerationHyperparameters (areal/api/cli_args.py:2319-2367).
  3. YAML + CLI Overrides: Base configurations are stored in YAML files, and individual values are overridden on the command line using Hydra's dot-notation syntax, e.g., actor.lr=1e-4 (areal/api/cli_args.py:12-15).
  4. Type Safety: OmegaConf validates types at runtime and provides structured access to configuration values (areal/api/cli_args.py:15-17).
  5. Auto-Documentation: The system uses structured metadata in dataclasses to generate CLI reference documentation (docs/en/cli_reference.md:1-21).
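
As a rough illustration of principles 2 through 4, the sketch below applies a Hydra-style dot-notation override to a nested dataclass using only the Python standard library. It is not AReaL's loader, which relies on Hydra and OmegaConf. ExperimentSketch, ActorSketch, apply_override, and all default values are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class ActorSketch:
    # Hypothetical fields and defaults; the real actor config defines many more.
    lr: float = 2e-5
    eps_clip: float = 0.2


@dataclass
class ExperimentSketch:
    experiment_name: str = "demo"
    actor: ActorSketch = field(default_factory=ActorSketch)


def apply_override(cfg, assignment: str) -> None:
    """Apply one `dotted.key=value` override in place.

    The raw string is coerced to the type of the field's current value,
    so a malformed number raises ValueError instead of being stored.
    """
    dotted_key, raw_value = assignment.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    target = cfg
    for name in parents:
        target = getattr(target, name)  # walk down the nested dataclasses
    if not is_dataclass(target) or leaf not in {f.name for f in fields(target)}:
        raise KeyError(f"unknown config key: {dotted_key}")
    current = getattr(target, leaf)
    if isinstance(current, bool):
        # bool("False") is True in Python, so booleans are parsed explicitly.
        value = raw_value.lower() in ("1", "true", "yes")
    else:
        # Assumes every field in this sketch has a non-None default.
        value = type(current)(raw_value)
    setattr(target, leaf, value)


cfg = ExperimentSketch()
apply_override(cfg, "actor.lr=1e-4")
print(cfg.actor.lr)
```

Coercing against the existing value is a simplification: OmegaConf validates against the declared field type, which also covers optional and container fields that this sketch does not handle.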

Configuration Loading Process

Title: "Configuration Loading and Distribution Flow"


Sources: areal/api/cli_args.py:12-17, docs/en/cli_reference.md:7-21

Configuration Class Hierarchy

AReaL's configuration system is organized into a hierarchy where experiment-level configurations compose multiple specialized configuration objects.

Top-Level Experiment Configurations

Title: "Experiment Configuration Hierarchy"


Sources: areal/api/cli_args.py:2319-2454, docs/en/cli_reference.md:93-110

Configuration Category Breakdown

| Category | Base Class | Key Fields | Source |
|---|---|---|---|
| Experiment | BaseExperimentConfig | experiment_name, trial_name, allocation_mode | docs/en/cli_reference.md:93-110 |
| Training Engine | TrainEngineConfig | path, optimizer, mb_spec, fsdp, megatron, archon | areal/api/cli_args.py:723-826 |
| PPO Actor | PPOActorConfig | eps_clip, kl_ctl, adv_norm, reward_norm | areal/api/cli_args.py:2319 |
| Inference Engine | InferenceEngineConfig | backend, n_workers, max_head_offpolicyness | areal/api/cli_args.py:1016-1082 |
| Generation | GenerationHyperparameters | n_samples, max_new_tokens, temperature, top_p | areal/api/cli_args.py:164-210 |
| Optimization | OptimizerConfig | lr, weight_decay, lr_scheduler_type | areal/api/cli_args.py:287-355 |
| Normalization | NormConfig | mean_level, std_level, group_size | areal/api/cli_args.py:42-97 |
| Micro-batching | MicroBatchSpec | n_mbs, max_tokens_per_mb, packing_algorithm | areal/api/cli_args.py:99-161 |

Sources: areal/api/cli_args.py:42-1082, docs/en/cli_reference.md:93-110

Configuration Flow Through the System

The following diagram shows how configuration objects are instantiated and passed through AReaL's execution pipeline, associating high-level names with code entities:

Title: "Configuration Distribution and Code Entities"


Sources: docs/en/cli_reference.md:7-21, areal/engine/fsdp_engine.py:219-240, areal/engine/megatron_engine.py:168-190, areal/experimental/engine/archon_engine.py:150-200

Configuration Usage in Code

Example: Training Engine Initialization

The system uses structured configuration to initialize training backends. For example, FSDPEngine consumes TrainEngineConfig during initialization and sets up distributed meshes based on parallelism settings (areal/engine/fsdp_engine.py:219-240). Similarly, MegatronEngine initializes its TransformerConfig based on MegatronEngineConfig (areal/engine/megatron_engine.py:168-190). ArchonEngine also uses TrainEngineConfig to initialize its model spec and parallelism dimensions (areal/experimental/engine/archon_engine.py:150-200).

Key Configuration Methods:

| Method | Purpose | Config Used | Source |
|---|---|---|---|
| __post_init__ | Validate configuration values | NormConfig, MicroBatchSpec | areal/api/cli_args.py:80-97, areal/api/cli_args.py:141-148 |
| new() | Update micro-batch spec fields | MicroBatchSpec | areal/api/cli_args.py:149-160 |

Sources: areal/api/cli_args.py:80-160
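
To make the two methods concrete, here is a minimal sketch of a micro-batch spec that validates itself in __post_init__ and exposes a new() helper for deriving an updated copy. The field names come from the category table on this page; the class name MicroBatchSpecSketch, the defaults, the validation rules, the "ffd"/"kk" string values, and the signature of new() are assumptions made for illustration and may differ from areal/api/cli_args.py.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MicroBatchSpecSketch:
    # Defaults and allowed values below are assumptions of this sketch.
    n_mbs: int = 1
    max_tokens_per_mb: Optional[int] = None
    packing_algorithm: str = "ffd"

    def __post_init__(self) -> None:
        # Runs right after the generated __init__, so invalid values are
        # rejected at construction time rather than deep inside training.
        if self.n_mbs < 1:
            raise ValueError(f"n_mbs must be >= 1, got {self.n_mbs}")
        if self.max_tokens_per_mb is not None and self.max_tokens_per_mb < 1:
            raise ValueError(
                f"max_tokens_per_mb must be positive, got {self.max_tokens_per_mb}"
            )
        if self.packing_algorithm not in ("ffd", "kk"):
            raise ValueError(f"unknown packing_algorithm: {self.packing_algorithm!r}")

    @classmethod
    def new(cls, spec: "MicroBatchSpecSketch", **updates) -> "MicroBatchSpecSketch":
        """Return a copy of `spec` with selected fields replaced.

        Building a fresh instance re-runs __post_init__, so the updated
        values are validated as well.
        """
        values = {f.name: getattr(spec, f.name) for f in fields(spec)}
        values.update(updates)
        return cls(**values)


base = MicroBatchSpecSketch(n_mbs=2)
larger = MicroBatchSpecSketch.new(base, max_tokens_per_mb=4096)
print(larger)
```

Returning a new object instead of mutating the original keeps a shared base spec safe to reuse across components that each need slightly different limits.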

Configuration Dataclass Structure

Common Dataclass Patterns

All configuration dataclasses in areal/api/cli_args.py follow consistent patterns:

Example: OptimizerConfig

OptimizerConfig demonstrates the standard patterns for defining hyperparameters with CLI metadata (areal/api/cli_args.py:287-355).
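
The general shape of such a dataclass can be sketched as follows. This is not the OptimizerConfig source: the field names are taken from the category table on this page, while the class name OptimizerConfigSketch, the default values, the help strings, the "help" and "choices" metadata keys, and the describe() helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class OptimizerConfigSketch:
    # Each field carries its default plus metadata that tooling can read.
    lr: float = field(default=2e-5, metadata={"help": "Learning rate."})
    weight_decay: float = field(
        default=0.05, metadata={"help": "Weight decay coefficient."}
    )
    lr_scheduler_type: str = field(
        default="cosine",
        metadata={
            "help": "Learning rate schedule.",
            "choices": ["linear", "cosine", "constant"],
        },
    )


def describe(config_cls) -> list:
    """Render one reference line per field from the dataclass metadata."""
    lines = []
    for f in fields(config_cls):
        type_name = getattr(f.type, "__name__", str(f.type))
        choices = f.metadata.get("choices")
        suffix = f" Choices: {', '.join(choices)}." if choices else ""
        lines.append(
            f"{f.name} ({type_name}, default {f.default!r}): "
            f"{f.metadata['help']}{suffix}"
        )
    return lines


for line in describe(OptimizerConfigSketch):
    print(line)
```

Because the help text lives next to the field definition, a documentation generator and a CLI parser can read the same metadata, which is the idea behind the single source of truth principle described earlier on this page.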


Key Patterns:

Sources: areal/api/cli_args.py:42-97, areal/api/cli_args.py:287-355

Training Engine Configuration

The TrainEngineConfig class serves as the unified entry point for all training backends: FSDP, Megatron, and Archon (areal/api/cli_args.py:723-826).

Key Configuration Fields:

| Field | Type | Purpose |
|---|---|---|
| path | str | HuggingFace checkpoint location or local path |
| optimizer | OptimizerConfig | Optimization parameters (LR, weight decay, etc.) |
| mb_spec | MicroBatchSpec | Micro-batch splitting strategy (areal/api/cli_args.py:99-161) |
| fsdp | FSDPEngineConfig | FSDP2-specific settings (sharding, offload) |
| megatron | MegatronEngineConfig | Megatron-Core specific settings (parallelism) |
| archon | ArchonEngineConfig | Archon (torch-native) settings for MoE/PP |

Sources: areal/api/cli_args.py:723-826
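
The nesting in this table maps naturally onto a nested YAML mapping. The sketch below shows one way a parsed YAML document (represented here as a plain dict) could populate a composed dataclass, with omitted sections falling back to their defaults. It uses only the standard library rather than OmegaConf, and TrainEngineSketch, OptimizerSketch, FSDPSketch, the offload_params field, the example path, and from_dict are hypothetical names for this example only.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class OptimizerSketch:
    lr: float = 2e-5
    weight_decay: float = 0.05


@dataclass
class FSDPSketch:
    offload_params: bool = False  # hypothetical backend-specific field


@dataclass
class TrainEngineSketch:
    path: str = ""
    optimizer: OptimizerSketch = field(default_factory=OptimizerSketch)
    fsdp: FSDPSketch = field(default_factory=FSDPSketch)


def from_dict(cls, data: dict):
    """Recursively build a dataclass from a nested dict, rejecting unknown keys."""
    known = {f.name: f for f in fields(cls)}
    unknown = set(data) - set(known)
    if unknown:
        raise KeyError(f"unknown keys for {cls.__name__}: {sorted(unknown)}")
    kwargs = {}
    for name, value in data.items():
        field_type = known[name].type
        if is_dataclass(field_type) and isinstance(value, dict):
            # Nested section: recurse into the sub-configuration.
            kwargs[name] = from_dict(field_type, value)
        else:
            kwargs[name] = value
    # Keys missing from `data` keep the dataclass defaults.
    return cls(**kwargs)


parsed_yaml = {"path": "some/model", "optimizer": {"lr": 1e-4}}
engine_cfg = from_dict(TrainEngineSketch, parsed_yaml)
print(engine_cfg)
```

Rejecting unknown keys catches misspelled YAML entries early, at the cost of requiring every accepted option to be declared as a dataclass field.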

Micro-Batching and Sequence Packing

MicroBatchSpec defines how data is partitioned into micro-batches during training steps. A critical field is packing_algorithm (areal/api/cli_args.py:127-140), which selects how sequences are packed:

  • FFD (First Fit Decreasing): The default algorithm for sequence packing (areal/api/cli_args.py:128).
  • KK (Karmarkar-Karp): Recommended for better load balancing across Data Parallel (DP) ranks, especially with variable-length sequences (areal/api/cli_args.py:133-136).

Sources: areal/api/cli_args.py:99-161
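
For readers unfamiliar with the two algorithms, the following sketch implements textbook versions of both over a list of sequence lengths. It is not AReaL's packing code: the function names pack_ffd and partition_kk, the choice to raise on oversized sequences, and the idea that FFD is driven by a token cap while KK is driven by a fixed group count are assumptions of this example.

```python
import heapq


def pack_ffd(seq_lens: list, max_tokens_per_mb: int) -> list:
    """First Fit Decreasing: place each sequence, longest first,
    into the first micro-batch that still has room."""
    order = sorted(range(len(seq_lens)), key=lambda i: seq_lens[i], reverse=True)
    bins = []   # sequence indices per micro-batch
    loads = []  # token count per micro-batch
    for i in order:
        length = seq_lens[i]
        if length > max_tokens_per_mb:
            # Assumption of this sketch: an oversized sequence is an error.
            raise ValueError(f"sequence {i} has {length} tokens, above the cap")
        for b, load in enumerate(loads):
            if load + length <= max_tokens_per_mb:
                bins[b].append(i)
                loads[b] += length
                break
        else:
            # No existing micro-batch has room, so open a new one.
            bins.append([i])
            loads.append(length)
    return bins


def partition_kk(seq_lens: list, k: int) -> list:
    """Karmarkar-Karp differencing: split sequences into k groups
    whose token totals are as close to each other as the heuristic finds."""
    if k < 1:
        raise ValueError("k must be >= 1")
    if not seq_lens:
        return [[] for _ in range(k)]
    # Each heap entry is (-spread, tiebreak, parts), where parts is a list of
    # k (total, indices) pairs sorted by total in descending order.
    heap = []
    for idx, length in enumerate(seq_lens):
        parts = [(length, [idx])] + [(0, []) for _ in range(k - 1)]
        heap.append((-length, idx, parts))
    heapq.heapify(heap)
    tiebreak = len(seq_lens)
    while len(heap) > 1:
        # Merge the two partial solutions with the largest spread, pairing
        # the heaviest group of one with the lightest group of the other.
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        merged = [(sa + sb, ia + ib) for (sa, ia), (sb, ib) in zip(a, reversed(b))]
        merged.sort(key=lambda part: part[0], reverse=True)
        spread = merged[0][0] - merged[-1][0]
        heapq.heappush(heap, (-spread, tiebreak, merged))
        tiebreak += 1
    return [indices for _, indices in heap[0][2]]


lengths = [5, 3, 4, 2, 6]
print(pack_ffd(lengths, max_tokens_per_mb=10))
print(partition_kk(lengths, k=2))
```

FFD minimizes the number of micro-batches under a token cap without trying to equalize them, whereas KK fixes the number of groups and equalizes their totals, which is why the latter suits balancing work across DP ranks.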