URL: https://deepwiki.com/inclusionAI/AReaL/2.2-cli-arguments-reference

Last indexed: 7 May 2026 (2e12c1)

CLI Arguments Reference

This page provides a comprehensive reference for all configuration parameters available in AReaL's command-line interface. These parameters are defined using Python dataclasses and can be specified in YAML configuration files or overridden via command-line arguments using dot notation.

Purpose and Scope

The AReaL configuration system is designed to be hierarchical and type-safe. This reference covers:

  • Command-line invocation patterns and override syntax.
  • Dataclass hierarchy and how CLI arguments map to internal code entities.
  • Detailed field references for core experiment types, training engines, and inference backends.
  • Validation mechanisms and common configuration scenarios.

CLI Invocation Patterns

Basic Usage

Experiments are typically launched through experiment scripts that use the areal.api.cli_args utilities to parse configurations. A base configuration file must be supplied via the --config flag (docs/en/cli_reference.md:9-13).

Configuration Override Syntax

CLI arguments can override any field in the configuration hierarchy using dot notation. Overrides are processed by Hydra and OmegaConf before the merged result is instantiated into dataclasses (areal/api/cli_args.py:14-17).
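
As a rough illustration of what a dot-notation override does, the sketch below applies key=value strings to a nested dictionary using only the standard library. It is not AReaL's implementation, which relies on Hydra and OmegaConf; the helper names parse_scalar and apply_overrides are hypothetical, and the keys actor.optimizer.lr and gconfig.n_samples are taken from the field tables on this page.

```python
from typing import Any


def parse_scalar(text: str) -> Any:
    """Best-effort conversion of a CLI string to bool, None, int, float, or str."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered in ("null", "none"):
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Apply 'a.b.c=value' overrides in place to a nested dict and return it."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Create intermediate sections that the base config lacks.
            node = node.setdefault(name, {})
        node[leaf] = parse_scalar(raw_value)
    return config


# Hypothetical base config, standing in for a parsed YAML file.
base = {"experiment_name": "demo", "actor": {"optimizer": {"lr": 1e-5}}}
merged = apply_overrides(base, ["actor.optimizer.lr=2e-5", "gconfig.n_samples=4"])
print(merged)
```

Unlike this sketch, a typed configuration system can reject unknown keys and mismatched types at parse time, which is the main reason to route overrides through structured dataclasses.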


Sources: areal/api/cli_args.py:14-17, docs/en/cli_reference.md:15-19, docs/generate_cli_docs.py:143-183

Dataclass Hierarchy and Mapping

CLI arguments map from the natural-language CLI space to the code entity space: each argument is routed to a specific class in areal.api.cli_args.

[Figure: Configuration Mapping Diagram]

Sources: areal/api/cli_args.py:42-160, docs/generate_cli_docs.py:53-92, docs/en/cli_reference.md:93-110

Core Experiment Configurations

The root configuration object is usually a subclass of BaseExperimentConfig (docs/en/cli_reference.md:93-97).

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| experiment_name | str | Required | Name of the experiment (no '_' or '/') (docs/en/cli_reference.md:101) |
| trial_name | str | Required | Name of the trial (no '-' or '/') (docs/en/cli_reference.md:102) |
| seed | int | 1 | Random seed for reproducibility (docs/en/cli_reference.md:105) |
| enable_offload | bool | false | Enable training offload using torch_memory_saver (docs/en/cli_reference.md:106) |
| total_train_epochs | int | 1 | Total number of epochs to train (docs/en/cli_reference.md:107) |

Sources: areal/api/cli_args.py:42-160, docs/en/cli_reference.md:99-110
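
To make the field table concrete, here is a minimal sketch of a dataclass carrying the same fields and defaults, with the naming rules from the descriptions enforced at construction time. The class name ExperimentConfigSketch is hypothetical, and whether AReaL validates these names in the dataclass itself or elsewhere is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ExperimentConfigSketch:
    """Hypothetical stand-in mirroring the fields listed in the table above."""

    experiment_name: str
    trial_name: str
    seed: int = 1
    enable_offload: bool = False
    total_train_epochs: int = 1

    def __post_init__(self) -> None:
        # Naming rules as stated in the field descriptions.
        for ch in ("_", "/"):
            if ch in self.experiment_name:
                raise ValueError(f"experiment_name must not contain {ch!r}")
        for ch in ("-", "/"):
            if ch in self.trial_name:
                raise ValueError(f"trial_name must not contain {ch!r}")


cfg = ExperimentConfigSketch(experiment_name="gsm8k-grpo", trial_name="trial0")
print(cfg)
```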

Training Engine Configurations (TrainEngineConfig)

The TrainEngineConfig class is used for actor, critic, and ref (reference) models. It defines the compute backend and parallelism strategy.

Architecture-to-Code Mapping

Each training backend is configured within the TrainEngineConfig entity and implemented in the engine layer. The sources cited for this mapping include the FSDP, Megatron, and experimental Archon engine modules.

Sources: docs/en/cli_reference.md:33-42, docs/generate_cli_docs.py:60-69, areal/engine/fsdp_engine.py:218-220, areal/engine/megatron_engine.py:168-170, areal/experimental/engine/archon_engine.py:147-152

Optimizer Configuration (OptimizerConfig)

Nested under actor.optimizer.*. This defines the optimization parameters used by training engines.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| lr | float | Required | Learning rate for the optimizer (docs/en/cli_reference.md:18) |
| weight_decay | float | 0.0 | Weight decay coefficient. |
| lr_scheduler_type | str | "cosine" | Type of learning rate scheduler. |

Sources: areal/api/cli_args.py:42-160, docs/generate_cli_docs.py:64

Micro-Batch Specification (MicroBatchSpec)

Nested under actor.mb_spec.*. This controls how the global batch is split into micro-batches for forward and backward passes (areal/api/cli_args.py:100-101).

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_mbs | int | 1 | Number of micro-batches; treated as a minimum when max_tokens_per_mb is set (areal/api/cli_args.py:103-108) |
| max_tokens_per_mb | int | None | Maximum number of tokens per micro-batch in a forward pass (areal/api/cli_args.py:115-120) |
| packing_algorithm | str | "ffd" | Sequence packing algorithm: 'ffd' (First Fit Decreasing) or 'kk' (Karmarkar-Karp) (areal/api/cli_args.py:127-139) |

Sources: areal/api/cli_args.py:100-140, areal/api/cli_args.py:142-146
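
The 'ffd' option refers to First Fit Decreasing, a standard bin-packing heuristic. The sketch below shows the textbook version applied to sequence lengths under a token budget; the function name ffd_pack is hypothetical, the handling of n_mbs as a minimum is deliberately left out, and AReaL's own packing code may differ in details such as tie-breaking.

```python
def ffd_pack(seq_lens: list[int], max_tokens_per_mb: int) -> list[list[int]]:
    """Group sequence indices into micro-batches using First Fit Decreasing."""
    bins: list[list[int]] = []  # sequence indices per micro-batch
    loads: list[int] = []       # running token count per micro-batch
    # Visit sequences from longest to shortest.
    order = sorted(range(len(seq_lens)), key=lambda i: seq_lens[i], reverse=True)
    for idx in order:
        length = seq_lens[idx]
        if length > max_tokens_per_mb:
            raise ValueError(f"sequence {idx} exceeds the token budget")
        # Place the sequence in the first micro-batch with enough room.
        for b, load in enumerate(loads):
            if load + length <= max_tokens_per_mb:
                bins[b].append(idx)
                loads[b] += length
                break
        else:
            # No existing micro-batch fits, so open a new one.
            bins.append([idx])
            loads.append(length)
    return bins


print(ffd_pack([5, 3, 8, 2, 7], max_tokens_per_mb=10))
# prints [[2, 3], [4, 1], [0]]
```

Sorting by decreasing length tends to fill micro-batches more tightly than packing in arrival order, because the short sequences are left to plug the remaining gaps.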

Inference and Generation Parameters

Generation Hyperparameters (GenerationHyperparameters)

Nested under gconfig.*. These parameters control rollout behavior during training (areal/api/cli_args.py:164-165).

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| n_samples | int | 1 | Number of sequences generated per prompt (areal/api/cli_args.py:167-169) |
| max_new_tokens | int | 16384 | Maximum number of tokens to generate (areal/api/cli_args.py:170-172) |
| temperature | float | 1.0 | Sampling temperature (areal/api/cli_args.py:194-197) |
| top_p | float | 1.0 | Nucleus sampling probability threshold (areal/api/cli_args.py:186-189) |
| greedy | bool | false | Use greedy decoding, always selecting the highest-probability token (areal/api/cli_args.py:182-185) |

Sources: areal/api/cli_args.py:164-210
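
The interaction between greedy, temperature, and top_p can be sketched with a generic single-token sampler. This is the common textbook formulation, not the code of any inference backend used by AReaL; the function name sample_token and its rng parameter are hypothetical.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_p=1.0, greedy=False, rng=None):
    """Pick one token index from raw logits (illustrative only)."""
    if greedy:
        # Greedy decoding ignores temperature and top_p.
        return max(range(len(logits)), key=lambda i: logits[i])
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    rng = rng or random.Random()
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Nucleus filter: keep the smallest top-ranked set with mass >= top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # random.choices renormalizes the kept probabilities internally.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


print(sample_token([0.1, 2.0, -1.0], greedy=True))
```

With temperature=1.0 and top_p=1.0, the defaults listed above, the sampler draws from the unmodified softmax distribution.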

LoRA Configurations

LoRA (Low-Rank Adaptation) can be enabled to reduce memory pressure during training.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| use_lora | bool | false | Enables LoRA fine-tuning mode. |
| lora_rank | int | 16 | Rank of the low-rank adapters. |
| lora_alpha | int | 32 | LoRA scaling factor. |
| target_modules | list | ["all-linear"] | Submodules receiving LoRA adapters. |

Sources: areal/engine/megatron_utils/megatron_lora.py:73-74, areal/engine/fsdp_engine.py:20-25
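
For intuition on how lora_rank and lora_alpha affect an adapted layer, the sketch below computes the conventional LoRA scaling factor (alpha divided by rank) and the number of extra trainable parameters for one linear layer. It assumes the standard LoRA formulation; the helper names and the 4096 x 4096 layer size are hypothetical and not taken from AReaL.

```python
def lora_scaling(lora_rank: int, lora_alpha: int) -> float:
    """Conventional LoRA scaling applied to the low-rank update."""
    return lora_alpha / lora_rank


def lora_extra_params(in_features: int, out_features: int, lora_rank: int) -> int:
    """Trainable parameters added to one linear layer.

    A has shape (lora_rank, in_features); B has shape (out_features, lora_rank).
    """
    return lora_rank * (in_features + out_features)


# Defaults from the table above, applied to a hypothetical 4096 x 4096 layer.
rank, alpha = 16, 32
frozen = 4096 * 4096
extra = lora_extra_params(4096, 4096, rank)
print(lora_scaling(rank, alpha))  # prints 2.0
print(extra, extra / frozen)      # prints 131072 0.0078125
```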

Logging and Monitoring (StatsLoggerConfig)

The StatsLoggerConfig manages integration with experiment tracking backends.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| wandb.mode | str | "disabled" | Weights & Biases mode (for example, online or disabled). |
| swanlab.mode | str | "disabled" | SwanLab tracking mode. |
| trackio.mode | str | "disabled" | Trackio tracking mode (tests/test_trackio_backend.py:17-18) |
| tensorboard.path | str | None | Path for TensorBoard logs. |

Sources: docs/generate_cli_docs.py:83-92, tests/test_trackio_backend.py:6-38

Normalization Settings (NormConfig)

Used for reward and advantage normalization in RL algorithms (areal/api/cli_args.py:43-44).

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mean_level | str | "batch" | Mean level: batch, group, or None (areal/api/cli_args.py:46-52) |
| std_level | str | "batch" | Std level: batch, group, or None (areal/api/cli_args.py:57-63) |
| std_unbiased | bool | true | Use unbiased standard deviation (areal/api/cli_args.py:64-69) |
| eps | float | 1e-5 | Epsilon for numerical stability (areal/api/cli_args.py:70-75) |

Sources: areal/api/cli_args.py:43-97
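
The difference between batch-level and group-level statistics is easiest to see on a small example. The sketch below is one plausible reading of these fields, not AReaL's implementation: the function name normalize and the group_size argument are hypothetical, groups are assumed to be contiguous, and adding eps to the standard deviation (rather than to the variance) is an assumption.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _std(xs, unbiased):
    # Unbiased uses n - 1 in the denominator; fall back to 0.0 if undefined.
    denom = len(xs) - 1 if unbiased else len(xs)
    if denom <= 0:
        return 0.0
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / denom)


def normalize(values, group_size, mean_level="batch", std_level="batch",
              std_unbiased=True, eps=1e-5):
    """Center and scale values using batch-level or group-level statistics."""
    if len(values) % group_size != 0:
        raise ValueError("len(values) must be a multiple of group_size")

    def scope(level, i):
        # Return the values over which a statistic is computed for item i.
        if level == "batch":
            return values
        if level == "group":
            start = (i // group_size) * group_size
            return values[start:start + group_size]
        raise ValueError(f"unknown level: {level!r}")

    out = []
    for i, x in enumerate(values):
        centered = x - (_mean(scope(mean_level, i)) if mean_level is not None else 0.0)
        if std_level is None:
            out.append(centered)
        else:
            out.append(centered / (_std(scope(std_level, i), std_unbiased) + eps))
    return out


rewards = [1.0, 2.0, 3.0, 4.0]  # two prompts, two samples each
print(normalize(rewards, group_size=2, mean_level="group", std_level=None))
# prints [-0.5, 0.5, -0.5, 0.5]
```

Group-level centering removes per-prompt difficulty from the signal, since each sample is compared only against other samples for the same prompt.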