
URL: https://deepwiki.com/inclusionAI/AReaL/2.6-generation-hyperparameters

Last indexed: 7 May 2026 (2e12c1)

Generation Hyperparameters

Purpose and Scope

This page documents the GenerationHyperparameters configuration class, which controls text generation behavior during inference and rollout phases in AReaL. These hyperparameters determine how the model samples tokens, when to stop generation, and how to handle special tokens.

For information about configuring inference engines themselves (SGLang, vLLM server settings), see Inference Engine Configurations. For overall configuration system concepts, see Configuration Overview.

Overview

The GenerationHyperparameters dataclass (areal/api/cli_args.py163-211) encapsulates all parameters that control token generation during inference. It serves three primary purposes:

  1. Training Rollouts: Controls generation during RL training data collection (e.g., in PPOActorConfig or GRPOActorConfig).
  2. Evaluation: Separate hyperparameters can be specified via eval_gconfig for evaluation runs.
  3. Agentic RL: Converts to OpenAI-compatible API formats for agent framework integration, via areal.api.io_struct.ModelRequest (areal/api/io_struct.py28-60).

The class is instantiated in experiment configurations (PPO, GRPO, SFT) as the gconfig parameter and, optionally, as eval_gconfig for evaluation-specific settings (examples/math/gsm8k_grpo_lora.yaml36-43).

Sources: areal/api/cli_args.py163-211 areal/api/io_struct.py28-60 examples/math/gsm8k_grpo_lora.yaml36-43

Class Structure

[Diagram: "Code Entity Space: Generation Configuration". Shows the GenerationHyperparameters class and its primary methods for interacting with the rest of the system.]


Sources: areal/api/cli_args.py163-211

Core Parameters

Sampling Parameters

These parameters control the stochastic sampling process during token generation (areal/api/cli_args.py181-196).

| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | float | 1.0 | Sampling temperature. Higher values (>1.0) increase diversity; lower values (<1.0) make outputs more deterministic (areal/api/cli_args.py193-196). |
| top_p | float | 1.0 | Nucleus sampling threshold. Only considers tokens in the top-p cumulative probability mass (areal/api/cli_args.py185-188). |
| top_k | int | 100,000,000 | Top-K sampling. Only considers the K highest-probability tokens (areal/api/cli_args.py189-192). |
| greedy | bool | False | Use greedy decoding (always select the highest-probability token). Overrides temperature/top_p/top_k when enabled (areal/api/cli_args.py181-184). |
| use_beam_search | bool | False | Enable beam search in vLLM. When enabled, sampling parameters are automatically ignored (areal/api/cli_args.py209-211). |

Sampling Interaction Rules:

  • When greedy=True: all sampling parameters are ignored and the output is deterministic.
  • When use_beam_search=True: temperature, top_p, and top_k are ignored (vLLM-specific) (areal/api/cli_args.py209-211).
  • In SGLangBackend, if greedy=True, temperature is forced to 0.0 in the payload (areal/engine/sglang_remote.py60).

Sources: areal/api/cli_args.py181-196 areal/api/cli_args.py209-211 areal/engine/sglang_remote.py56-65
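The interaction rules above can be sketched as a small resolver. This is an illustrative stand-in, not AReaL's code: the function name effective_sampling_params is hypothetical, and only the behavior stated above (greedy forcing temperature to 0.0, as described for SGLangBackend) is modeled.

```python
from typing import Any, Dict


def effective_sampling_params(
    greedy: bool = False,
    temperature: float = 1.0,
    top_p: float = 1.0,
    top_k: int = 100_000_000,
) -> Dict[str, Any]:
    """Return the sampling values a backend would act on (hypothetical helper)."""
    if greedy:
        # Greedy decoding: force temperature to 0.0 so the highest-probability
        # token is always chosen; top_p and top_k no longer affect the result.
        temperature = 0.0
    return {"temperature": temperature, "top_p": top_p, "top_k": top_k}


sampled = effective_sampling_params(temperature=0.8, top_p=0.95)
deterministic = effective_sampling_params(greedy=True, temperature=0.8)
```

Here sampled keeps the requested temperature, while deterministic carries temperature 0.0 regardless of the value passed in.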

Length Control

Parameters that control the number of tokens generated (areal/api/cli_args.py166-180).

| Parameter | Type | Default | Description |
|---|---|---|---|
| n_samples | int | 1 | Number of sequences to generate per prompt. Used for over-generation in RL algorithms like GRPO (areal/api/cli_args.py166-168). |
| max_new_tokens | int | 16384 | Maximum number of new tokens to generate, excluding the prompt (areal/api/cli_args.py169-171). |
| min_new_tokens | int | 0 | Minimum number of tokens that must be generated before stopping is allowed (areal/api/cli_args.py172-174). |
| max_tokens | int | 32768 | Maximum total sequence length, including prompt and generated tokens (areal/api/cli_args.py175-180). |

Length Limit Precedence:

Generation stops as soon as any one of the following holds:

  1. max_new_tokens new tokens have been generated.
  2. The total length (prompt plus generated tokens) reaches max_tokens.
  3. A stop condition is met, provided min_new_tokens has been satisfied.

Sources: areal/api/cli_args.py166-180
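As a rough illustration of how these limits combine, the sketch below checks the three conditions in the order listed. The function stop_reason and its return labels are hypothetical; real inference backends enforce these limits internally and may differ in detail.

```python
from typing import Optional


def stop_reason(
    prompt_len: int,
    n_new: int,
    stop_hit: bool,
    max_new_tokens: int = 16384,
    max_tokens: int = 32768,
    min_new_tokens: int = 0,
) -> Optional[str]:
    """Return why generation should stop, or None to keep generating."""
    if n_new >= max_new_tokens:
        return "max_new_tokens"  # budget for newly generated tokens is used up
    if prompt_len + n_new >= max_tokens:
        return "max_tokens"  # prompt plus generated tokens hit the total cap
    if stop_hit and n_new >= min_new_tokens:
        return "stop"  # stop token/string seen and the minimum length is met
    return None


# A long prompt can exhaust max_tokens before max_new_tokens is reached.
long_prompt = stop_reason(prompt_len=30000, n_new=2768, stop_hit=False)
# A stop condition is ignored until min_new_tokens tokens exist.
too_early = stop_reason(prompt_len=10, n_new=3, stop_hit=True, min_new_tokens=5)
```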

Stop Conditions

Parameters that determine when generation should terminate (areal/api/cli_args.py197-211).

| Parameter | Type | Default | Description |
|---|---|---|---|
| stop_token_ids | list[int] | [] | Token IDs that trigger a generation stop when sampled (areal/api/cli_args.py197-200). |
| stop | list[str] \| None | None | String sequences that trigger a stop when they appear in the generated text (areal/api/cli_args.py210-211). |
| ignore_eos | bool | False | When True, generation continues even when the EOS token is sampled (areal/api/cli_args.py201-204). |

Helper Method: The new_with_stop_and_pad_token_ids(tokenizer) method (areal/api/cli_args.py223-232) automatically adds the tokenizer's pad_token_id and eos_token_id to stop_token_ids unless ignore_eos=True.

Sources: areal/api/cli_args.py197-211 areal/api/cli_args.py223-232
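A minimal sketch of the helper's described behavior, using a stand-in dataclass rather than the real class. StopConfigSketch and the fake tokenizer are hypothetical, and reading "unless ignore_eos=True" as skipping both IDs is an assumption; the actual method may treat the pad token differently.

```python
from dataclasses import dataclass, field, replace
from types import SimpleNamespace
from typing import List


@dataclass
class StopConfigSketch:
    """Stand-in holding only the two fields relevant here (not AReaL's class)."""

    stop_token_ids: List[int] = field(default_factory=list)
    ignore_eos: bool = False

    def new_with_stop_and_pad_token_ids(self, tokenizer) -> "StopConfigSketch":
        ids = list(self.stop_token_ids)  # copy so the original is untouched
        if not self.ignore_eos:  # assumption: ignore_eos skips both additions
            for tok_id in (tokenizer.pad_token_id, tokenizer.eos_token_id):
                if tok_id is not None and tok_id not in ids:
                    ids.append(tok_id)
        return replace(self, stop_token_ids=ids)


# Any object exposing pad_token_id and eos_token_id works for the sketch.
fake_tokenizer = SimpleNamespace(pad_token_id=0, eos_token_id=2)
with_stops = StopConfigSketch(stop_token_ids=[7]).new_with_stop_and_pad_token_ids(fake_tokenizer)
unchanged = StopConfigSketch(stop_token_ids=[7], ignore_eos=True).new_with_stop_and_pad_token_ids(fake_tokenizer)
```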

Additional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| frequency_penalty | float | 0.0 | Penalizes tokens based on their frequency in the sequence (areal/api/cli_args.py210-211). |
| skip_special_tokens | bool | True | Skip special tokens when decoding/displaying outputs (areal/api/cli_args.py205-208). |
| lora_name | str | "default_lora" | LoRA adapter name to use for this generation request (areal/api/cli_args.py210-211). |

Sources: areal/api/cli_args.py205-211

Configuration Usage

Programmatic Usage

The new method (areal/api/cli_args.py213-221) is used to create updated configurations while maintaining compatibility with the configuration system.


Sources: areal/api/cli_args.py213-221
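The pattern can be sketched with a stand-in dataclass. GenConfigSketch is hypothetical and carries only a few of the documented fields; the keyword-override signature of new shown here is an assumption about how the real method is called, not a copy of it.

```python
from dataclasses import dataclass, replace


@dataclass
class GenConfigSketch:
    """Stand-in with a subset of the documented fields and defaults."""

    n_samples: int = 1
    max_new_tokens: int = 16384
    temperature: float = 1.0
    greedy: bool = False

    def new(self, **overrides) -> "GenConfigSketch":
        # Return a copy with selected fields replaced; the original is untouched.
        return replace(self, **overrides)


# Derive an evaluation config from a training config without mutating it.
train_cfg = GenConfigSketch(n_samples=8, temperature=1.0)
eval_cfg = train_cfg.new(n_samples=1, greedy=True)
```

Returning a copy rather than mutating in place keeps the training configuration safe to share between the rollout and evaluation paths.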

System Integration Flow

[Diagram: "Natural Language Space to Code Entity Space Mapping". Maps how the GenerationHyperparameters configuration flows into the distributed InferenceEngine and TrainEngine implementations.]


Sources: areal/api/cli_args.py163-211 areal/api/io_struct.py28-34 areal/api/workflow_api.py14-39 areal/engine/fsdp_engine.py87 areal/engine/megatron_engine.py84 areal/experimental/engine/archon_engine.py83

OpenAI API Format Conversion

The GenerationHyperparameters class provides methods to convert to OpenAI-compatible API formats (areal/api/cli_args.py234-312). This is critical for agentic RL, where external tools or frameworks expect standard OpenAI schemas.

Conversion Methods

| Method | Target API | Primary Key for Tokens | Source |
|---|---|---|---|
| to_openai_completions_args_dict() | Chat Completions | max_completion_tokens | areal/api/cli_args.py234-243 |
| to_openai_responses_args_dict() | Responses API | max_output_tokens | areal/api/cli_args.py245-254 |
| to_openai_agents_model_settings_dict() | Agents Model Settings | max_tokens | areal/api/cli_args.py256-265 |

The underlying to_openai_args_dict(api_format) method (areal/api/cli_args.py267-312) handles the translation of AReaL parameters to their respective OpenAI equivalents.

Sources: areal/api/cli_args.py234-312
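The key difference between the three targets is the name of the token-limit field. The sketch below shows that mapping only. The api_format strings, the function name to_openai_args_sketch, and the choice to forward temperature and top_p are all assumptions made for illustration; the real method may accept different format identifiers and forward more fields.

```python
from typing import Any, Dict

# Hypothetical format identifiers mapped to the token-limit key from the table.
TOKEN_LIMIT_KEY = {
    "chat_completions": "max_completion_tokens",
    "responses": "max_output_tokens",
    "agents_model_settings": "max_tokens",
}


def to_openai_args_sketch(
    api_format: str,
    max_new_tokens: int = 16384,
    temperature: float = 1.0,
    top_p: float = 1.0,
) -> Dict[str, Any]:
    """Build an OpenAI-style argument dict for the given target (illustrative)."""
    if api_format not in TOKEN_LIMIT_KEY:
        raise ValueError(f"unknown api_format: {api_format!r}")
    return {
        TOKEN_LIMIT_KEY[api_format]: max_new_tokens,  # same limit, renamed per API
        "temperature": temperature,
        "top_p": top_p,
    }


chat_args = to_openai_args_sketch("chat_completions", max_new_tokens=512)
responses_args = to_openai_args_sketch("responses", max_new_tokens=512)
```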

Unsupported Parameters and Warnings

When converting to OpenAI formats, certain AReaL-specific parameters cannot be represented. The system logs warnings (areal/api/cli_args.py289-305) if the following are used:

  • min_new_tokens (OpenAI has no direct equivalent)
  • greedy (Should use temperature=0.0)
  • top_k (Not supported by OpenAI)
  • stop_token_ids (OpenAI uses string stop sequences)
  • ignore_eos
  • lora_name (Passed separately in AReaL)

Sources: areal/api/cli_args.py289-305
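One plausible way to detect that an unsupported parameter is "used" is to compare it against its documented default, as sketched below. This is an assumption about the detection rule, not a description of the actual check; the helper name unsupported_in_use and the warning text are hypothetical, and the sketch uses the standard warnings module where the real code logs.

```python
import warnings
from typing import Any, Dict, List

# Documented defaults of the parameters that have no OpenAI representation.
UNSUPPORTED_DEFAULTS: Dict[str, Any] = {
    "min_new_tokens": 0,
    "greedy": False,
    "top_k": 100_000_000,
    "stop_token_ids": [],
    "ignore_eos": False,
    "lora_name": "default_lora",
}


def unsupported_in_use(params: Dict[str, Any]) -> List[str]:
    """Return names of unsupported parameters set to a non-default value."""
    used = [
        name
        for name, default in UNSUPPORTED_DEFAULTS.items()
        if params.get(name, default) != default
    ]
    for name in used:
        # Warn once per offending parameter so the drop is visible to the user.
        warnings.warn(f"{name} is not represented in the OpenAI format")
    return used


flagged = unsupported_in_use({"top_k": 50, "greedy": True, "temperature": 0.7})
```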

Integration with Inference Engines

The hyperparameters are used by the DistRolloutCoordinator to configure the remote inference backends via ModelRequest objects (areal/api/io_struct.py28-34).

Backend-Specific Implementation

Sources: areal/engine/sglang_remote.py40-128 areal/engine/vllm_remote.py41-127 areal/api/io_struct.py28-34