
URL: https://deepwiki.com/inclusionAI/AReaL/5.2-implementing-custom-workflows

Last indexed: 7 May 2026 (2e12c1)

Implementing Custom Workflows

This page provides a practical guide for creating custom RolloutWorkflow implementations in AReaL. For the base API specification, see RolloutWorkflow API (5.1). For built-in workflow implementations, see Built-in Workflows (5.4).

Purpose and Scope

Custom workflows define task-specific data collection logic for reinforcement learning. They control how prompts are formatted, how model responses are generated, how rewards are computed, and how trajectories are formatted for training.

This page covers:

  • The WorkflowLike type and workflow variants (RolloutWorkflow, duck-typed patterns)
  • The duck-typed agent workflow pattern for agentic RL
  • Step-by-step implementation guide for arun_episode
  • Performance tracing within workflows
  • Built-in implementations for vision, multi-turn, and tool-integrated reasoning (TIR) tasks
  • The Scaffolding framework for modular workflow composition

Sources: areal/api/workflow_api.py:12-113, areal/workflow/rlvr.py:49-178

Workflow Types and the WorkflowLike Protocol

AReaL supports multiple workflow patterns through the WorkflowLike type alias. This allows the system to accept formal class implementations, string import paths, or simple duck-typed objects.
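To make the three accepted forms concrete, the following is a minimal sketch of how a WorkflowLike value could be normalized into an instance. The helpers `resolve_workflow` and `is_agent_like` and the class `EchoAgent` are hypothetical names introduced for illustration; they are not AReaL functions, and the real resolution logic may differ.

```python
import importlib
import inspect
from typing import Any


def resolve_workflow(spec: Any, **init_kwargs: Any) -> Any:
    """Turn a workflow spec into an instance (hypothetical helper, not AReaL's)."""
    if isinstance(spec, str):
        # "package.module.ClassName" -> import the module, then fetch the attribute.
        module_path, _, attr = spec.rpartition(".")
        spec = getattr(importlib.import_module(module_path), attr)
    if inspect.isclass(spec):
        # A class is instantiated with the supplied keyword arguments.
        return spec(**init_kwargs)
    # Anything else is assumed to be a ready-to-use instance.
    return spec


def is_agent_like(obj: Any) -> bool:
    """Duck-typing check: does the object expose an async `run` method?"""
    return inspect.iscoroutinefunction(getattr(obj, "run", None))


class EchoAgent:
    """Toy duck-typed workflow used only for this demonstration."""

    async def run(self, data: dict) -> float:
        return 1.0
```

Passing a class, an instance, or a dotted import path all yield an object, and the duck-typed check only inspects the presence of an async `run` method rather than the inheritance chain.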

[Diagram: Workflow Type Hierarchy and Code Entities]


RolloutWorkflow Pattern

Traditional pattern for explicit trajectory collection: subclass RolloutWorkflow and implement arun_episode.

Agent Workflow Pattern (Duck-Typed)

High-level pattern for agent framework integration (e.g., using the OpenAI SDK): any object that exposes an async run method qualifies.

Sources: areal/api/workflow_api.py:12-113, areal/workflow/openai/math_agent.py:27-48

Implementation Overview

Custom workflow implementation follows a standard pattern of initialization and episode execution. The workflow acts as the orchestrator between the input data and the inference engine.

[Diagram: Workflow Implementation Pattern]


Sources: areal/api/workflow_api.py:12-37, areal/workflow/rlvr.py:139-178

Step-by-Step Implementation Guide

Step 1: Define Workflow Class

Create a class that inherits from RolloutWorkflow (areal/api/workflow_api.py:12).
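A class definition along these lines might look as follows. To stay runnable without AReaL installed, the sketch declares a stand-in `RolloutWorkflow` base class whose shape (an abstract async `arun_episode(engine, data)`) is an assumption; in real code the base class would be imported from areal/api/workflow_api.py. `MathWorkflow` and its constructor arguments are hypothetical.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any, Callable


class RolloutWorkflow(ABC):
    """Stand-in for AReaL's RolloutWorkflow (assumed shape, for illustration)."""

    @abstractmethod
    async def arun_episode(self, engine: Any, data: dict[str, Any]) -> dict[str, Any] | None:
        ...


class MathWorkflow(RolloutWorkflow):
    """Hypothetical custom workflow; the constructor arguments are illustrative."""

    def __init__(self, reward_fn: Callable[..., float], max_new_tokens: int = 512):
        # Keep everything the episode needs on the instance so that
        # arun_episode can stay free of global state.
        self.reward_fn = reward_fn
        self.max_new_tokens = max_new_tokens

    async def arun_episode(self, engine: Any, data: dict[str, Any]) -> dict[str, Any] | None:
        raise NotImplementedError("filled in during Step 2")


workflow = MathWorkflow(reward_fn=lambda **kwargs: 0.0)
```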


Step 2: Implement arun_episode

The arun_episode method orchestrates the interaction with the InferenceEngine (areal/api/workflow_api.py:14-16). Implementations typically decode prompts, call agenerate, and compute rewards (areal/workflow/rlvr.py:139-162).
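The generate-then-score flow can be sketched with a fake engine. `FakeEngine`, `FakeResponse`, `ToyWorkflow`, and `exact_match_reward` are hypothetical stand-ins, and the response field names are assumptions; a real engine's agenerate is likely to take a request object rather than a raw token list.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeResponse:
    """Stand-in for the engine's response object (field names are assumptions)."""
    input_tokens: list[int]
    output_tokens: list[int]
    output_logprobs: list[float]
    output_versions: list[int]


class FakeEngine:
    """Stand-in for an InferenceEngine that always returns a fixed completion."""

    async def agenerate(self, prompt_ids: list[int]) -> FakeResponse:
        await asyncio.sleep(0)  # yield control, as a real network call would
        out = [7, 8, 9]
        return FakeResponse(prompt_ids, out, [-0.1] * len(out), [0] * len(out))


def exact_match_reward(completion_ids: list[int], answer_ids: list[int]) -> float:
    """Toy reward: 1.0 only when the completion matches the reference exactly."""
    return 1.0 if completion_ids == answer_ids else 0.0


class ToyWorkflow:
    async def arun_episode(self, engine: FakeEngine, data: dict[str, Any]) -> dict[str, Any]:
        # 1. Generate a completion for the prompt tokens.
        resp = await engine.agenerate(data["input_ids"])
        # 2. Score the completion against the reference answer.
        reward = exact_match_reward(resp.output_tokens, data["answer_ids"])
        # 3. Hand back what is needed to build the trajectory.
        return {"response": resp, "reward": reward}


result = asyncio.run(
    ToyWorkflow().arun_episode(FakeEngine(), {"input_ids": [1, 2], "answer_ids": [7, 8, 9]})
)
```

Because arun_episode is a coroutine, many episodes can be awaited concurrently against the same engine.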


Sources: areal/api/workflow_api.py:12-37, areal/workflow/rlvr.py:139-178

Step 3: Format Trajectory Tensors

The workflow must return a dictionary containing the tensors listed below. Each tensor must be unsqueezed to add a leading batch dimension (areal/workflow/rlvr.py:178).

Key             Description                                       Source
input_ids       Concatenated prompt and response tokens           areal/workflow/rlvr.py:171
logprobs        Log probabilities of the generated tokens         areal/workflow/rlvr.py:173
loss_mask       Mask indicating which tokens contribute to loss   areal/workflow/rlvr.py:172
rewards         Scalar reward for the episode                     areal/workflow/rlvr.py:176
versions        Weight versions used for each token generation    areal/workflow/rlvr.py:174
attention_mask  Boolean mask for sequence length                  areal/workflow/rlvr.py:175
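The shape bookkeeping behind these keys can be sketched with nested Python lists standing in for tensors, where wrapping a row in an outer list plays the role of unsqueeze(0). The function name `format_trajectory` is hypothetical, and the fill values used for prompt positions (0.0 for logprobs, -1 for versions) are assumptions rather than confirmed AReaL behavior.

```python
from typing import Any


def format_trajectory(
    prompt_ids: list[int],
    output_ids: list[int],
    output_logprobs: list[float],
    output_versions: list[int],
    reward: float,
) -> dict[str, Any]:
    """Build a single-episode trajectory with a leading batch dimension of 1."""
    seq = prompt_ids + output_ids
    n_prompt, n_out = len(prompt_ids), len(output_ids)
    row = {
        "input_ids": seq,
        # Prompt tokens were not sampled, so they get placeholder values.
        "logprobs": [0.0] * n_prompt + output_logprobs,
        # Only generated tokens contribute to the loss.
        "loss_mask": [0] * n_prompt + [1] * n_out,
        "versions": [-1] * n_prompt + output_versions,
        "attention_mask": [True] * len(seq),
    }
    # List analogue of tensor.unsqueeze(0): shape [seq_len] -> [1, seq_len].
    batch: dict[str, Any] = {key: [value] for key, value in row.items()}
    # One scalar reward per episode: shape [1].
    batch["rewards"] = [float(reward)]
    return batch


traj = format_trajectory([1, 2], [7, 8, 9], [-0.1, -0.2, -0.3], [0, 0, 0], 1.0)
```

With real tensors, each list would be wrapped in torch.tensor(...) before calling unsqueeze(0); every per-token entry must share the same sequence length.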

Sources: areal/workflow/rlvr.py:170-178, areal/workflow/vision_rlvr.py:154-162

Performance Tracing and Monitoring

Workflows should use the PerfTracer and SessionTracer to provide visibility into execution phases like generation and reward calculation.
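The idea of phase-level instrumentation can be illustrated with a generic timer. This sketch deliberately does not use the PerfTracer or SessionTracer APIs, whose signatures are not shown on this page; `trace_phase` and `phase_durations` are hypothetical stand-ins built on the standard library.

```python
import asyncio
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per phase name (illustrative global store).
phase_durations: dict[str, float] = {}


@contextmanager
def trace_phase(name: str):
    """Record how long the enclosed block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        phase_durations[name] = phase_durations.get(name, 0.0) + elapsed


async def episode() -> None:
    with trace_phase("generate"):
        await asyncio.sleep(0.01)  # placeholder for the engine call
    with trace_phase("reward"):
        sum(range(1000))  # placeholder for reward computation


asyncio.run(episode())
```

Note that a timer wrapped around an await measures wall-clock time, including any time the coroutine spends suspended while other episodes run.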

[Diagram: Workflow Performance Instrumentation]


Sources: areal/workflow/rlvr.py:83-137, areal/workflow/vision_rlvr.py:45-51

Implementing Duck-Typed Agent Workflows

The duck-typed agent pattern allows integrating external agent frameworks. The workflow class simply needs an async run method (areal/api/workflow_api.py:75-77).
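A minimal agent in this style might look like the sketch below. `MathAgent`, `FakeClient`, and its `complete` method are hypothetical; the `**extra_kwargs` parameter and the choice to return a scalar reward from run are assumptions about how the caller interacts with the agent, not confirmed details of the AReaL protocol.

```python
import asyncio
from typing import Any, Callable


class FakeClient:
    """Stand-in for an LLM SDK client; always answers ' 4 '."""

    def __init__(self, **kwargs: Any):
        self.kwargs = kwargs

    async def complete(self, prompt: str) -> str:
        await asyncio.sleep(0)  # mimic an awaited network request
        return " 4 "


class MathAgent:
    """Hypothetical agent workflow: no base class, only an async `run` method."""

    def __init__(self, client_factory: Callable[..., Any]):
        self.client_factory = client_factory

    async def run(self, data: dict, **extra_kwargs: Any) -> float:
        # Build the client from whatever connection settings the caller supplies.
        client = self.client_factory(**extra_kwargs)
        answer = await client.complete(data["question"])
        # Assumption: the episode outcome is reported as a scalar reward.
        return 1.0 if answer.strip() == data["answer"] else 0.0


reward = asyncio.run(MathAgent(FakeClient).run({"question": "2+2", "answer": "4"}))
```

The agent code never touches tokens or log probabilities directly, which is what makes this pattern convenient for wrapping existing agent frameworks.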


Sources: areal/workflow/openai/math_agent.py:27-48, areal/workflow/anthropic/math_agent.py:17-81

Specialized Workflow Examples

Vision RLVR Workflow

Extends RLVRWorkflow to handle multi-modal inputs. It uses an AutoProcessor to encode images and text (areal/workflow/vision_rlvr.py:43). The arun_episode implementation constructs a ModelRequest with image_data (areal/workflow/vision_rlvr.py:122-132) and returns a multi_modal_input key in the result dictionary (areal/workflow/vision_rlvr.py:158).

Multi-Turn Workflow

Orchestrates multiple attempts at solving a task. If the reward is 0.0, it appends a "retry" prompt (multi_turn_prompt_ids) and calls the engine again until max_turns is reached or a positive reward is achieved (areal/workflow/multi_turn.py:76-120). It applies a turn_discount to the final reward based on the number of attempts (areal/workflow/multi_turn.py:120).
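The retry loop described here can be sketched as follows. The function `run_multi_turn` and the fake `generate` and `reward_fn` callables are hypothetical, and the exact discount schedule (one multiplication by turn_discount per failed attempt that is followed by a retry) is an assumption.

```python
import asyncio
from typing import Awaitable, Callable


async def run_multi_turn(
    generate: Callable[[list[int]], Awaitable[list[int]]],
    reward_fn: Callable[[list[int]], float],
    prompt_ids: list[int],
    retry_ids: list[int],
    max_turns: int,
    turn_discount: float,
) -> tuple[list[int], float]:
    seq = list(prompt_ids)
    discount = 1.0
    reward = 0.0
    for turn in range(max_turns):
        output = await generate(seq)
        seq += output
        reward = reward_fn(output)
        if reward > 0:
            break  # solved: stop asking
        if turn + 1 < max_turns:
            # Failed attempt: append the retry prompt and shrink the final reward.
            seq += retry_ids
            discount *= turn_discount
    return seq, reward * discount


# Demo: the fake model is wrong on the first attempt and right on the second.
attempts = iter([[0], [42]])


async def generate(seq: list[int]) -> list[int]:
    return next(attempts)


final_seq, final_reward = asyncio.run(
    run_multi_turn(generate, lambda out: 1.0 if out == [42] else 0.0,
                   prompt_ids=[1], retry_ids=[99], max_turns=3, turn_discount=0.5)
)
```

In this run the sequence accumulates the first attempt, the retry prompt, and the second attempt, and the reward of 1.0 is discounted once.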

Tool-Integrated Reasoning (TIR) Workflow

Supports complex multi-turn interactions involving tool calls (e.g., Python execution). It uses a ToolManager to parse tool markers and execute code (examples/tir/tir_workflow.py:72-83). The arun_episode method runs a _multi_round_response loop that continues until a final answer is detected or the turn limit is reached (examples/tir/tir_workflow.py:101-143).
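A toy version of such a loop is sketched below. The marker syntax (`<tool>`, `<output>`, `<answer>`), the `run_tool` adder, and the `fake_model` are all hypothetical; the real ToolManager and its markers are not shown on this page, and the toy tool only adds integers instead of executing Python.

```python
import asyncio
import re
from typing import Awaitable, Callable, Optional

# Hypothetical markers chosen for this sketch only.
TOOL_RE = re.compile(r"<tool>(.*?)</tool>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def run_tool(expr: str) -> str:
    """Toy tool: add integers separated by '+'."""
    return str(sum(int(part) for part in expr.split("+")))


async def multi_round(
    model: Callable[[str], Awaitable[str]], prompt: str, max_turns: int
) -> tuple[Optional[str], str]:
    context = prompt
    for _ in range(max_turns):
        chunk = await model(context)
        context += chunk
        answer = ANSWER_RE.search(chunk)
        if answer:
            # Final answer detected: stop the loop.
            return answer.group(1).strip(), context
        call = TOOL_RE.search(chunk)
        if call:
            # Execute the tool and feed its output back into the context.
            context += f"<output>{run_tool(call.group(1))}</output>"
    return None, context  # turn limit reached without an answer


async def fake_model(context: str) -> str:
    # Calls the tool first, then answers once a tool output is visible.
    return "<answer>5</answer>" if "<output>" in context else "<tool>2+3</tool>"


answer, transcript = asyncio.run(multi_round(fake_model, "What is 2+3? ", max_turns=4))
```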

Scaffolding Framework

A modular framework for composing RL workflows using a Controller/Worker/ScaffoldingLlm pattern (examples/scaffolding/workflow.py:4-12). It uses an SGLangWorker to call OpenAI-compatible APIs (examples/scaffolding/worker.py:101-108) and a PipelineTrajectoryMaker to orchestrate generation and reward controllers (examples/scaffolding/workflow.py:146-148).

Sources: areal/workflow/vision_rlvr.py:103-162, areal/workflow/multi_turn.py:58-135, examples/tir/tir_workflow.py:101-190, examples/scaffolding/workflow.py:42-152