VOOZH about

URL: https://deepwiki.com/inclusionAI/AReaL/6.8-interaction-export

⇱ Interaction Export | inclusionAI/AReaL | DeepWiki


Loading...
Last indexed: 7 May 2026 (2e12c1)
Menu

Interaction Export

This document describes the interaction export system in AReaL's OpenAI-compatible client, which enables training data collection from multi-turn conversations. After an agent workflow completes, interactions stored in the InteractionCache can be exported in two formats: 'concat' (tree-based, returning only leaf conversations) or 'individual' (flat, returning all interactions). This system is essential for converting agent interaction traces into training trajectories for reinforcement learning.

For information about how interactions are created and tracked during agent execution, see 6.3 Interaction Cache and Session Tracking For details on reward assignment before export, see 6.5 Reward Assignment and Discounting

Export Styles Overview

The InteractionCache.export_interactions() method areal/experimental/openai/cache.py162-164 supports two distinct export styles that serve different training scenarios:

StyleDescriptionUse CaseOutput
'concat'Tree-based export returning only leaf nodesMulti-turn conversations where full dialogue history is neededComplete conversation paths from root to leaf
'individual'Flat export returning all interactionsSingle-turn training or when each interaction is independentAll cached interactions without tree structure

The export style determines how parent-child relationships are utilized and which interactions are included in the final output.

Sources: areal/experimental/openai/cache.py162-250

Concat (Tree) Format

Purpose and Structure

The 'concat' export style constructs a conversation tree by leveraging parent-child relationships established during interaction caching. It returns only leaf nodes—interactions that have no children—representing complete conversation paths from root to leaf areal/experimental/openai/cache.py238-246

Title: "Conversation Tree Structure"


Diagram: Conversation Tree Structure - Only leaves (C and D) are exported in 'concat' style

In this example, only Interactions C and D would be exported because they are leaf nodes. Each leaf contains the full conversation history through its token concatenation with parent tokens.

Token Concatenation Mechanism

When chat_template_type='concat', interactions store token IDs by concatenating parent tokens with new tokens. The InteractionWithTokenLogpReward.to_tensor_dict() method implements this logic areal/experimental/openai/types.py143-193:

Title: "Token Concatenation Process"


Diagram: Token Concatenation Process for Building Conversation History

The system aligns the child's input tokens with the parent's full sequence (input + output). If the child's input length is greater than the parent's total length, it concatenates the sequences while maintaining the loss_mask (masking prompts, unmasking completions) and versions tracking areal/experimental/openai/types.py156-171

Sources: areal/experimental/openai/types.py143-193 areal/experimental/openai/client.py143-210

Export Process for Concat Style

The export process for 'concat' style involves several steps:

Title: "Concat Export Flowchart"


Diagram: Concat Export Flowchart

The validation step ensures all interactions use chat_template_type='concat' areal/experimental/openai/cache.py228-232 because token alignment across the tree requires consistent tokenization. With other chat template types, tokens might be added or removed, breaking the tree structure.

Sources: areal/experimental/openai/cache.py223-246

Individual (Flat) Format

Purpose and Usage

The 'individual' export style returns all cached interactions without constructing the conversation tree areal/experimental/openai/cache.py247-250 Each interaction is treated independently, making this format suitable for:

  • Single-turn training scenarios.
  • Reward shaping where each step receives independent rewards.
  • Debugging and inspection of all interactions.

Title: "Individual Export Returns All Interactions"


Diagram: Individual Export Returns All Interactions

Unlike 'concat' style, 'individual' does not filter based on parent-child relationships. All complete interactions are returned areal/experimental/openai/cache.py250

Sources: areal/experimental/openai/cache.py247-250

Parent-Child Relationship Building

Automatic Relationship Construction

Parent-child relationships are established automatically when interactions are added to the cache via InteractionCache.__setitem__() areal/experimental/openai/cache.py107-112 The system uses a longest prefix matching algorithm to find the parent.

Title: "Parent-Child Relationship Building Algorithm"


Diagram: Parent-Child Relationship Building Algorithm

Sources: areal/experimental/openai/cache.py107-160

Prefix Matching Logic

The _is_prefix() helper function areal/experimental/openai/cache.py118-122 determines if one message list is a strict prefix of another. This handles cases where the assistant response is not appended exactly as returned, causing a mismatch in keys. If similarity is detected on the last message without a strict prefix, a warning is logged areal/experimental/openai/cache.py168-186

Sources: areal/experimental/openai/cache.py118-186

Reward Discount and Propagation

Backward Reward Propagation

The apply_reward_discount() method propagates rewards backward through the conversation using geometric discounting areal/experimental/openai/cache.py55-57

Title: "Backward Reward Propagation"


Diagram: Backward Reward Propagation with discount=0.9

The algorithm iterates through interactions in reverse creation order (most recent first) areal/experimental/openai/cache.py89-93 For each interaction, it updates the reward based on the subsequent turn's discounted reward: current_reward = current_reward * turn_discount + interaction.reward areal/experimental/openai/cache.py103-104

Sources: areal/experimental/openai/cache.py55-105

Export Process and Filtering

Incomplete Interaction Filtering

Before export, the system filters out incomplete interactions—those still being processed or missing required fields areal/experimental/openai/cache.py199-221

Title: "Incomplete Interaction Filtering"


Diagram: Incomplete Interaction Filtering Process

Incomplete interactions are excluded from export to prevent training on partial data areal/experimental/openai/cache.py203-211

Sources: areal/experimental/openai/cache.py199-221

Serialization for Proxy Export

When using the AReaL Proxy Server, interactions are serialized into a JSON-compatible format for transport areal/experimental/openai/proxy/server.py129-150

FunctionRoleData Handled
serialize_interactionsPrepares cache for HTTP transportConverts torch.Tensor to lists via to_tensor_dict(), extracts rewards and IDs areal/experimental/openai/proxy/server.py136-150
deserialize_interactionsReconstructs cache from HTTP responseRebuilds InteractionWithTokenLogpReward objects with cached tensor dicts areal/experimental/openai/proxy/server.py153-172

Sources: areal/experimental/openai/proxy/server.py129-172