VOOZH about

URL: https://deepwiki.com/inclusionAI/AReaL/6.6-proxy-server-architecture

⇱ Proxy Server Architecture | inclusionAI/AReaL | DeepWiki


Loading...
Last indexed: 7 May 2026 (2e12c1)
Menu

Proxy Server Architecture

Purpose and Scope

This page documents the proxy server architecture that exposes AReaL as an OpenAI-compatible service. This system allows external agent frameworks (e.g., ZeroClaw OpenClaw OpenAI Agents SDK, LangChain, Anthropic SDK, and Tau2-Bench) to interact with AReaL's models using standard HTTP protocols while capturing token-level log-probabilities and rewards required for RL training docs/en/tutorial/online_proxy.md1-10

The architecture supports several execution modes, including a scalable experimental agent_service and the standard proxy gateway for external agent runtimes docs/en/tutorial/online_proxy.md11-20

Sources: areal/experimental/openai/proxy/proxy_rollout_server.py1-57 docs/en/tutorial/online_proxy.md1-20

Architecture Overview

The proxy system is organized into a tiered hierarchy to facilitate distributed inference and session management docs/en/tutorial/online_proxy.md27-62

  1. Proxy Gateway: A stateless FastAPI router that acts as the entry point for external applications. It handles load balancing and routes requests to specific backend workers docs/en/tutorial/online_proxy.md67-69
  2. Proxy Workers (ProxyRolloutServer): Backend servers typically colocated with rollout workers. Each worker manages its own _session_cache, records token-level data, and interfaces with the inference engines areal/experimental/openai/proxy/proxy_rollout_server.py92-114
  3. Inference Engines: Backend servers (SGLang or vLLM) that perform the actual LLM inference docs/en/tutorial/online_proxy.md73

System Component Diagram

This diagram maps the high-level request flow to specific code entities and data structures within the proxy environment.


Sources: areal/experimental/openai/proxy/proxy_rollout_server.py92-195 docs/en/tutorial/online_proxy.md27-62

Proxy Worker Implementation

The ProxyRolloutServer handles the translation between public HTTP APIs and AReaL's internal inference logic.

Session Lifecycle and Authentication

The proxy uses a two-tier authentication system to isolate trajectories docs/en/tutorial/online_proxy.md201-208:

Request Handling Flow

StepComponentDescription
Authentication_require_session_keyExtracts the Bearer token and resolves it to a session_id from _api_key_to_session areal/experimental/openai/proxy/proxy_rollout_server.py187-195
AdaptationAnthropicAdapterConverts Anthropic Messages API requests into OpenAI format for cross-SDK compatibility areal/experimental/openai/proxy/proxy_rollout_server.py126-128
Inference_openai_clientInvokes the ArealOpenAI client, which records log-probabilities into the InteractionCache areal/experimental/openai/proxy/proxy_rollout_server.py96
StreamingStreamingResponseIf stream=True, the server yields Server-Sent Events (SSE) chunks to the client tests/experimental/openai/test_streaming_chat_completions.py136-177

Sources: areal/experimental/openai/proxy/proxy_rollout_server.py185-318 tests/experimental/openai/test_streaming_chat_completions.py132-205

Data Flow: Trajectory Collection and Export

The proxy worker acts as a stateful buffer during the rollout phase of an RL iteration, using SessionData to track progress.


Sources: areal/experimental/openai/proxy/proxy_rollout_server.py92-105 areal/experimental/openai/proxy/server.py40-57

Reward Assignment and Discounting

The proxy allows for flexible reward assignment during agent execution:

  • Asynchronous Rewards: Rewards can be set via POST /rl/set_reward. If an interaction_id is omitted, the reward applies the value to the most recent completion in that session docs/en/tutorial/online_proxy.md261-274
  • Session Management: SessionData in areal/experimental/openai/proxy/server.py tracks the interactions and cumulative rewards for a specific task instance areal/experimental/openai/proxy/server.py52

Sources: areal/experimental/openai/proxy/proxy_rollout_server.py48 docs/en/tutorial/online_proxy.md162-170

Tool Call Integration

The proxy system is designed to handle complex agentic workflows, including those requiring tool usage.

Claude Agent SDK Integration

The MathToolAgent demonstrates advanced tool integration using the ClaudeSDKClient. It defines mathematical tools (add, subtract, multiply, etc.) using the @tool decorator and hosts them via an MCP (Model Context Protocol) server areal/workflow/anthropic/claude_math_agent.py26-87


Sources: areal/workflow/anthropic/claude_math_agent.py90-162 areal/experimental/openai/proxy/proxy_rollout_server.py128

Agent Integration Examples

AReaL provides several implementations demonstrating how to bridge external agent SDKs with the proxy.

FrameworkImplementationDescription
Claude Agent SDKMathToolAgentMulti-turn math reasoning with calculator tools and MCP areal/workflow/anthropic/claude_math_agent.py90-96
Anthropic SDKMathAgentStandard prompt-based math reasoning areal/workflow/anthropic/math_agent.py
OpenAI SDKArealOpenAINative client for reward tracking and interaction caching areal/experimental/openai/client.py

MathToolAgent Implementation Details

The MathToolAgent configures ClaudeAgentOptions with a system prompt and allowed tools. It handles the interactive loop via client.receive_response(), accumulating AssistantMessage content while the SDK handles tool execution in the background areal/workflow/anthropic/claude_math_agent.py139-158 Rewards are computed at the end of the trajectory using math_reward_fn and wrapped in an AsyncRewardWrapper areal/workflow/anthropic/claude_math_agent.py160-162

Sources: areal/workflow/anthropic/claude_math_agent.py1-163 areal/experimental/openai/proxy/proxy_rollout_server.py1-57