OpenFugu Conductor 3B
A Conductor for the OpenFugu project — an open reimplementation of Sakana AI's Fugu-Ultra orchestration line. Given a user request and a worker pool, the Conductor emits a full agentic workflow (which worker does what, in what order, seeing which prior outputs) that is then executed; the last step's output is the answer.
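The execution model described above (ordered steps, each assigned to a worker and shown selected prior outputs, with the last output returned as the answer) can be sketched as follows. This is only an illustration: the `Step` schema, its field names, and the toy workers are hypothetical and are not taken from the OpenFugu code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical schema: field names are illustrative, not OpenFugu's.
@dataclass
class Step:
    worker: str                                    # which worker runs this step
    instruction: str                               # what the worker is asked to do
    sees: List[int] = field(default_factory=list)  # indices of earlier steps it may read


# A worker takes an instruction plus the visible prior outputs and returns text.
Worker = Callable[[str, List[str]], str]


def run_workflow(steps: List[Step], workers: Dict[str, Worker]) -> str:
    """Run the steps in order and return the last step's output as the answer."""
    if not steps:
        raise ValueError("workflow has no steps")
    outputs: List[str] = []
    for i, step in enumerate(steps):
        # A step can only depend on steps that have already run.
        if any(j < 0 or j >= i for j in step.sees):
            raise ValueError(f"step {i} may only see earlier steps")
        context = [outputs[j] for j in step.sees]
        outputs.append(workers[step.worker](step.instruction, context))
    return outputs[-1]


# Toy workers standing in for real models.
workers = {
    "drafter": lambda instruction, context: f"draft({instruction})",
    "reviewer": lambda instruction, context: f"review({'; '.join(context)})",
}
plan = [
    Step("drafter", "summarize the ticket"),
    Step("reviewer", "check the summary", sees=[0]),
]
answer = run_workflow(plan, workers)
```

Here the reviewer sees only the drafter's output, and `answer` is whatever the final step returns.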
This checkpoint is a GRPO fine-tune of Llama-3.2-3B-Instruct on nvidia/ToolScale, using a verifiable tool-call reward (the plan's tool-call sequence is scored against the task's evaluation_criteria.actions).
Independent research artifact. Reimplements the mechanism of the Conductor paper (arXiv:2512.04388, Sakana AI); not affiliated with or derived from Sakana's proprietary product. No third-party source code was copied.
Training
| Setting | Value |
|---|---|
| Base | meta-llama/Llama-3.2-3B-Instruct |
| Data | nvidia/ToolScale (tool-use / orchestration tasks) |
| Method | GRPO (TRL), 8 generations/group, β=0 (no KL term, matching the Fugu-Ultra report) |
| Reward | format reward (<think>…</think><answer>[json]</answer>) + action reward (tool-name sequence + argument match vs evaluation_criteria.actions) |
| Steps | 100 |
| Hardware | NVIDIA A800-80GB |
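One way to read the Method row: for each prompt, 8 plans are sampled and scored, and each plan's reward is normalised against the other plans in its group; with β=0 no KL penalty is added to the loss. The sketch below shows the standard GRPO group-relative advantage under those assumptions. It is not taken from the OpenFugu training code, and the use of the population standard deviation and the `eps` value are illustrative choices.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], eps: float = 1e-4) -> List[float]:
    """Normalise each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for one group of 8 sampled plans.
group = [1.7, 1.0, 1.0, 1.35, 0.0, 1.0, 1.7, 1.0]
adv = group_advantages(group)
```

Plans that score above the group mean get a positive advantage and those below get a negative one. A group in which every plan scores the same yields all-zero advantages and so contributes no gradient signal.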
Reward trajectory: mean reward climbed from 0.70 to 1.70 over training. The format reward saturates at 1.0 within ~3 steps, and the action reward rises from 0.14 to a peak of ~0.70. In other words, the model first learns to emit the tool-call plan format and then increasingly matches the ground-truth tool calls.
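A minimal sketch of a reward with this two-part shape is given below. The exact scoring used in training is not specified here, so the details are assumptions: the format reward is 1.0 for a well-formed completion, the action reward gives half credit per position for a matching tool name and full credit when the arguments also match, and each reference action is assumed to carry `name` and `arguments` keys.

```python
import json
import re
from typing import Any, Dict, List, Optional

# Whole completion must be <think>...</think> followed by <answer>...</answer>.
PATTERN = re.compile(r"\s*<think>.*?</think>\s*<answer>(.*?)</answer>\s*", re.DOTALL)


def parse_calls(completion: str) -> Optional[List[Any]]:
    """Return the JSON list inside <answer>, or None if the format is wrong."""
    m = PATTERN.fullmatch(completion)
    if m is None:
        return None
    try:
        calls = json.loads(m.group(1))
    except json.JSONDecodeError:
        return None
    return calls if isinstance(calls, list) else None


def format_reward(completion: str) -> float:
    return 1.0 if parse_calls(completion) is not None else 0.0


def action_reward(completion: str, reference: List[Dict[str, Any]]) -> float:
    """Assumed scoring: position-wise name match (0.5) plus argument match (0.5)."""
    calls = parse_calls(completion)
    if not calls or not reference:
        return 0.0
    score = 0.0
    for call, ref in zip(calls, reference):
        if isinstance(call, dict) and call.get("name") == ref.get("name"):
            score += 0.5
            if call.get("arguments") == ref.get("arguments"):
                score += 0.5
    # Dividing by the longer length penalises both missing and extra calls.
    return score / max(len(calls), len(reference))


# Hypothetical task with one reference action.
reference = [{"name": "get_order", "arguments": {"id": "42"}}]
good = '<think> look up the order </think>\n<answer>[{"name": "get_order", "arguments": {"id": "42"}}]</answer>'
wrong_args = '<think>x</think><answer>[{"name": "get_order", "arguments": {"id": "7"}}]</answer>'
total = format_reward(good) + action_reward(good, reference)
```

Under this scoring a perfectly formatted, fully matching plan earns 2.0, which is consistent with the reported mean of 1.70 being the sum of a saturated format reward and an action reward of about 0.70.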
Output format
The model thinks, then emits a JSON list of tool calls:
<think> brief planning </think>
<answer>[{"name": "<tool>", "arguments": {"<arg>": "<value>"}}, ...]</answer>
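To consume this output downstream, the JSON list has to be pulled out of the `<answer>` tags and validated. The helper below is a sketch of one way to do that. `ToolCall` and `extract_tool_calls` are hypothetical names, and the tool names in the sample are made up for illustration.

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToolCall:
    name: str
    arguments: Dict[str, Any]


ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_tool_calls(text: str) -> List[ToolCall]:
    """Parse the tool-call list from a completion; raise ValueError if malformed."""
    match = ANSWER_RE.search(text)
    if match is None:
        raise ValueError("no <answer>...</answer> block found")
    # json.JSONDecodeError subclasses ValueError, so callers can catch one type.
    payload = json.loads(match.group(1))
    if not isinstance(payload, list):
        raise ValueError("answer payload must be a JSON list")
    calls: List[ToolCall] = []
    for item in payload:
        if not isinstance(item, dict) or not isinstance(item.get("name"), str):
            raise ValueError(f"tool call needs a string 'name': {item!r}")
        arguments = item.get("arguments", {})
        if not isinstance(arguments, dict):
            raise ValueError(f"'arguments' must be an object: {item!r}")
        calls.append(ToolCall(item["name"], arguments))
    return calls


# Made-up completion in the documented format.
sample = (
    "<think> find the order, then refund it </think>\n"
    '<answer>[{"name": "get_order", "arguments": {"order_id": "A17"}}, '
    '{"name": "issue_refund", "arguments": {"order_id": "A17"}}]</answer>'
)
calls = extract_tool_calls(sample)
```

Raising on malformed output, rather than returning an empty list, keeps a formatting failure distinguishable from a plan that legitimately contains no calls.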
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
tok = AutoTokenizer.from_pretrained("di-zhang-fdu/openfugu-conductor-3b")
model = AutoModelForCausalLM.from_pretrained("di-zhang-fdu/openfugu-conductor-3b", torch_dtype="bfloat16")
# system+user prompt format: see train/toolscale_data.py in the OpenFugu repo
Full training/serving code, the TRINITY router line, and the architecture write-up are in the OpenFugu repository: https://github.com/trotsky1997/OpenFugu
License
Llama 3.2 Community License (this is a Llama-3.2-3B-Instruct derivative) — see https://llama.com/llama3_2/license/ . The surrounding OpenFugu code is Apache-2.0.
