URL: https://huggingface.co/di-zhang-fdu/openfugu-conductor-3b

OpenFugu Conductor 3B

A Conductor for the OpenFugu project, an open reimplementation of Sakana AI's Fugu-Ultra orchestration line. Given a user request and a worker pool, the Conductor emits a full agentic workflow: which worker does what, in what order, and which prior outputs each step sees. The workflow is then executed, and the last step's output is the answer.
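The workflow idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Step` fields, the `run_workflow` helper, and the toy workers are hypothetical names, not the OpenFugu schema or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One workflow step (hypothetical schema, for illustration only)."""
    worker: str        # which worker runs this step
    instruction: str   # what that worker is asked to do
    sees: List[int] = field(default_factory=list)  # earlier steps whose outputs it sees


def run_workflow(request: str, steps: List[Step],
                 workers: Dict[str, Callable[[str], str]]) -> str:
    """Run the steps in order and return the last step's output as the answer."""
    if not steps:
        raise ValueError("workflow has no steps")
    outputs: List[str] = []
    for i, step in enumerate(steps):
        # A step may only read outputs of steps that ran before it.
        if any(j < 0 or j >= i for j in step.sees):
            raise ValueError(f"step {i} may only see earlier steps")
        context = "\n".join(f"[step {j}] {outputs[j]}" for j in step.sees)
        prompt = f"Request: {request}\nTask: {step.instruction}\n{context}".strip()
        outputs.append(workers[step.worker](prompt))
    return outputs[-1]


# Toy workers standing in for real models in the pool.
workers = {
    "drafter": lambda p: "draft",
    "reviewer": lambda p: "reviewed(" + ("draft" if "[step 0] draft" in p else "nothing") + ")",
}
plan = [
    Step("drafter", "Write a draft."),
    Step("reviewer", "Review the draft.", sees=[0]),
]
answer = run_workflow("Summarise X", plan, workers)
```

Here the reviewer sees the drafter's output because its `sees` list names step 0, and `answer` is whatever the final step returned.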

This checkpoint is a GRPO fine-tune of Llama-3.2-3B-Instruct on nvidia/ToolScale, using a verifiable tool-call reward (the plan's tool-call sequence is scored against the task's evaluation_criteria.actions).

Independent research artifact. Reimplements the mechanism of the Conductor paper (arXiv:2512.04388, Sakana AI); not affiliated with or derived from Sakana's proprietary product. No third-party source code was copied.

Training

Base: meta-llama/Llama-3.2-3B-Instruct
Data: nvidia/ToolScale (tool-use / orchestration tasks)
Method: GRPO (TRL), 8 generations/group, β=0 (no KL term, matching the Fugu-Ultra report)
Reward: format reward (<think>…</think><answer>[json]</answer>) + action reward (tool-name sequence + argument match vs evaluation_criteria.actions)
Steps: 100
Hardware: NVIDIA A800-80GB
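
GRPO scores each sampled plan relative to the other samples drawn for the same prompt. A minimal sketch of that group-relative normalisation for one group of 8 generations follows. The use of the population standard deviation, the `eps` value, and the reward numbers are assumptions for illustration; TRL's implementation may differ in such details. With β=0, no KL penalty is added on top of these advantages.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalise rewards within one group: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for the 8 completions sampled for a single prompt.
rewards = [1.0, 1.0, 1.5, 2.0, 1.0, 0.0, 1.7, 1.0]
advantages = group_advantages(rewards)
```

Completions that beat their group's mean get a positive advantage and are reinforced; if every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.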

Reward trajectory: mean reward climbed from 0.70 to 1.70 over training. The format reward saturates at 1.0 within ~3 steps, and the action reward rises from 0.14 to a peak of ~0.70, i.e. the model learns to emit the tool-call plan format and to match the ground-truth tool calls.
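
To make the two reward terms concrete, here is a rough sketch of how a format reward and an action reward could be computed and summed. The actual scoring lives in the OpenFugu training code and is not reproduced here: the regex, the 0.5/0.5 split between name and argument match, the length normalisation, and the tool names are all assumptions.

```python
import re
from typing import Any, Dict, List

# Assumed shape check: a think block followed by an answer block holding a JSON list.
FORMAT_RE = re.compile(
    r"\s*<think>.*?</think>\s*<answer>\s*\[.*\]\s*</answer>\s*", re.DOTALL
)


def format_reward(completion: str) -> float:
    """1.0 if the completion has the expected tag layout, else 0.0."""
    return 1.0 if FORMAT_RE.fullmatch(completion) else 0.0


def action_reward(predicted: List[Dict[str, Any]],
                  expected: List[Dict[str, Any]]) -> float:
    """Position-wise credit: 0.5 for the tool name, 0.5 more for matching arguments."""
    if not expected:
        return 1.0 if not predicted else 0.0
    score = 0.0
    for pred, gold in zip(predicted, expected):
        if pred.get("name") == gold.get("name"):
            score += 0.5
            if pred.get("arguments") == gold.get("arguments"):
                score += 0.5
    # Dividing by the longer list penalises both missing and extra calls.
    return score / max(len(predicted), len(expected))


# Hypothetical tool names and arguments, for illustration only.
expected = [
    {"name": "get_user", "arguments": {"id": "u1"}},
    {"name": "cancel_order", "arguments": {"order_id": "o9"}},
]
predicted = [
    {"name": "get_user", "arguments": {"id": "u1"}},
    {"name": "cancel_order", "arguments": {"order_id": "o7"}},  # wrong argument
]
completion = (
    "<think>look up the user, then cancel</think>\n"
    '<answer>[{"name": "get_user", "arguments": {"id": "u1"}}, '
    '{"name": "cancel_order", "arguments": {"order_id": "o7"}}]</answer>'
)
total = format_reward(completion) + action_reward(predicted, expected)  # 1.0 + 0.75
```

Under this sketch the total lies between 0 and 2, which is at least consistent with the reported figures (format 1.0 plus action ~0.70 giving ~1.70).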

Output format

The model thinks, then emits a JSON list of tool calls:

<think> brief planning </think>
<answer>[{"name": "<tool>", "arguments": {"<arg>": "<value>"}}, ...]</answer>
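
A completion in this format can be turned back into a list of tool calls with a small parser. The following is a sketch rather than the repository's own parsing code; `parse_plan` and the example tool name are hypothetical.

```python
import json
import re
from typing import Any, Dict, List

ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_plan(completion: str) -> List[Dict[str, Any]]:
    """Extract the JSON list of tool calls from the <answer> block."""
    match = ANSWER_RE.search(completion)
    if match is None:
        raise ValueError("no <answer>...</answer> block found")
    calls = json.loads(match.group(1))  # raises ValueError on malformed JSON
    if not isinstance(calls, list):
        raise ValueError("answer is not a JSON list")
    for call in calls:
        if not isinstance(call, dict) or not isinstance(call.get("name"), str):
            raise ValueError("each tool call needs a string 'name'")
        if not isinstance(call.get("arguments", {}), dict):
            raise ValueError("'arguments' must be a JSON object")
    return calls


# Hypothetical completion, for illustration only.
text = (
    "<think> find the order first </think>\n"
    '<answer>[{"name": "find_order", "arguments": {"order_id": "o9"}}]</answer>'
)
calls = parse_plan(text)
```

Validating the shape before execution keeps a malformed plan from reaching the tools; every failure path raises `ValueError`, so a caller can catch a single exception type.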

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "di-zhang-fdu/openfugu-conductor-3b"
tok = AutoTokenizer.from_pretrained(repo_id)
# Load the weights in bfloat16
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="bfloat16")
# system+user prompt format: see train/toolscale_data.py in the OpenFugu repo

Full training/serving code, the TRINITY router line, and the architecture write-up are in the OpenFugu repository: https://github.com/trotsky1997/OpenFugu

License

Llama 3.2 Community License (this is a Llama-3.2-3B-Instruct derivative) — see https://llama.com/llama3_2/license/ . The surrounding OpenFugu code is Apache-2.0.
