VOOZH about

URL: https://huggingface.co/datasets/lordx64/agentic-distill-fable-5-sft

⇱ lordx64/agentic-distill-fable-5-sft · Datasets at Hugging Face


text
stringlengths
673
80.8k
source
stringclasses
1 value
session
stringclasses
1 value
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user USER: Make a new one, it should be a fast paced multiplayer FPS (with optinal bots) with full ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/mo...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user …[earlier truncated]… -x--- 161 lane lane 20480 Jun 12 14:08 .. drwxrwxr-x 3 lane lane 4096 Jun 12 09:49 archives drwxrwxr-x 86 lane lane 4096 Jun 12 09:52 node_modules -rw-rw-r-- 1 lane lane 287 Jun 12 09:52 package.json -rw-rw-r-- ...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user …[earlier truncated]… atform — auth, avatars, multiplayer worlds, studio editor, real-time physics — generated in a single shot. Express backend, Three.js 3D rendering, WebSocket multiplayer. 4 pre-built official games, catalog system, friend...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user …[earlier truncated]… 1 const express = require('express'); 2 const path = require('path'); 3 const fs = require('fs'); 4 const { spawn } = require('child_process'); 5 6 const app = express(); 7 const PORT = Number(process.env.PORT) || 4000...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user …[earlier truncated]… l ray-traced stuff with lots of settings. TL;DR clone CSgo to the best of your ability. Look at the result and refine it USER: USER: <command-name>/model</command-name> <command-message>model</command-message...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user …[earlier truncated]… d-caveat> USER: <command-name>/model</command-name> <command-message>model</command-message> <command-args></command-args> USER: <local-command-stdout>Set model to Fable 5 and saved as your defaul...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user …[earlier truncated]… lane lane 4096 Jun 12 09:52 node_modules -rw-rw-r-- 1 lane lane 287 Jun 12 09:52 package.json -rw-rw-r-- 1 lane lane 37037 Jun 12 09:52 package-lock.json drwxrwxr-x 4 lane lane 4096 Jun 12 09:49 public drwxrwx...
Glint-Research/Fable-5-traces
glint
<|im_start|>system You are a helpful AI assistant.<|im_end|> <|im_start|>user …[earlier truncated]… -type f | grep -v node_modules | head -40", "description": "Inspect package.json and file tree"} TOOL RESULT: { "name": "ai-archives", "version": "1.0.0", "description": "AI Archives — only the leaps that matter",...
Glint-Research/Fable-5-traces
glint
End of preview. Expand in Data Studio

Fable-5 SFT — prepared for Qwable fine-tuning

4,659 single-turn pairs from Claude Fable-5 (Anthropic preview model, suspended globally 2026-06-22 under U.S. export-control directives), reformatted into a single-text-column parquet ready for SFTTrainer(dataset_text_field="text") + train_on_responses_only.

Composition:

  • 3,793 rows (81%) end in a <tool_use> block — agentic tool-call patterns
  • 866 rows (19%) end in a pure text response

This is agentic data, not pure reasoning data. The <think> blocks reason about code edits and tool calls, not about abstract problems. The narrow training distribution is essentially one developer''s week of Claude Code sessions plus assorted preview-tool work.

Used to train

🤖 lordx64/Qwable-v1Qwen3.6-35B-A3B warm-started from the Opus 4.7 distill, then SFT''d on this dataset for 2 epochs (582 steps) on a single H200. Final loss 0.7956 (last-20 avg).

GGUF quants: lordx64/Qwable-v1-GGUF (IQ4_XS, Q5_K_M, Q8_0).

What this repo adds on top of upstream

  1. Format: collapsed into a single text column with full Qwen chat template (system + user + <think>…</think> + visible response).
  2. Tool serialization: tool_use blocks rendered as <tool_use name="X" id="Y">…</tool_use>; tool_result rendered as <tool_result id="X">…</tool_result>. Custom XML envelope — NOT Qwen native <tool_call> tokens. Parses with a small regex.
  3. Filtering: dropped Claude Code''s <synthetic> rate-limit injections; dropped assistant turns with no thinking trace; dropped rows where CoT < 50 chars or output failed validation.
  4. Noise stripping: removed Claude Code slash-command meta blocks (<local-command-caveat>, <command-name>, <command-message>, <command-args>, <local-command-stdout>) and ANSI escape codes from user-side content.
  5. Secrets scrubbed: 204 occurrences of 2 active Groq API keys redacted from upstream session JSONLs (Claude Code captured them via Read of .env files in the original developer''s repo).
  6. Dedup: SHA-256 over user-turn content (6 dups removed).
  7. Packaging: single parquet under data/train-*.parquet for HF datasets auto-detection.

The source data and CoT credit go entirely to the upstream creators — we just shaped it for training.

Provenance

Source Rows Chars
Glint-Research/Fable-5-traces 4,659 48,772,554

Total: 48,772,554 chars (~12,193,138 Qwen tokens), avg ~2.6k tokens/row.

Note on the other Fable-5 sources

We initially planned a 3-source merge including armand0e/claude-fable-5-claude-code and victor/fable-5-boeing-747-trace. Both turned out to have 100% of thinking blocks redacted by Anthropic''s API (only the cryptographic signature is preserved; the cleartext thinking field is empty). This is consistent with Anthropic''s preview-model IP protection — useful for trace integrity verification, but not for SFT.

Only Glint-Research/Fable-5-traces ships cleartext CoT (their README notes they "added the CoT data" — likely they had a different API tier or synthesized post-hoc; quality is high-signal reasoning consistent with the final response).

Row format

<|im_start|>system
You are a helpful AI assistant.<|im_end|>
<|im_start|>user
{user_or_tool_result}<|im_end|>
<|im_start|>assistant
<think>
{fable5_thinking}
</think>

{fable5_response_with_tool_calls}<|im_end|>

License & terms

Inherits AGPL-3.0 from upstream Glint-Research/Fable-5-traces. The underlying content is output from Anthropic''s gated claude-fable-5 model (briefly available 2026-06-10 to 2026-06-22) — downstream users must verify compliance with Anthropic''s usage policies for their specific use case.

Downloads last month
128

Models trained or fine-tuned on lordx64/agentic-distill-fable-5-sft