[Image: A beautiful witch writing a book with a quill]
Image by CalamitousFelicitouness

Gemma-2-9B Sugarquill v0

An experimental continued pretrain of Gemma-2-9B-It-SPPO-Iter3 on assorted short story data from the web. I was trying to diversify Gemma's prose without completely destroying its smarts. I think I half-succeeded? This model could have used another epoch of training, but it is already more creative and descriptive than its base model, without becoming too silly. It doesn't seem to have degraded much in terms of core abilities either. It should be usable both for RP and for raw-completion storywriting. I originally planned to use this in a merge, but I feel this model is interesting enough to be released on its own as well.

Model was trained by Auri.

Dedicated to Cahvay, who has wanted a Gemma finetune from me for months now, and to La Rata, who loves storywriter models.

GGUFs by Prodeus: https://huggingface.co/allura-org/G2-9B-Sugarquill-v0-GGUF

Training notes

This model was trained for 2 epochs on 10k rows (~18.7M tokens), taken equally from the Erebus-87k and r_shortstories_24k datasets. It was trained on an 8xH100 SXM node for 30 minutes with rsLoRA. I got complete nonsense reported to my wandb during this run, and logging stopped altogether after step 13 for some reason. This seems to be directly related to Gemma, as my training setup worked flawlessly for Qwen. Thanks to Kearm for helping set up LF on that node, and to Featherless for providing it for EVA-Qwen2.5 (and, unknowingly, this model lol) training.
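For readers unfamiliar with rsLoRA: as far as I understand it, the only change from plain LoRA is the scaling factor applied to the adapter update, alpha / sqrt(r) instead of alpha / r. A minimal sketch of that difference follows. The function names and the alpha and rank values are made up for illustration; they are not the settings used for this run.

```python
import math


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling: the adapter update is multiplied by alpha / r."""
    return alpha / rank


def rslora_scale(alpha: float, rank: int) -> float:
    """Rank-stabilized LoRA scaling: the update is multiplied by alpha / sqrt(r)."""
    return alpha / math.sqrt(rank)


if __name__ == "__main__":
    # Hypothetical values, NOT this model's actual training config.
    alpha, rank = 16.0, 64
    print("LoRA scale:  ", lora_scale(alpha, rank))
    print("rsLoRA scale:", rslora_scale(alpha, rank))
```

With the plain LoRA factor, the effective update shrinks quickly as the rank grows; the square-root version shrinks more slowly, which is the usual argument for rsLoRA at higher ranks.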

Format

The model responds to Gemma instruct formatting, exactly like its base model.

<bos><start_of_turn>user
{user message}<end_of_turn>
<start_of_turn>model
{response}<end_of_turn><eos>
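The template above can be assembled by hand if your frontend does not do it for you. Below is a minimal sketch that builds an inference prompt from a list of messages. The helper name `format_gemma_prompt` and the message dict layout are assumptions for illustration, not part of any library. It places a newline after each `<end_of_turn>`, matching the layout shown above, and it leaves the final model turn open so the model can generate the response.

```python
def format_gemma_prompt(messages, add_generation_prompt=True):
    """Build a Gemma-instruct style prompt string.

    `messages` is a list of dicts with "role" ("user", "assistant" or "model")
    and "content" keys. This layout is a hypothetical convention for the sketch.
    """
    parts = ["<bos>"]
    for message in messages:
        # Gemma's template names the assistant side "model".
        role = "model" if message["role"] in ("assistant", "model") else "user"
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    if add_generation_prompt:
        # Leave the model turn open so generation continues from here.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = format_gemma_prompt(
        [{"role": "user", "content": "Write a short story about a witch and her quill."}]
    )
    print(prompt)
```

Many tokenizers add `<bos>` on their own, so if you tokenize this string with special tokens enabled, you may want to drop the leading `<bos>` to avoid doubling it.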

Training config

Downloads last month: 23

Safetensors

Model size: 9B params
Tensor type: BF16

Model tree for allura-org/G2-9B-Sugarquill-v0

Base model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
Finetuned (8): this model
Merges: 6 models
Quantizations: 5 models

URL: https://huggingface.co/allura-org/G2-9B-Sugarquill-v0
