Gemma4-Gutenberg-31B-Heretic
coder3101/gemma-4-31B-it-heretic finetuned for literary, novelistic prose — a Gemma 4 entry in the Gutenberg series.
The fine-tune pushes the (already strong) base toward a literary-fiction register: story and interiority over static description, controlled pacing over relentless adjective-stacking, and an active aversion to "AI slop" phrasing.
Heretic variant
This is the Heretic (abliterated / decensored) base with the same Gutenberg ORPO adapter merged in. The literary voice is identical to the standard Gemma4-Gutenberg-31B — the adapter dominates the prose style regardless of base — but the abliterated base is far less prone to refuse dark or mature creative prompts. Intended for fiction that the standard instruct model balks at. Use responsibly.
Method
ORPO (Odds Ratio Preference Optimization) on the full
schneewolflabs/Athanorlite-DPO
(14,816 preference pairs) — a superset of the Gutenberg "Encore" recipe that
bundles jondurbin/gutenberg-dpo-v0.1, nbeerbower/gutenberg2-dpo,
gutenberg-moderne-dpo, human-writing-dpo, synthetic-fiction-dpo,
Arkhaios-DPO, Purpura-DPO, Schule-DPO,
sam-paech/gutenberg3, plus truthy / physical-reasoning / theory-of-mind
balance sets.
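
As a rough illustration of the objective named above, here is a minimal per-pair ORPO loss in plain Python. It follows the formulation in the ORPO paper (a supervised term on the chosen text plus a weighted odds-ratio penalty). The function names, and the assumption that β is applied as the weight on the odds-ratio term, are illustrative and not taken from the Merlina trainer.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp is the length-normalised (mean per-token) log-probability
    of a sequence and must be strictly negative.
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, beta: float = 0.1) -> float:
    """Sketch of the ORPO loss for one preference pair (assumed weighting)."""
    # Supervised term: negative mean log-likelihood of the chosen text.
    nll = -chosen_avg_logp
    # Log odds ratio between the chosen and the rejected completion.
    ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_term = math.log1p(math.exp(-ratio))
    return nll + beta * odds_ratio_term
```

With the chosen log-probability held fixed, the loss shrinks as the rejected text becomes less likely, which is the pressure that pushes the model away from the rejected phrasing.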
| Setting | Value |
|---|---|
| Method | ORPO, β = 0.1 |
| Adapter | LoRA r=64 (text decoder only), merged to full |
| LR | 5e-5, cosine, 0.05 warmup |
| Epochs | 1 |
| Effective batch | 32 |
| Max length | 2048 |
| Optimizer | paged_adamw_8bit, bf16, grad-checkpointing |
| Hardware | 1× NVIDIA GB10 (DGX Spark, 128 GB unified) |
| Trainer | Merlina (grimoire ORPO) |
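
To make the learning-rate row concrete, the sketch below computes a linear warmup followed by cosine decay, using the peak rate (5e-5) and warmup ratio (0.05) from the table and the 454-step count reported in this card. The function name, the decay to exactly zero, and the rounding of the warmup length are assumptions; the actual trainer may round or floor differently.

```python
import math


def lr_at(step: int, total_steps: int = 454, peak_lr: float = 5e-5,
          warmup_ratio: float = 0.05) -> float:
    """Learning rate at a given optimizer step (illustrative schedule)."""
    # Assumed rounding: truncate the warmup length to an integer step count.
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the warmup lasts 22 steps, after which the rate falls smoothly for the rest of the single epoch.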
Training trajectory (clean convergence over ~3.5 days, 454 steps):
| | start | end |
|---|---|---|
| eval/loss | 2.4445 | 2.2052 |
| reward_accuracy | 0.175 | 0.9125 |
| reward_margin | −0.076 | +0.455 |
The reward_accuracy arc (0.18 → 0.50 → 0.91) reflects the model learning to
prefer the literary chosen text while actively suppressing the rejected
slop — the intended Gutenberg dynamic.
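
For readers unfamiliar with these two metrics, the following sketch shows one common way ORPO trainers derive them: each completion's "reward" is its mean token log-probability scaled by β, accuracy is the fraction of pairs where the chosen reward is higher, and margin is the mean difference. Whether Merlina computes them in exactly this way is an assumption, and the function name is illustrative.

```python
def reward_stats(chosen_logps, rejected_logps, beta=0.1):
    """Return (reward_accuracy, reward_margin) over a batch of preference pairs.

    chosen_logps / rejected_logps: mean per-token log-probabilities, one
    entry per pair, in matching order.
    """
    pairs = [(beta * c, beta * r) for c, r in zip(chosen_logps, rejected_logps)]
    # Fraction of pairs where the model scores the chosen text higher.
    accuracy = sum(c > r for c, r in pairs) / len(pairs)
    # Average gap between the chosen and the rejected reward.
    margin = sum(c - r for c, r in pairs) / len(pairs)
    return accuracy, margin
```

Read this way, an accuracy below 0.5 at the start of training means the base model initially scored the rejected text higher on most evaluation pairs, and a negative margin says the same thing on average.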
Notes
- Gemma 4 31B is a unified multimodal model; the vision/audio towers are left frozen and intact, so this remains a drop-in replacement for the base. Only the text decoder was tuned.
- This is a refinement of an already-capable writer, not a rescue — expect a consistent literary lean rather than a night-and-day transformation.
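
One way to restrict an adapter to the text decoder, as described in the first note, is to select target modules by name. The sketch below does this with a regular expression. The module paths (`language_model`, `vision_tower`, `audio_tower`) and the projection names are hypothetical placeholders and should be checked against the real checkpoint before use.

```python
import re

# Hypothetical naming scheme: text-decoder weights live under "language_model",
# and only attention/MLP projection layers receive LoRA adapters.
TEXT_DECODER_PATTERN = re.compile(
    r"(^|\.)language_model\..*\."
    r"(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)$"
)


def select_lora_targets(module_names):
    """Keep only the module names that belong to the text decoder projections."""
    return [name for name in module_names if TEXT_DECODER_PATTERN.search(name)]


example_modules = [
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.mlp.down_proj",
    "model.language_model.embed_tokens",
    "model.vision_tower.encoder.layers.0.self_attn.q_proj",
    "model.audio_tower.layers.0.self_attn.q_proj",
]
targets = select_lora_targets(example_modules)
```

Everything the pattern does not match, including the vision and audio towers, receives no adapter and so stays identical to the base weights after merging.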
License
Apache-2.0 (matching the Gemma 4 base). Constituent training datasets carry their own licenses (see the Athanorlite-DPO card).