Gemma4-Gutenberg-31B-Heretic

coder3101/gemma-4-31B-it-heretic finetuned for literary, novelistic prose — a Gemma 4 entry in the Gutenberg series.

This pushes the (already strong) base toward a literary-fiction register: story and interiority over static description, controlled pacing over relentless adjective-stacking, and an active dispreference for "AI slop" phrasing.

Heretic variant

This is the Heretic (abliterated / decensored) base with the same Gutenberg ORPO adapter merged in. The literary voice is identical to the standard Gemma4-Gutenberg-31B — the adapter dominates the prose style regardless of base — but the abliterated base is far less prone to refuse dark or mature creative prompts. Intended for fiction that the standard instruct model balks at. Use responsibly.
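What "merged in" means mechanically can be sketched with a toy example. A LoRA adapter stores two low-rank matrices per tuned weight, and merging folds their scaled product back into the base weight so no separate adapter is needed at inference. The sketch below uses plain Python lists and a rank of 1 for readability; the helper names, the scaling factor `alpha`, and all matrix values are illustrative assumptions, not details taken from this card.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the usual LoRA merge."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Toy 2x2 base weight with a rank-1 adapter (all values are made up).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[0.5], [0.0]]   # shape (out_features, rank)
lora_a = [[0.0, 2.0]]     # shape (rank, in_features)
merged = merge_lora(base, lora_b, lora_a, alpha=1.0, rank=1)
```

Because the merge is a plain addition to the base weight, the same adapter delta can in principle be folded into either the standard or the abliterated base.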

Method

ORPO (Odds Ratio Preference Optimization) on the full schneewolflabs/Athanorlite-DPO dataset (14,816 preference pairs). Athanorlite-DPO is a superset of the Gutenberg "Encore" recipe: it bundles jondurbin/gutenberg-dpo-v0.1, nbeerbower/gutenberg2-dpo, gutenberg-moderne-dpo, human-writing-dpo, synthetic-fiction-dpo, Arkhaios-DPO, Purpura-DPO, Schule-DPO, and sam-paech/gutenberg3, plus truthy, physical-reasoning, and theory-of-mind balance sets.
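For readers unfamiliar with ORPO, the objective adds an odds-ratio penalty to the ordinary supervised loss on the chosen text. The sketch below follows the formulation in the ORPO paper for a single preference pair, working from length-averaged log-probabilities. The function names are hypothetical, and whether the trainer used here matches this term for term is an assumption; `beta` defaults to the 0.1 reported for this run.

```python
import math

def log_odds(avg_logp: float) -> float:
    """log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, beta: float = 0.1):
    """Return (loss, log_odds_ratio) for one chosen/rejected pair."""
    # Supervised term: negative log-likelihood of the chosen response.
    nll = -chosen_avg_logp
    # Odds-ratio term: how much more likely the chosen text is, in log-odds.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    or_term = math.log1p(math.exp(-log_odds_ratio))
    return nll + beta * or_term, log_odds_ratio

# A pair where the chosen text is already more likely than the rejected one.
loss, ratio = orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-1.5)
```

A positive `ratio` means the model already prefers the chosen text; the penalty shrinks as that gap widens.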


Setting	Value
Method	ORPO, β = 0.1
Adapter	LoRA r=64 (text decoder only), merged to full
LR	5e-5, cosine, 0.05 warmup
Epochs	1
Effective batch	32
Max length	2048
Optimizer	paged_adamw_8bit, bf16, grad-checkpointing
Hardware	1× NVIDIA GB10 (DGX Spark, 128 GB unified)
Trainer	Merlina (grimoire ORPO)
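To make the learning-rate row concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the peak rate of 5e-5, the 0.05 warmup ratio, and the 454 steps reported for this run. Rounding the warmup length up to a whole step and decaying all the way to zero are assumptions; the trainer's exact schedule may differ.

```python
import math

PEAK_LR = 5e-5
TOTAL_STEPS = 454        # step count reported for this run
WARMUP_RATIO = 0.05
# Rounding up to 23 steps is an assumption, not stated in the card.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (0-indexed)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak rate down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

schedule = [lr_at(s) for s in range(TOTAL_STEPS + 1)]
```

Under these assumptions the rate peaks at the end of warmup and falls smoothly for the rest of the single epoch.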

Training trajectory (clean convergence over ~3.5 days, 454 steps):

Metric	start	end
eval/loss	2.4445	2.2052
reward_accuracy	0.175	0.9125
reward_margin	−0.076	+0.455

The reward_accuracy arc (0.18 → 0.50 → 0.91) reflects the model learning to prefer the literary chosen text while actively suppressing the rejected slop — the intended Gutenberg dynamic.
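As a rough guide to reading these numbers, the sketch below computes both metrics the way common preference trainers define them: each response gets an implicit reward of `beta` times its log-probability, accuracy is the fraction of pairs where the chosen reward exceeds the rejected one, and margin is the mean difference. That this trainer uses exactly these definitions is an assumption, and the log-probabilities are made-up toy values.

```python
def reward_metrics(chosen_logps, rejected_logps, beta=0.1):
    """Return (reward_accuracy, reward_margin) over a batch of pairs."""
    chosen = [beta * lp for lp in chosen_logps]
    rejected = [beta * lp for lp in rejected_logps]
    margins = [c - r for c, r in zip(chosen, rejected)]
    # Fraction of pairs where the chosen text earns the higher reward.
    accuracy = sum(m > 0 for m in margins) / len(margins)
    # Average gap between chosen and rejected rewards.
    margin = sum(margins) / len(margins)
    return accuracy, margin

# Four toy pairs; the second one still favors the rejected text.
acc, margin = reward_metrics(
    chosen_logps=[-1.0, -2.0, -0.5, -3.0],
    rejected_logps=[-2.0, -1.5, -1.5, -4.0],
)
```

On this reading, an accuracy below 0.5 with a negative margin at the start means the untuned model scored the rejected text higher on most pairs.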

Notes

Gemma 4 31B is a unified multimodal model; the vision/audio towers are left frozen and intact, so this remains a drop-in replacement for the base. Only the text decoder was tuned.
This is a refinement of an already-capable writer, not a rescue — expect a consistent literary lean rather than a night-and-day transformation.

License

Apache-2.0 (matching the Gemma 4 base). Constituent training datasets carry their own licenses (see the Athanorlite-DPO card).


Model size: 31B params
Tensor type: BF16 (Safetensors)

Model tree for nbeerbower/Gemma4-Gutenberg-31B-Heretic

Base model: google/gemma-4-31B
Finetuned: google/gemma-4-31B-it
Finetuned: coder3101/gemma-4-31B-it-heretic
Finetuned (2): this model
Quantizations: 5 models

URL: https://huggingface.co/nbeerbower/Gemma4-Gutenberg-31B-Heretic
