Gemma 4 E4B — TIES merge of three instruct-subspace-masked adapters

This is a TIES merge of three independently-trained LoRA adapters for google/gemma-4-E4B-it, all of which were trained with an instruct-subspace gradient mask designed to prevent the fine-tune from overwriting the directions in weight space that carry instruction-following behavior.

The point of the merge is to combine multiple style/domain signals (conversational marvin SFT + marvin CPT + spring-dragon adventure CPT) into a single adapter-on-base that retains instruct-following performance.

Components

All three adapters were trained on google/gemma-4-E4B-it at LoRA r=64, α=16 with the instruct-subspace mask at rank 64 (mask rank = LoRA rank).

Slot	Adapter	Source corpus	Samples	Tokens (1 ep)	Recipe
1	T2 INSTRUCT r=64	marvin (instruct form) — back-translated instruct pairs over the marvin corpus: a synthetic "write a passage about X" instruction generated for each real marvin passage, then paired as instruction → passage	4439	22.27M	1k ctx, 2 ep, lr 5e-5
2	T2 CPT-marvin r=64	marvin (raw) — a curated collection of genre-fiction novel prose, used as continued-pretraining (no chat template)	5701	33.83M	1k ctx, 2 ep, lr 5e-5
3	SD CPT r=64	Spring Dragon — old AI-Dungeon-era text-adventure transcripts (`> you do x` second-person interactive fiction), as CPT	2325	5.68M	6k ctx, 2 ep, lr 5e-5

Total tokens seen across all three components (counting 2 epochs each): ~123M.

Merge method

Standard TIES merge via PEFT's add_weighted_adapter with:

combination_type = "ties"
density = 0.2 # keep top 20% magnitude per parameter, zero the rest
weights = [1.0, 1.0, 1.0] # equal weighting

TIES (Yadav et al., 2023) reduces destructive interference between adapters by:

Trimming — keep only the top-density% magnitude weights per parameter, zero the rest
Sign resolution — for each parameter, pick the dominant sign across adapters and zero contributions of the opposite sign
Disjoint-mean merge — average only the surviving weights

The merged adapter was applied to base with merge_and_unload() and the result is the full safetensors model published here.

Instruct-subspace mask (what the components share)

Each component adapter was trained with a per-step gradient projection that strips out the directions of weight movement that overlap with the instruct-vs-base subspace. The subspace is computed once per linear weight matrix W via SVD:

U_k, V_k = top-k SVD of (W_instruct − W_pretrained)
g_safe = g − U_k (U_k.ᵀ g) − (g V_k.ᵀ) V_k # for each parameter gradient g

This is applied as a post_accumulate_grad_hook on every LoRA target weight before the optimizer step. The intuition: directions that distinguish E4B-it from E4B-pt carry instruction-following behavior; gradient components along those directions are dropped, so the fine-tune is constrained to learn style in the orthogonal complement of the instruct subspace.

Subspace-mask IFEval recovery (component-level results)

Across all three components, the mask preserves a large fraction of base-it IFEval performance that the unmasked recipe would have lost:

Recipe	Unmasked drop (pp)	Masked drop (pp)	Recovery
T2 INSTRUCT r=64	-14.97	-7.76	48%
T2 CPT-marvin r=64	-14.42	-8.50	41%
SD CPT r=64	-12.20	-6.84	44%

Base google/gemma-4-E4B-it strict-prompt IFEval = 83.36.

Merge-level evaluation

Variant	IFEval Δpp ↓	V3 humanness h_delta ↑*	Δh_delta vs base ↑	q_prob ↑	dialogue_frac ↑	slop/1k ↓	rep3g/1k ↓
base-it	0.00 (anchor)	-16.74	0.00 (anchor)	0.098	0.42	19.94	7.39
SD masked r=64	-6.84	-14.69	+2.05	0.121	0.42	17.77	12.17
SD masked r=256	-2.03	-15.46	+1.28	0.098	0.31	16.38	11.18
T2 INSTRUCT masked r=64	-7.76	-12.83	+3.91	0.112	0.41	13.96	21.05
TIES (this model)	-0.92	-13.10	+3.64	0.343	0.51	16.12	11.58

Metric definitions

These come from the GRPO-prototype eval pipeline (grpo_prototype_v12d.py), run over 10 short-story writing prompts at temp=1, then averaged per variant:

h_delta — output of the V3 humanness classifier (logit of marvin-real minus logit of gemma-emulation, per sample, then mean). Higher = output looks more like a real book passage from marvin than like Gemma 4 31B-it's same-prompt attempt. Always negative for any gemma-family model because the classifier is gemma-aware; the direction of movement vs base is the signal.
Δh_delta vs base — the movement in h_delta vs base E4B-it. Positive = the fine-tune is shifting outputs further from gemma's default and toward marvin-like prose.
q_prob — P(chosen) from a separate binary quality classifier (a ModernBERT-base fine-tuned on ~1500 DPO pairs from the marvin training pipeline, where "chosen" was the preferred response and "rejected" the worse one). Higher = the output is more like the kind of response a marvin-pipeline reward model would prefer over an alternative. Note this is independent of h_delta — h_delta asks "is this a real marvin passage or gemma's emulation?" while q_prob asks "would the marvin reward model prefer this over a typical bad output?" The two can move independently.
dialogue_frac — fraction of generated lines that contain dialogue (heuristic: presence of paired quote characters with a speaker-attribution shape nearby). Marvin/genre-fiction prose has a stable rate; gemma's default is to skip dialogue, so a rising fraction means the model is producing more novel-like scene structure rather than essay-summary structure.
slop/1k — count per 1000 tokens of n-grams from a curated AI-slop n-gram list (the "barely a whisper", "in a voice that was barely above a whisper", "the air was thick with", etc., compiled from observed LLM tics). Lower = less AI-ese.
rep3g/1k — count per 1000 tokens of repeated 3-grams within the sample (each 3-gram that occurs ≥2× contributes its repeat count). Lower = less local repetition. Gemma-family models can drift into repetitive loops on long generations; this catches that.

* The V3 humanness classifier is trained to discriminate real marvin passages (the curated genre-fiction corpus, see Components) from gemma's best attempt to emulate those passages given the same writing prompt. The "gemma side" was specifically generated with Gemma 4 31B-it (Q4_K_M GGUF, temp=1, no top-p/k/rep-pen). This model (E4B) is a different size than the classifier's reference gemma, but E4B is gemma-family and behaves similarly enough that V3 is still measuring the right axis — a prior version of the classifier built against a non-gemma AI side gave weirdly uninformative scores on gemma fine-tunes; V3 fixes that. So h_delta reads as: "is this output more like an actual book passage from the marvin corpus, or more like gemma's slop version of one?" Higher = more book-like. The single-adapter components score around -12 to -15. The TIES merge scoring -13.10 is in the same range as the singles, which means it is not just averaging back toward base; it is preserving the style signal of the components.

Most striking single number: TIES q_prob = 0.343 vs ~0.10 for any single adapter and base — the V3 classifier's "marvin-style" head fires ~3× more strongly on TIES outputs than on any constituent. Combined with dialogue_frac jumping to 0.51 (vs base 0.42, singles 0.41-0.44) this suggests genuine emergent style mixing rather than cancellation.

Caveats

V3 humanness classifier is task-specific, NOT a general AI-vs-human detector. It is binary: real marvin-corpus book prose vs. gemma's same-prompt output. So a high h_delta only tells you the model's output looks more like a real book passage than gemma's default style — it does not generalize to "more human." Direction-of-movement from base is the load-bearing signal.
IFEval was run on the merged adapter applied to base-it; the -0.92pp drop is the merged-model drop, not the sum of component drops.
No SFT data leak control: each component adapter had access to its own training corpus. There is no held-out evaluation set for component-level style metrics beyond the standard IFEval prompts.
TIES density=0.2 was a single shot, not tuned. Lower/higher density values were not swept.
The three components were chosen because they were what existed at the time (the same r=64 mask rank); this is not a designed-for-merge selection.

Files

Path	Size	Description
`model.safetensors`	15.9 GB	Full bf16 merged weights (base + applied TIES adapter)
`config.json`	—	HF config
`tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`	—	Gemma 4 tokenizer + chat template

A bf16.gguf is also published at ToastyPigeon/Gemma4-Test-GGUFs/Gemma4-E4B-TIES-3mask-r64.bf16.gguf for llama.cpp use.

Use

Standard AutoModelForCausalLM.from_pretrained(...) works; this is a normal merged Gemma 4 E4B-it model. Use the same chat template as base google/gemma-4-E4B-it. Sampling: temperature=1.0, no top-p / top-k / rep-penalty (gemma-4 defaults).

Provenance

Trained and merged on a Threadripper 3960X + 2× RTX 3090 box, May 2026
Mask SVD computed once from gemma-4-E4B-it − gemma-4-E4B-pt at rank 64/256
Author: ToastyPigeon

Citation

If you use this technique:

@misc{toastypigeon2026instructmask,
 author = {ToastyPigeon},
 title = {Instruct-subspace gradient mask for style fine-tunes of instruction-tuned models},
 year = {2026},
 note = {Method: per-parameter gradient projection orthogonal to the top-k SVD of (W_instruct - W_pretrained)}
}

TIES reference:

@article{yadav2023ties,
 title = {TIES-Merging: Resolving Interference When Merging Models},
 author = {Yadav, Prateek and Tam, Derek and Choshen, Leshem and Raffel, Colin and Bansal, Mohit},
 journal = {NeurIPS},
 year = {2023}
}

Downloads last month: 8

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for ToastyPigeon/Gemma4-E4B-TIES-3mask-r64

Base model

google/gemma-4-E4B

Finetuned

google/gemma-4-E4B-it

Adapter

(117)

this model

Adapters

2 models

URL: https://huggingface.co/ToastyPigeon/Gemma4-E4B-TIES-3mask-r64

⇱ ToastyPigeon/Gemma4-E4B-TIES-3mask-r64 · Hugging Face