VOOZH about

URL: https://huggingface.co/ToastyPigeon/Gemma4-E4B-TIES-3mask-r64

⇱ ToastyPigeon/Gemma4-E4B-TIES-3mask-r64 · Hugging Face


Gemma 4 E4B — TIES merge of three instruct-subspace-masked adapters

This is a TIES merge of three independently-trained LoRA adapters for google/gemma-4-E4B-it, all of which were trained with an instruct-subspace gradient mask designed to prevent the fine-tune from overwriting the directions in weight space that carry instruction-following behavior.

The point of the merge is to combine multiple style/domain signals (conversational marvin SFT + marvin CPT + spring-dragon adventure CPT) into a single adapter-on-base that retains instruct-following performance.

Components

All three adapters were trained on google/gemma-4-E4B-it at LoRA r=64, α=16 with the instruct-subspace mask at rank 64 (mask rank = LoRA rank).

Slot Adapter Source corpus Samples Tokens (1 ep) Recipe
1 T2 INSTRUCT r=64 marvin (instruct form) — back-translated instruct pairs over the marvin corpus: a synthetic "write a passage about X" instruction generated for each real marvin passage, then paired as instruction → passage 4439 22.27M 1k ctx, 2 ep, lr 5e-5
2 T2 CPT-marvin r=64 marvin (raw) — a curated collection of genre-fiction novel prose, used as continued-pretraining (no chat template) 5701 33.83M 1k ctx, 2 ep, lr 5e-5
3 SD CPT r=64 Spring Dragon — old AI-Dungeon-era text-adventure transcripts (> you do x second-person interactive fiction), as CPT 2325 5.68M 6k ctx, 2 ep, lr 5e-5

Total tokens seen across all three components (counting 2 epochs each): ~123M.

Merge method

Standard TIES merge via PEFT's add_weighted_adapter with:

combination_type = "ties"
density = 0.2 # keep top 20% magnitude per parameter, zero the rest
weights = [1.0, 1.0, 1.0] # equal weighting

TIES (Yadav et al., 2023) reduces destructive interference between adapters by:

  1. Trimming — keep only the top-density% magnitude weights per parameter, zero the rest
  2. Sign resolution — for each parameter, pick the dominant sign across adapters and zero contributions of the opposite sign
  3. Disjoint-mean merge — average only the surviving weights

The merged adapter was applied to base with merge_and_unload() and the result is the full safetensors model published here.

Instruct-subspace mask (what the components share)

Each component adapter was trained with a per-step gradient projection that strips out the directions of weight movement that overlap with the instruct-vs-base subspace. The subspace is computed once per linear weight matrix W via SVD:

U_k, V_k = top-k SVD of (W_instruct − W_pretrained)
g_safe = g − U_k (U_k.ᵀ g) − (g V_k.ᵀ) V_k # for each parameter gradient g

This is applied as a post_accumulate_grad_hook on every LoRA target weight before the optimizer step. The intuition: directions that distinguish E4B-it from E4B-pt carry instruction-following behavior; gradient components along those directions are dropped, so the fine-tune is constrained to learn style in the orthogonal complement of the instruct subspace.

Subspace-mask IFEval recovery (component-level results)

Across all three components, the mask preserves a large fraction of base-it IFEval performance that the unmasked recipe would have lost:

Recipe Unmasked drop (pp) Masked drop (pp) Recovery
T2 INSTRUCT r=64 -14.97 -7.76 48%
T2 CPT-marvin r=64 -14.42 -8.50 41%
SD CPT r=64 -12.20 -6.84 44%

Base google/gemma-4-E4B-it strict-prompt IFEval = 83.36.

Merge-level evaluation

Variant IFEval Δpp ↓ V3 humanness h_delta ↑* Δh_delta vs base ↑ q_prob ↑ dialogue_frac ↑ slop/1k ↓ rep3g/1k ↓
base-it 0.00 (anchor) -16.74 0.00 (anchor) 0.098 0.42 19.94 7.39
SD masked r=64 -6.84 -14.69 +2.05 0.121 0.42 17.77 12.17
SD masked r=256 -2.03 -15.46 +1.28 0.098 0.31 16.38 11.18
T2 INSTRUCT masked r=64 -7.76 -12.83 +3.91 0.112 0.41 13.96 21.05
TIES (this model) -0.92 -13.10 +3.64 0.343 0.51 16.12 11.58

Metric definitions

These come from the GRPO-prototype eval pipeline (grpo_prototype_v12d.py), run over 10 short-story writing prompts at temp=1, then averaged per variant:

  • h_delta — output of the V3 humanness classifier (logit of marvin-real minus logit of gemma-emulation, per sample, then mean). Higher = output looks more like a real book passage from marvin than like Gemma 4 31B-it's same-prompt attempt. Always negative for any gemma-family model because the classifier is gemma-aware; the direction of movement vs base is the signal.
  • Δh_delta vs base — the movement in h_delta vs base E4B-it. Positive = the fine-tune is shifting outputs further from gemma's default and toward marvin-like prose.
  • q_prob — P(chosen) from a separate binary quality classifier (a ModernBERT-base fine-tuned on ~1500 DPO pairs from the marvin training pipeline, where "chosen" was the preferred response and "rejected" the worse one). Higher = the output is more like the kind of response a marvin-pipeline reward model would prefer over an alternative. Note this is independent of h_delta — h_delta asks "is this a real marvin passage or gemma's emulation?" while q_prob asks "would the marvin reward model prefer this over a typical bad output?" The two can move independently.
  • dialogue_frac — fraction of generated lines that contain dialogue (heuristic: presence of paired quote characters with a speaker-attribution shape nearby). Marvin/genre-fiction prose has a stable rate; gemma's default is to skip dialogue, so a rising fraction means the model is producing more novel-like scene structure rather than essay-summary structure.
  • slop/1k — count per 1000 tokens of n-grams from a curated AI-slop n-gram list (the "barely a whisper", "in a voice that was barely above a whisper", "the air was thick with", etc., compiled from observed LLM tics). Lower = less AI-ese.
  • rep3g/1k — count per 1000 tokens of repeated 3-grams within the sample (each 3-gram that occurs ≥2× contributes its repeat count). Lower = less local repetition. Gemma-family models can drift into repetitive loops on long generations; this catches that.

* The V3 humanness classifier is trained to discriminate real marvin passages (the curated genre-fiction corpus, see Components) from gemma's best attempt to emulate those passages given the same writing prompt. The "gemma side" was specifically generated with Gemma 4 31B-it (Q4_K_M GGUF, temp=1, no top-p/k/rep-pen). This model (E4B) is a different size than the classifier's reference gemma, but E4B is gemma-family and behaves similarly enough that V3 is still measuring the right axis — a prior version of the classifier built against a non-gemma AI side gave weirdly uninformative scores on gemma fine-tunes; V3 fixes that. So h_delta reads as: "is this output more like an actual book passage from the marvin corpus, or more like gemma's slop version of one?" Higher = more book-like. The single-adapter components score around -12 to -15. The TIES merge scoring -13.10 is in the same range as the singles, which means it is not just averaging back toward base; it is preserving the style signal of the components.

Most striking single number: TIES q_prob = 0.343 vs ~0.10 for any single adapter and base — the V3 classifier's "marvin-style" head fires ~3× more strongly on TIES outputs than on any constituent. Combined with dialogue_frac jumping to 0.51 (vs base 0.42, singles 0.41-0.44) this suggests genuine emergent style mixing rather than cancellation.

Caveats

  • V3 humanness classifier is task-specific, NOT a general AI-vs-human detector. It is binary: real marvin-corpus book prose vs. gemma's same-prompt output. So a high h_delta only tells you the model's output looks more like a real book passage than gemma's default style — it does not generalize to "more human." Direction-of-movement from base is the load-bearing signal.
  • IFEval was run on the merged adapter applied to base-it; the -0.92pp drop is the merged-model drop, not the sum of component drops.
  • No SFT data leak control: each component adapter had access to its own training corpus. There is no held-out evaluation set for component-level style metrics beyond the standard IFEval prompts.
  • TIES density=0.2 was a single shot, not tuned. Lower/higher density values were not swept.
  • The three components were chosen because they were what existed at the time (the same r=64 mask rank); this is not a designed-for-merge selection.

Files

Path Size Description
model.safetensors 15.9 GB Full bf16 merged weights (base + applied TIES adapter)
config.json HF config
tokenizer.json, tokenizer_config.json, chat_template.jinja Gemma 4 tokenizer + chat template

A bf16.gguf is also published at ToastyPigeon/Gemma4-Test-GGUFs/Gemma4-E4B-TIES-3mask-r64.bf16.gguf for llama.cpp use.

Use

Standard AutoModelForCausalLM.from_pretrained(...) works; this is a normal merged Gemma 4 E4B-it model. Use the same chat template as base google/gemma-4-E4B-it. Sampling: temperature=1.0, no top-p / top-k / rep-penalty (gemma-4 defaults).

Provenance

  • Trained and merged on a Threadripper 3960X + 2× RTX 3090 box, May 2026
  • Mask SVD computed once from gemma-4-E4B-it − gemma-4-E4B-pt at rank 64/256
  • Author: ToastyPigeon

Citation

If you use this technique:

@misc{toastypigeon2026instructmask,
 author = {ToastyPigeon},
 title = {Instruct-subspace gradient mask for style fine-tunes of instruction-tuned models},
 year = {2026},
 note = {Method: per-parameter gradient projection orthogonal to the top-k SVD of (W_instruct - W_pretrained)}
}

TIES reference:

@article{yadav2023ties,
 title = {TIES-Merging: Resolving Interference When Merging Models},
 author = {Yadav, Prateek and Tam, Derek and Choshen, Leshem and Raffel, Colin and Bansal, Mohit},
 journal = {NeurIPS},
 year = {2023}
}
Downloads last month
8
Safetensors
Model size
8B params
Tensor type
BF16
·

Model tree for ToastyPigeon/Gemma4-E4B-TIES-3mask-r64

Adapter
(117)
this model
Adapters
2 models