VOOZH about

URL: https://huggingface.co/ManniX-ITA/Qwen3.5-27B-Omnimerge-v2

⇱ ManniX-ITA/Qwen3.5-27B-Omnimerge-v2 · Hugging Face


Qwen3.5-27B-Omnimerge-v2

An improved 3-way weight-space merge of Qwen3.5-27B reasoning-distilled fine-tunes using the Omnimerge v2 method — combining four recent advances in model merging.

GGUF quantizations available at ManniX-ITA/Qwen3.5-27B-Omnimerge-v2-GGUF

Benchmark Results (Q6_K)

Benchmark Omnimerge v1 Omnimerge v2 Delta
GPQA Diamond (198q, flex) 61.11% 69.19% +8.08 pp
MBPP pass@1 71.80% 74.60% +2.80 pp
HumanEval pass@1 79.88% 79.27% -0.61 pp

vs Best Source Model (Claude-distill)

Benchmark Claude-distill Omnimerge v2 Delta
GPQA Diamond (198q, flex) 53.03% 69.19% +16.16 pp
MBPP pass@1 71.20% 74.60% +3.40 pp
HumanEval pass@1 76.22% 79.27% +3.05 pp

Method: Omnimerge v2

Four enhancements over standard DARE-TIES (v1):

  1. OBIM-lite magnitude masking (based on OBIM, arXiv 2502.12217): Deterministic top-k masking by |delta| magnitude instead of random Bernoulli drop. Keeps the most informative parameter changes.

  2. DAREx rescaling (based on DAREx, arXiv 2410.09344, ICLR 2025): Survivors divided by configurable q instead of density. Lower variance than standard DARE rescaling.

  3. EMR election (based on EMR-Merging, arXiv 2405.17461, NeurIPS 2024): Sign from weighted-sum consensus, amplitude from max abs across sources. Each parameter gets the strongest signal from whichever source specialized most.

The merge script also supports GPU-accelerated computation (chunks offloaded to CUDA for ~35x speedup over CPU-only).

Not yet implemented (available in the script for future iterations):

  • Fisher weighting (based on Fisher-Merging, Matena & Raffel 2022): Per-parameter adaptive weighting using diagonal Fisher information. Requires a calibration pre-computation step per source model. Currently uses fixed source weights.

Merge Configuration

python dare_ties_merge.py \
 --base Qwen/Qwen3.5-27B \
 --source Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled \
 --source ValiantLabs/Qwen3.5-27B-Esper3.1 \
 --source Jackrong/Qwen3.5-27B-Gemini-3.1-Pro-Reasoning-Distill \
 --method omnimerge_v2 --density 0.53 --weights 0.40,0.35,0.25 \
 --darex-q 0.75 --seed 42

Source Models

Source Weight Focus
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled 0.40 Claude 4.6 Opus reasoning distillation
ValiantLabs/Qwen3.5-27B-Esper3.1 0.35 Code / DevOps specialist
Jackrong/Qwen3.5-27B-Gemini-3.1-Pro-Reasoning-Distill 0.25 Gemini 3.1 Pro reasoning distillation

Base: Qwen/Qwen3.5-27B

Usage

llama.cpp (recommended)

llama-server -m Qwen3.5-27B-Omnimerge-v2-Q6_K.gguf -c 32768 -ngl 99 \
 --reasoning-format deepseek --reasoning-budget 16384 \
 --temp 0.6 --top-p 0.95 --top-k 20 --dry-multiplier 0.5

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
 "ManniX-ITA/Qwen3.5-27B-Omnimerge-v2",
 torch_dtype=torch.bfloat16,
 device_map="auto",
)
tok = AutoTokenizer.from_pretrained("ManniX-ITA/Qwen3.5-27B-Omnimerge-v2")

Related Models

Model Description
Qwen3.5-27B-Omnimerge v1 (DARE-TIES baseline)
Qwen3.5-27B-Omnimerge-GGUF v1 GGUF quants
Qwen3.5-27B-Omnimerge-v2-GGUF v2 GGUF quants

License

Apache-2.0

Downloads last month
29
Safetensors
Model size
28B params
Tensor type
BF16
·
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ManniX-ITA/Qwen3.5-27B-Omnimerge-v2

Base model

Qwen/Qwen3.5-27B
Finetuned
(279)
this model
Quantizations
1 model

Papers for ManniX-ITA/Qwen3.5-27B-Omnimerge-v2