Qwen3.5-27B-Omnimerge-v2

An improved 3-way weight-space merge of Qwen3.5-27B reasoning-distilled fine-tunes using the Omnimerge v2 method — combining four recent advances in model merging.

GGUF quantizations available at ManniX-ITA/Qwen3.5-27B-Omnimerge-v2-GGUF

Benchmark Results (Q6_K)

Benchmark	Omnimerge v1	Omnimerge v2	Delta
GPQA Diamond (198q, flex)	61.11%	69.19%	+8.08 pp
MBPP pass@1	71.80%	74.60%	+2.80 pp
HumanEval pass@1	79.88%	79.27%	-0.61 pp

vs Best Source Model (Claude-distill)

Benchmark	Claude-distill	Omnimerge v2	Delta
GPQA Diamond (198q, flex)	53.03%	69.19%	+16.16 pp
MBPP pass@1	71.20%	74.60%	+3.40 pp
HumanEval pass@1	76.22%	79.27%	+3.05 pp

Method: Omnimerge v2

Four enhancements over standard DARE-TIES (v1):

OBIM-lite magnitude masking (based on OBIM, arXiv 2502.12217): Deterministic top-k masking by |delta| magnitude instead of random Bernoulli drop. Keeps the most informative parameter changes.
DAREx rescaling (based on DAREx, arXiv 2410.09344, ICLR 2025): Survivors divided by configurable q instead of density. Lower variance than standard DARE rescaling.
EMR election (based on EMR-Merging, arXiv 2405.17461, NeurIPS 2024): Sign from weighted-sum consensus, amplitude from max abs across sources. Each parameter gets the strongest signal from whichever source specialized most.

The merge script also supports GPU-accelerated computation (chunks offloaded to CUDA for ~35x speedup over CPU-only).

Not yet implemented (available in the script for future iterations):

Fisher weighting (based on Fisher-Merging, Matena & Raffel 2022): Per-parameter adaptive weighting using diagonal Fisher information. Requires a calibration pre-computation step per source model. Currently uses fixed source weights.

Merge Configuration

python dare_ties_merge.py \
 --base Qwen/Qwen3.5-27B \
 --source Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled \
 --source ValiantLabs/Qwen3.5-27B-Esper3.1 \
 --source Jackrong/Qwen3.5-27B-Gemini-3.1-Pro-Reasoning-Distill \
 --method omnimerge_v2 --density 0.53 --weights 0.40,0.35,0.25 \
 --darex-q 0.75 --seed 42

Source Models

Source	Weight	Focus
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled	0.40	Claude 4.6 Opus reasoning distillation
ValiantLabs/Qwen3.5-27B-Esper3.1	0.35	Code / DevOps specialist
Jackrong/Qwen3.5-27B-Gemini-3.1-Pro-Reasoning-Distill	0.25	Gemini 3.1 Pro reasoning distillation

Base: Qwen/Qwen3.5-27B

Usage

llama.cpp (recommended)

llama-server -m Qwen3.5-27B-Omnimerge-v2-Q6_K.gguf -c 32768 -ngl 99 \
 --reasoning-format deepseek --reasoning-budget 16384 \
 --temp 0.6 --top-p 0.95 --top-k 20 --dry-multiplier 0.5

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
 "ManniX-ITA/Qwen3.5-27B-Omnimerge-v2",
 torch_dtype=torch.bfloat16,
 device_map="auto",
)
tok = AutoTokenizer.from_pretrained("ManniX-ITA/Qwen3.5-27B-Omnimerge-v2")

Related Models

Model	Description
Qwen3.5-27B-Omnimerge	v1 (DARE-TIES baseline)
Qwen3.5-27B-Omnimerge-GGUF	v1 GGUF quants
Qwen3.5-27B-Omnimerge-v2-GGUF	v2 GGUF quants

License

Apache-2.0

Downloads last month: 29

Safetensors

Model size

28B params

Tensor type

BF16

F32

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ManniX-ITA/Qwen3.5-27B-Omnimerge-v2

Base model

Qwen/Qwen3.5-27B

Finetuned

(279)

this model

Quantizations

1 model

Papers for ManniX-ITA/Qwen3.5-27B-Omnimerge-v2

Paper • 2502.12217 • Published Feb 17, 2025

Paper • 2410.09344 • Published Oct 12, 2024 • 1

Paper • 2405.17461 • Published Sep 27, 2024

Paper • 2111.09832 • Published Nov 18, 2021 • 1

URL: https://huggingface.co/ManniX-ITA/Qwen3.5-27B-Omnimerge-v2

⇱ ManniX-ITA/Qwen3.5-27B-Omnimerge-v2 · Hugging Face