URL: https://huggingface.co/sthenno/tempesthenno-nuslerp-001

A newer version of this model is available: sthenno/tempesthenno-14b-nuslerp-0111

tempesthenno--nuslerp

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the NuSLERP merge method.
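For orientation, the core idea behind a SLERP-style merge can be sketched as spherical linear interpolation between two weight tensors. This is the textbook formula written with NumPy, not mergekit's actual NuSLERP implementation; the function name `slerp` and the `eps` threshold are illustrative assumptions. Here the tensors are flattened to single vectors. As far as the configuration suggests, mergekit's `nuslerp_flatten` and `nuslerp_row_wise` options control whether interpolation is instead applied per row or per column.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Textbook spherical linear interpolation between two tensors.

    t = 0 returns v0, t = 1 returns v1. Both tensors are treated as
    flat vectors (an assumption made for this sketch).
    """
    a = np.asarray(v0, dtype=np.float64).ravel()
    b = np.asarray(v1, dtype=np.float64).ravel()

    # Angle between the two vectors, computed from their unit directions.
    a_unit = a / (np.linalg.norm(a) + eps)
    b_unit = b / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    theta = np.arccos(dot)
    sin_theta = np.sin(theta)

    if sin_theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        out = (1.0 - t) * a + t * b
    else:
        s0 = np.sin((1.0 - t) * theta) / sin_theta
        s1 = np.sin(t * theta) / sin_theta
        out = s0 * a + s1 * b

    return out.reshape(np.shape(v0))

# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, np.array([1.0, 0.0]), np.array([0.0, 1.0]))
print(midpoint, np.linalg.norm(midpoint))
```

Unlike plain averaging, the interpolated vector keeps unit norm when both inputs have unit norm, which is the usual motivation for SLERP-based merging.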

Models Merged

The following models were included in the merge:

  • /Users/sthenno/models/tempesthenno--converge-dtask
  • /Users/sthenno/models/tempesthenno--converge-breadcrumbs

Configuration

The following YAML configuration was used to produce this model:

name: tempesthenno--nuslerp
merge_method: nuslerp
tokenizer:
  source: /Users/sthenno/models/tempesthenno--converge-dtask
chat_template: "chatml"
dtype: float32
out_dtype: bfloat16
parameters:
  int8_mask: false
  normalize: true
  rescale: false
slices:
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [0, 8]
        parameters:
          weight: 0.65
          nuslerp_flatten: false
          nuslerp_row_wise: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [0, 8]
        parameters:
          weight: 0.35
          nuslerp_flatten: false
          nuslerp_row_wise: true
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [8, 16]
        parameters:
          weight: 0.60
          nuslerp_flatten: false
          nuslerp_row_wise: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [8, 16]
        parameters:
          weight: 0.40
          nuslerp_flatten: false
          nuslerp_row_wise: true
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [16, 24]
        parameters:
          weight: 0.55
          nuslerp_flatten: false
          nuslerp_row_wise: false
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [16, 24]
        parameters:
          weight: 0.45
          nuslerp_flatten: false
          nuslerp_row_wise: false
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [24, 32]
        parameters:
          weight: 0.50
          nuslerp_flatten: false
          nuslerp_row_wise: false
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [24, 32]
        parameters:
          weight: 0.50
          nuslerp_flatten: false
          nuslerp_row_wise: false
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [32, 40]
        parameters:
          weight: 0.45
          nuslerp_flatten: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [32, 40]
        parameters:
          weight: 0.55
          nuslerp_flatten: true
  - sources:
      - model: /Users/sthenno/models/tempesthenno--converge-dtask
        layer_range: [40, 48]
        parameters:
          weight: 0.40
          nuslerp_flatten: true
      - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
        layer_range: [40, 48]
        parameters:
          weight: 0.60
          nuslerp_flatten: true
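The configuration shifts weight gradually from the converge-dtask model in the early layers to the converge-breadcrumbs model in the late layers. The sketch below tabulates that schedule. It assumes that, for two models, the pair of weights reduces to a single interpolation factor t = w_breadcrumbs / (w_dtask + w_breadcrumbs); that reading of the `weight` parameter is an assumption here, and the names `SLICES`, `interpolation_factor`, and `schedule` are illustrative only.

```python
# (layer_range, weight for converge-dtask, weight for converge-breadcrumbs),
# copied from the configuration above.
SLICES = [
    ((0, 8), 0.65, 0.35),
    ((8, 16), 0.60, 0.40),
    ((16, 24), 0.55, 0.45),
    ((24, 32), 0.50, 0.50),
    ((32, 40), 0.45, 0.55),
    ((40, 48), 0.40, 0.60),
]

def interpolation_factor(w_a, w_b):
    """Fraction of the way from model A to model B (assumed normalization)."""
    return w_b / (w_a + w_b)

def schedule():
    """Return (layer_range, t) pairs, with t rounded to two decimals."""
    return [
        (layers, round(interpolation_factor(w_a, w_b), 2))
        for layers, w_a, w_b in SLICES
    ]

for (start, end), t in schedule():
    # layer_range is half-open, so the last layer in a slice is end - 1.
    print(f"layers {start:2d}-{end - 1:2d}: t toward converge-breadcrumbs = {t:.2f}")
```

Under this assumption, t rises in steps of 0.05 from 0.35 in layers 0-7 to 0.60 in layers 40-47.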

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                 Value
Avg.                   39.94
IFEval (0-Shot)        79.26
BBH (3-Shot)           51.04
MATH Lvl 5 (4-Shot)    31.72
GPQA (0-shot)          16.44
MuSR (0-shot)          13.88
MMLU-PRO (5-shot)      47.30
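The reported average is consistent with an unweighted arithmetic mean of the six benchmark scores. A quick check, using only the numbers listed above (the names `scores` and `average` are illustrative):

```python
# Benchmark scores copied from the results above.
scores = {
    "IFEval (0-Shot)": 79.26,
    "BBH (3-Shot)": 51.04,
    "MATH Lvl 5 (4-Shot)": 31.72,
    "GPQA (0-shot)": 16.44,
    "MuSR (0-shot)": 13.88,
    "MMLU-PRO (5-shot)": 47.30,
}

def average(values):
    """Unweighted arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)

avg = round(average(scores.values()), 2)
print(avg)  # 39.94
```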
Model size: 15B params (Safetensors)
Tensor type: BF16
