URL: https://huggingface.co/beberik/Nyxene-11B

beberik/Nyxene-11B · Hugging Face


Description

This repo contains the bf16 files of Nyxene-11B. It is like OmniMix, but built with new models.

Model used

Prompt template

The best one after further testing is this one:

<|system|>
Below is an instruction that describes a task. Write a response that appropriately completes the request.
<|user|>
{prompt}
<|assistant|>
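
As a minimal sketch, the template above can be filled in with a small helper. The function name `build_prompt` and the trailing newline after `<|assistant|>` are assumptions for illustration; the model card only specifies the tag order and the system text.

```python
# System text copied from the prompt template in the model card.
SYSTEM_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_prompt(prompt: str, system: str = SYSTEM_PROMPT) -> str:
    """Fill the template with a user prompt (hypothetical helper)."""
    # Assumption: each tag sits on its own line, and generation
    # starts on the line after <|assistant|>.
    return f"<|system|>\n{system}\n<|user|>\n{prompt}\n<|assistant|>\n"


if __name__ == "__main__":
    print(build_prompt("Explain what a passthrough merge does."))
```

The returned string can be passed to whatever tokenizer and generation call you already use.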

The secret sauce

dolphin-juanako-11B:

slices:
  - sources:
      - model: fblgit/juanako-7b-UNA
        layer_range: [0, 24]
  - sources:
      - model: ehartford/dolphin-2.1-mistral-7b
        layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16

Starling-NeuralHermes-11B:

slices:
  - sources:
      - model: berkeley-nest/Starling-LM-7B-alpha
        layer_range: [0, 24]
  - sources:
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16
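
To see why both passthrough merges end up with 48 layers, here is a rough sketch that stacks the slices in order. It assumes `layer_range` is half-open (start included, end excluded); `stacked_layers` is a hypothetical helper, not part of mergekit.

```python
from typing import List, Tuple

# (model name, start layer, end layer), end excluded by assumption.
Slice = Tuple[str, int, int]


def stacked_layers(slices: List[Slice]) -> List[Tuple[str, int]]:
    """List the (model, source layer index) behind each merged layer."""
    layers: List[Tuple[str, int]] = []
    for model, start, end in slices:
        layers.extend((model, i) for i in range(start, end))
    return layers


dolphin_juanako = stacked_layers([
    ("fblgit/juanako-7b-UNA", 0, 24),
    ("ehartford/dolphin-2.1-mistral-7b", 8, 32),
])

print(len(dolphin_juanako))  # 48
# Merged layer 23 is the last juanako layer; layer 24 is dolphin's layer 8.
print(dolphin_juanako[23], dolphin_juanako[24])
```

Under that assumption, layers 8 to 23 appear twice in the stack, once from each source model, which is how two 32-layer models become one 48-layer model.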

Nyxene-11B:

slices:
  - sources:
      - model: dolphin-juanako-11B
        layer_range: [0, 48]
      - model: Starling-NeuralHermes-11B
        layer_range: [0, 48]
merge_method: slerp
base_model: dolphin-juanako-11B
parameters:
  t:
    - filter: lm_head
      value: [0.75]
    - filter: embed_tokens
      value: [0.75]
    - filter: self_attn
      value: [0.75, 0.25]
    - filter: mlp
      value: [0.25, 0.75]
    - filter: layernorm
      value: [0.5, 0.5]
    - filter: modelnorm
      value: [0.75]
    - value: 0.5 # fallback for rest of tensors
dtype: bfloat16
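
The slerp config above can be read roughly as follows. This is a simplified pure-Python sketch, not mergekit's implementation: it assumes that `t = 0` returns the base model's tensor and `t = 1` the other model's, and that a list such as `[0.75, 0.25]` is interpolated linearly from the first layer to the last. `gradient_t` and `slerp` are hypothetical names.

```python
import math
from typing import List, Sequence


def gradient_t(anchors: Sequence[float], layer: int, num_layers: int) -> float:
    """Linearly interpolate the anchor values across the layer stack."""
    if len(anchors) == 1 or num_layers == 1:
        return anchors[0]
    pos = layer / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return anchors[lo] * (1 - frac) + anchors[lo + 1] * frac


def slerp(t: float, a: List[float], b: List[float], eps: float = 1e-8) -> List[float]:
    """Spherical interpolation between two flattened weight vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a < eps or norm_b < eps:
        # Zero vector: the angle is undefined, so use plain lerp.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    dot = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    theta = math.acos(max(-1.0, min(1.0, dot)))
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly colinear vectors: fall back to plain lerp.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * theta) / sin_theta
    wb = math.sin(t * theta) / sin_theta
    return [wa * x + wb * y for x, y in zip(a, b)]


# self_attn uses [0.75, 0.25]: the first layer leans toward the second
# model, the last layer leans toward the base model (under the
# assumptions stated above).
print(gradient_t([0.75, 0.25], 0, 48), gradient_t([0.75, 0.25], 47, 48))
```

Read this way, the `self_attn` and `mlp` filters use mirrored gradients, so each half of the stack takes its attention weights mostly from one parent and its MLP weights mostly from the other.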

I used mergekit for all the merges described here.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 67.72 |
| AI2 Reasoning Challenge (25-Shot) | 68.34 |
| HellaSwag (10-Shot) | 84.54 |
| MMLU (5-Shot) | 65.09 |
| TruthfulQA (0-shot) | 57.50 |
| Winogrande (5-shot) | 79.08 |
| GSM8k (5-shot) | 51.78 |
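
As a quick sanity check, the reported average is the unweighted mean of the six benchmark scores listed above. The dictionary below simply copies those values.

```python
# Scores copied from the results listed above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 68.34,
    "HellaSwag (10-Shot)": 84.54,
    "MMLU (5-Shot)": 65.09,
    "TruthfulQA (0-shot)": 57.50,
    "Winogrande (5-shot)": 79.08,
    "GSM8k (5-shot)": 51.78,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 67.72
```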