URL: https://huggingface.co/BEE-spoke-data/smol_llama-220M-GQA

smol_llama: 220M GQA

A small decoder-only model with 220M total parameters. This is the first version of the model.

- 1024 hidden size, 10 layers
- GQA (32 attention heads, 8 key-value heads), context length 2048
- trained from scratch on one GPU :)
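To make the GQA numbers above concrete, here is a minimal sketch of the shape arithmetic they imply. It assumes the usual Llama convention that the per-head dimension is `hidden_size / num_heads` (the card does not state this) and BF16 storage at 2 bytes per value. The class name `GQAShape` and its methods are hypothetical helpers for illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GQAShape:
    """Hypothetical helper holding the sizes listed on the model card."""
    hidden_size: int = 1024
    num_layers: int = 10
    num_heads: int = 32        # query heads
    num_kv_heads: int = 8      # shared key-value heads
    context_length: int = 2048
    bytes_per_value: int = 2   # BF16

    @property
    def head_dim(self) -> int:
        # Assumption: Llama-style head_dim = hidden_size / num_heads
        return self.hidden_size // self.num_heads

    @property
    def queries_per_kv_head(self) -> int:
        # How many query heads share each key-value head
        return self.num_heads // self.num_kv_heads

    def kv_cache_bytes_per_token(self, kv_heads: Optional[int] = None) -> int:
        # Keys and values (factor 2) cached in every layer for each KV head
        kv = self.num_kv_heads if kv_heads is None else kv_heads
        return 2 * self.num_layers * kv * self.head_dim * self.bytes_per_value


shape = GQAShape()
gqa_bytes = shape.kv_cache_bytes_per_token()
# Same model if every query head had its own KV head (plain multi-head attention)
mha_bytes = shape.kv_cache_bytes_per_token(kv_heads=shape.num_heads)
full_context_bytes = gqa_bytes * shape.context_length

print(f"head_dim: {shape.head_dim}")
print(f"query heads per KV head: {shape.queries_per_kv_head}")
print(f"KV cache per token: {gqa_bytes} B (GQA) vs {mha_bytes} B (MHA)")
print(f"KV cache at full context: {full_context_bytes / 2**20:.0f} MiB")
```

Under these assumptions, sharing each key-value head across 4 query heads cuts the KV cache to a quarter of what full multi-head attention would need.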

Links

Here are some fine-tunes we did, but there are many more possibilities out there!

- instruct
  - openhermes - link
  - open-instruct - link
- code
  - python (pypi) - link
- zephyr DPO tune
  - SFT - link
  - full DPO - link

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	29.44
AI2 Reasoning Challenge (25-Shot)	24.83
HellaSwag (10-Shot)	29.76
MMLU (5-Shot)	25.85
TruthfulQA (0-shot)	44.55
Winogrande (5-shot)	50.99
GSM8k (5-shot)	0.68
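As a quick sanity check, the reported average appears to be the unweighted mean of the six benchmark scores in the table above. A small sketch, assuming equal weighting (the card does not say how the average is computed):

```python
# Scores copied from the table above
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 24.83,
    "HellaSwag (10-Shot)": 29.76,
    "MMLU (5-Shot)": 25.85,
    "TruthfulQA (0-shot)": 44.55,
    "Winogrande (5-shot)": 50.99,
    "GSM8k (5-shot)": 0.68,
}

# Assumption: the leaderboard average is a plain unweighted mean
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```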

Open LLM Leaderboard (v2) Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	6.62
IFEval (0-Shot)	23.86
BBH (3-Shot)	3.04
MATH Lvl 5 (4-Shot)	0.00
GPQA (0-shot)	0.78
MuSR (0-shot)	9.07
MMLU-PRO (5-shot)	1.66

Safetensors: model size 0.2B params, tensor type BF16

Collection including BEE-spoke-data/smol_llama-220M-GQA

🚧"raw" pretrained smol_llama checkpoints - WIP 🚧 • 4 items • Updated Apr 29, 2024 • 6

Evaluation results

Source: Open LLM Leaderboard

Task	Metric	Split	Value
AI2 Reasoning Challenge (25-Shot)	normalized accuracy	test	24.830
HellaSwag (10-Shot)	normalized accuracy	validation	29.760
MMLU (5-Shot)	accuracy	test	25.850
TruthfulQA (0-shot)	mc2	validation	44.550
Winogrande (5-shot)	accuracy	validation	50.990
GSM8k (5-shot)	accuracy	test	0.680
IFEval (0-Shot)	strict accuracy	-	23.860
BBH (3-Shot)	normalized accuracy	-	3.040