URL: https://huggingface.co/mlabonne/Daredevil-8B


Daredevil-8B

๐Ÿ‘ image/jpeg

Daredevil-8B is a mega-merge designed to maximize MMLU. As of 27 May 2024, it is the Llama 3 8B model with the highest MMLU score. In my experience, a high MMLU score is all you need with Llama 3 models.

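It is a merge of the following models using LazyMergekit:

- nbeerbower/llama-3-stella-8B
- Hastagaras/llama-3-8b-okay
- nbeerbower/llama-3-gutenberg-8B
- openchat/openchat-3.6-8b-20240522
- Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
- cstr/llama3-8b-spaetzle-v20
- mlabonne/ChimeraLlama-3-8B-v3
- flammenai/Mahou-1.1-llama3-8B
- KingNish/KingNish-Llama3-8b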

Thanks to nbeerbower, Hastagaras, openchat, Kukedlc, cstr, flammenai, and KingNish for their merges. Special thanks to Charles Goddard and Arcee.ai for MergeKit.

🔎 Applications

You can use it as an improved version of meta-llama/Meta-Llama-3-8B-Instruct.

This is a censored model. For an uncensored version, see mlabonne/Daredevil-8B-abliterated.

Tested on LM Studio using the "Llama 3" preset.

⚡ Quantization

๐Ÿ† Evaluation

Open LLM Leaderboard

Daredevil-8B is the best-performing 8B model on the Open LLM Leaderboard in terms of MMLU score (as of 27 May 2024).

๐Ÿ‘ image/png

Nous

Daredevil-8B is the best-performing 8B model on Nous' benchmark suite (evaluation performed using LLM AutoEval on 27 May 2024). See the entire leaderboard here.

Model                                            Average  AGIEval  GPT4All  TruthfulQA  Bigbench
mlabonne/Daredevil-8B                            55.87    44.13    73.52    59.05       46.77
mlabonne/Daredevil-8B-abliterated                55.06    43.29    73.33    57.47       46.17
mlabonne/Llama-3-8B-Instruct-abliterated-dpomix  52.26    41.6     69.95    54.22       43.26
meta-llama/Meta-Llama-3-8B-Instruct              51.34    41.22    69.86    51.65       42.64
failspy/Meta-Llama-3-8B-Instruct-abliterated-v3  51.21    40.23    69.5     52.44       42.69
mlabonne/OrpoLlama-3-8B                          48.63    34.17    70.59    52.39       37.36
meta-llama/Meta-Llama-3-8B                       45.42    31.1     69.95    43.91       36.7
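The Average column appears to be the unweighted mean of the four suite scores. The sketch below recomputes it from the rows of the table above. The 0.01 tolerance is an assumption to absorb rounding in the published figures, and `mean_score` is a helper name introduced here for illustration.

```python
# Scores copied from the table above: reported average, then
# (AGIEval, GPT4All, TruthfulQA, Bigbench).
rows = {
    "mlabonne/Daredevil-8B": (55.87, (44.13, 73.52, 59.05, 46.77)),
    "mlabonne/Daredevil-8B-abliterated": (55.06, (43.29, 73.33, 57.47, 46.17)),
    "mlabonne/Llama-3-8B-Instruct-abliterated-dpomix": (52.26, (41.6, 69.95, 54.22, 43.26)),
    "meta-llama/Meta-Llama-3-8B-Instruct": (51.34, (41.22, 69.86, 51.65, 42.64)),
    "failspy/Meta-Llama-3-8B-Instruct-abliterated-v3": (51.21, (40.23, 69.5, 52.44, 42.69)),
    "mlabonne/OrpoLlama-3-8B": (48.63, (34.17, 70.59, 52.39, 37.36)),
    "meta-llama/Meta-Llama-3-8B": (45.42, (31.1, 69.95, 43.91, 36.7)),
}


def mean_score(scores):
    """Unweighted mean of the benchmark suite scores."""
    return sum(scores) / len(scores)


for name, (reported, scores) in rows.items():
    computed = mean_score(scores)
    # Assumed tolerance: the published values are rounded to two decimals.
    status = "ok" if abs(computed - reported) <= 0.01 else "mismatch"
    print(f"{name}: reported={reported:.2f} computed={computed:.4f} {status}")
```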

🌳 Model family tree

[image/png: model family tree]

🧩 Configuration

models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: nbeerbower/llama-3-stella-8B
    parameters:
      density: 0.6
      weight: 0.16
  - model: Hastagaras/llama-3-8b-okay
    parameters:
      density: 0.56
      weight: 0.1
  - model: nbeerbower/llama-3-gutenberg-8B
    parameters:
      density: 0.6
      weight: 0.18
  - model: openchat/openchat-3.6-8b-20240522
    parameters:
      density: 0.56
      weight: 0.12
  - model: Kukedlc/NeuralLLaMa-3-8b-DT-v0.1
    parameters:
      density: 0.58
      weight: 0.18
  - model: cstr/llama3-8b-spaetzle-v20
    parameters:
      density: 0.56
      weight: 0.08
  - model: mlabonne/ChimeraLlama-3-8B-v3
    parameters:
      density: 0.56
      weight: 0.08
  - model: flammenai/Mahou-1.1-llama3-8B
    parameters:
      density: 0.55
      weight: 0.05
  - model: KingNish/KingNish-Llama3-8b
    parameters:
      density: 0.55
      weight: 0.05
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
dtype: bfloat16
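The configuration uses `merge_method: dare_ties` with a `density` and a `weight` per model. As a rough intuition for what those two parameters control, here is a toy sketch on plain Python lists. Each model's difference from the base is randomly pruned to roughly `density` of its entries and rescaled (the DARE idea), scaled by `weight`, and then only contributions that agree with the majority sign are added back to the base (the TIES idea). This is a simplification under stated assumptions, not MergeKit's implementation: `dare_prune`, `dare_ties_merge`, and the toy vectors are hypothetical names and data introduced for illustration, and no normalization is applied.

```python
import random

# density and weight values copied from the configuration above (base model excluded).
DENSITIES = [0.6, 0.56, 0.6, 0.56, 0.58, 0.56, 0.56, 0.55, 0.55]
WEIGHTS = [0.16, 0.1, 0.18, 0.12, 0.18, 0.08, 0.08, 0.05, 0.05]


def dare_prune(delta, density, rng):
    """DARE-style step: keep each entry with probability `density`,
    then rescale the survivors by 1/density so the expected value is unchanged."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, densities, weights, seed=0):
    """Toy DARE + TIES-style merge of several fine-tuned vectors into `base`."""
    rng = random.Random(seed)
    weighted_deltas = []
    for model, density, weight in zip(finetuned, densities, weights):
        delta = [m - b for m, b in zip(model, base)]  # task vector
        pruned = dare_prune(delta, density, rng)
        weighted_deltas.append([weight * d for d in pruned])

    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in weighted_deltas]
        # TIES-style sign election: the sign of the summed contributions wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Only contributions that agree with the elected sign are added.
        merged.append(b + sum(v for v in column if v * sign > 0))
    return merged


# Hypothetical toy data: a 6-entry "base" tensor and nine noisy "fine-tunes".
data_rng = random.Random(42)
base = [0.0] * 6
finetuned = [[b + data_rng.uniform(-1, 1) for b in base] for _ in WEIGHTS]
print(dare_ties_merge(base, finetuned, DENSITIES, WEIGHTS))
```

With `density` between 0.55 and 0.6, roughly 40 to 45 percent of each model's delta entries are dropped before merging. The nine `weight` values in the configuration sum to 1.0.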

💻 Usage

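# Notebook cell; in a shell, run the same command without the leading "!".
!pip install -qU transformers accelerate

import torch
import transformers
from transformers import AutoTokenizer

model = "mlabonne/Daredevil-8B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in bfloat16 and let accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Sample a completion; by default generated_text contains the prompt followed by the reply.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])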
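To make the `apply_chat_template` step above less opaque, the sketch below builds the prompt string by hand in the Llama 3 Instruct format. It assumes this model's tokenizer ships the standard Llama 3 chat template, which matches the "Llama 3" preset mentioned earlier but is not confirmed here. `build_llama3_prompt` is a hypothetical helper, and the tokenizer's own output remains authoritative.

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages in the Llama 3 Instruct prompt format (assumed template)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn: a role header, a blank line, the content, then an end-of-turn token.
        prompt += f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
        prompt += f"{message['content'].strip()}<|eot_id|>"
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [{"role": "user", "content": "What is a large language model?"}]
print(build_llama3_prompt(messages))
```

Comparing this string with the `prompt` returned by the tokenizer is a quick way to check which template a given checkpoint actually uses.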