URL: https://huggingface.co/dfurman/Llama-3-8B-Orpo-v0.1

dfurman/Llama-3-8B-Orpo-v0.1

This is an ORPO fine-tune of meta-llama/Meta-Llama-3-8B on 4k samples of mlabonne/orpo-dpo-mix-40k.

It's a successful fine-tune that follows the ChatML template!

🔎 Application

This model uses a context window of 8k tokens. It was trained with the ChatML template.
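
For readers unfamiliar with ChatML, the sketch below shows roughly what a ChatML-formatted prompt looks like. It is a hand-written illustration, not this model's actual template: `render_chatml` is a hypothetical helper, and the template shipped with the tokenizer (applied through `tokenizer.apply_chat_template`) remains authoritative and may differ in details such as a leading BOS token.

```python
from typing import Dict, List

# ChatML wraps every turn in these two special markers.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration only; prefer the tokenizer's own
    chat template when running the model.
    """
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # An open assistant turn cues the model to write its reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a recipe for a spicy margarita."},
    ]
    print(render_chatml(demo))
```

Each turn is delimited explicitly, so the model can tell where the system, user, and assistant messages begin and end.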

🏆 Evaluation

Open LLM Leaderboard

| Model ID | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
|---|---|---|---|---|---|---|---|
| meta-llama/Meta-Llama-3-8B-Instruct | 66.87 | 60.75 | 78.55 | 67.07 | 51.65 | 74.51 | 68.69 |
| dfurman/Llama-3-8B-Orpo-v0.1 | 64.67 | 60.67 | 82.56 | 66.59 | 50.47 | 79.01 | 48.75 |
| meta-llama/Meta-Llama-3-8B | 62.35 | 59.22 | 82.02 | 66.49 | 43.95 | 77.11 | 45.34 |
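
The Average column appears to be the unweighted mean of the six benchmark scores in each row. The short check below recomputes it from the numbers in the table. The dictionary names are illustrative, and the reading of "Average" as a plain mean is an assumption that the arithmetic supports to within rounding.

```python
# Scores copied from the table, in column order:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K.
scores = {
    "meta-llama/Meta-Llama-3-8B-Instruct": [60.75, 78.55, 67.07, 51.65, 74.51, 68.69],
    "dfurman/Llama-3-8B-Orpo-v0.1": [60.67, 82.56, 66.59, 50.47, 79.01, 48.75],
    "meta-llama/Meta-Llama-3-8B": [59.22, 82.02, 66.49, 43.95, 77.11, 45.34],
}

# Average column as reported in the table.
reported = {
    "meta-llama/Meta-Llama-3-8B-Instruct": 66.87,
    "dfurman/Llama-3-8B-Orpo-v0.1": 64.67,
    "meta-llama/Meta-Llama-3-8B": 62.35,
}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


for model, vals in scores.items():
    print(f"{model}: computed {mean(vals):.3f}, reported {reported[model]:.2f}")
```

The computed means agree with the reported averages to within 0.01, so any remaining difference comes down to how the last digit was rounded.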

📈 Training curves

You can find the experiment on W&B at this address.

💻 Usage

Run

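# `tokenizer` and `pipeline` must already exist: a Hugging Face tokenizer and a
# text-generation pipeline loaded for dfurman/Llama-3-8B-Orpo-v0.1.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a recipe for a spicy margarita."},
]
# Render the conversation with the tokenizer's chat template (ChatML for this model).
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print("***Prompt:\n", prompt)

# Sample up to 1000 new tokens, then drop the echoed prompt from the output.
outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:\n", outputs[0]["generated_text"][len(prompt):])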
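
The snippet above relies on a `tokenizer` and a `pipeline` object that this page does not define. One way to create them is sketched here, assuming the `transformers`, `torch`, and `accelerate` packages are installed and enough memory is available for an 8B model in half precision. The function name, the float16 dtype, and `device_map="auto"` are choices made for this sketch, not settings confirmed by the model card, and the argument names follow the transformers 4.x API.

```python
MODEL_ID = "dfurman/Llama-3-8B-Orpo-v0.1"


def load_tokenizer_and_pipeline(model_id: str = MODEL_ID):
    """Return (tokenizer, pipeline) for text generation.

    Heavy imports live inside the function so that merely defining it
    needs neither the libraries nor the model weights.
    """
    import torch
    import transformers
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        tokenizer=tokenizer,
        torch_dtype=torch.float16,  # assumption: half-precision weights
        device_map="auto",          # assumption: needs the accelerate package
    )
    return tokenizer, pipeline


# Usage (downloads the weights on first call):
# tokenizer, pipeline = load_tokenizer_and_pipeline()
```

Keeping the load step in a function makes it easy to swap in a different dtype or a quantized configuration without touching the generation code.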

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_dfurman__Llama-3-8B-Orpo-v0.1)

| Metric | Value |
|---|---|
| Avg. | 11.01 |
| IFEval (0-Shot) | 30.00 |
| BBH (3-Shot) | 13.77 |
| MATH Lvl 5 (4-Shot) | 3.78 |
| GPQA (0-shot) | 1.57 |
| MuSR (0-shot) | 2.73 |
| MMLU-PRO (5-shot) | 14.23 |

Downloads last month: 7,197
Model size: 8B params (Safetensors)
Tensor type: F16
