dfurman/Llama-3-8B-Orpo-v0.1

This is an ORPO fine-tune of meta-llama/Meta-Llama-3-8B on 4k samples of mlabonne/orpo-dpo-mix-40k.

The fine-tune follows the ChatML template and raises the Open LLM Leaderboard average over the base model from 62.35 to 64.67.

🔎 Application

This model uses a context window of 8k tokens. It was trained with the ChatML template.
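To make the ChatML layout concrete, here is a minimal pure-Python sketch of how a conversation is typically rendered in that format. The helper name `format_chatml` is hypothetical and only approximates the general ChatML convention; the chat template shipped with the model's tokenizer remains the authoritative source and may differ in details such as special tokens.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render chat messages in the general ChatML layout (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a recipe for a spicy margarita."},
]
prompt = format_chatml(messages)
print(prompt)
```

In practice, `tokenizer.apply_chat_template` should be preferred over a hand-written formatter, since it applies the exact template stored with the model.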

🏆 Evaluation

Open LLM Leaderboard

| Model ID | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| meta-llama/Meta-Llama-3-8B-Instruct | 66.87 | 60.75 | 78.55 | 67.07 | 51.65 | 74.51 | 68.69 |
| dfurman/Llama-3-8B-Orpo-v0.1 | 64.67 | 60.67 | 82.56 | 66.59 | 50.47 | 79.01 | 48.75 |
| meta-llama/Meta-Llama-3-8B | 62.35 | 59.22 | 82.02 | 66.49 | 43.95 | 77.11 | 45.34 |
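As a quick sanity check on the table, the sketch below recomputes each model's average from the six benchmark scores and the per-benchmark change of this fine-tune relative to the base model. It assumes the Average column is the plain mean of the six scores; because the table values are already rounded to two decimals, recomputed averages can differ from the listed ones in the last digit.

```python
BENCHMARKS = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]

# Scores copied from the Open LLM Leaderboard table above
SCORES = {
    "meta-llama/Meta-Llama-3-8B-Instruct": [60.75, 78.55, 67.07, 51.65, 74.51, 68.69],
    "dfurman/Llama-3-8B-Orpo-v0.1": [60.67, 82.56, 66.59, 50.47, 79.01, 48.75],
    "meta-llama/Meta-Llama-3-8B": [59.22, 82.02, 66.49, 43.95, 77.11, 45.34],
}

# Plain arithmetic mean over the six benchmarks (assumed averaging rule)
averages = {model: sum(scores) / len(scores) for model, scores in SCORES.items()}

# Per-benchmark gain of the ORPO fine-tune over the base model
base = SCORES["meta-llama/Meta-Llama-3-8B"]
orpo = SCORES["dfurman/Llama-3-8B-Orpo-v0.1"]
deltas = {name: round(o - b, 2) for name, o, b in zip(BENCHMARKS, orpo, base)}

for model, avg in averages.items():
    print(f"{model}: {avg:.3f}")
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

On these numbers the fine-tune is at or above the base model on every benchmark, with the largest gain on TruthfulQA, while GSM8K remains well below the Instruct model.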

📈 Training curves

You can find the experiment on W&B at this address.

💻 Usage

Run

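import torch
import transformers
from transformers import AutoTokenizer

model_id = "dfurman/Llama-3-8B-Orpo-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipeline = transformers.pipeline("text-generation", model=model_id, tokenizer=tokenizer, torch_dtype=torch.float16, device_map="auto")  # assumed setup: fp16 weights; device_map="auto" requires accelerate

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a recipe for a spicy margarita."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print("***Prompt:\n", prompt)

outputs = pipeline(prompt, max_new_tokens=1000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print("***Generation:\n", outputs[0]["generated_text"][len(prompt):])  # drop the echoed prompt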

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_dfurman__Llama-3-8B-Orpo-v0.1).

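| Benchmark | Metric | Value |
| --- | --- | --- |
| Avg. | | 11.01 |
| IFEval (0-Shot) | strict accuracy | 30.00 |
| BBH (3-Shot) | normalized accuracy | 13.77 |
| MATH Lvl 5 (4-Shot) | exact match | 3.78 |
| GPQA (0-shot) | acc_norm | 1.57 |
| MuSR (0-shot) | acc_norm | 2.73 |
| MMLU-PRO (5-shot) | accuracy | 14.23 |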

Safetensors: model size 8B params, tensor type F16

URL: https://huggingface.co/dfurman/Llama-3-8B-Orpo-v0.1