DeepSeek-R1-Distill-Llama-3B

This model is a distilled version of DeepSeek-R1: Llama-3.2-3B-Instruct fine-tuned on the R1-Distill-SFT dataset.

Prompt Template

Use the Llama 3 prompt template with this model:

Llama3

<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>

<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
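
The layout above can be reproduced by hand when a tokenizer is not available. The following is a minimal sketch, not the tokenizer's own implementation: `build_llama3_prompt` is a hypothetical helper, and the exact whitespace and any leading beginning-of-text token are assumptions taken from the template as printed here. In practice `tokenizer.apply_chat_template` remains the authoritative way to render a prompt.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Render one system + user turn in the layout shown above (hypothetical helper)."""

    def turn(role: str, content: str) -> str:
        # One turn: role header, content, end-of-turn token, blank line.
        return f"<|start_header_id|>{role}<|end_header_id|>\n{content}<|eot_id|>\n\n"

    # Finish with an open assistant header so the model writes the reply.
    return (
        turn("system", system)
        + turn("user", user)
        + "<|start_header_id|>assistant<|end_header_id|>\n"
    )


prompt = build_llama3_prompt(
    "You are a helpful assistant.",
    "Which one is larger? 9.11 or 9.9?",
)
print(prompt)
```

The assistant turn is left open on purpose: generation continues from the header, and the model is expected to close its own turn with `<|eot_id|>`.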

Example usage:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" lets accelerate place the weights on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

SYSTEM_PROMPT = """Respond in the following format:
<think>
You should reason between these tags.
</think>

Answer goes here...

Always use <think> </think> tags even if they are not necessary.
"""

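messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
]

# Render the chat template and tokenize; add_generation_prompt appends the assistant header.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)  # follow wherever device_map="auto" placed the model

# temperature only applies when sampling is enabled, so set do_sample explicitly.
output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,
    temperature=0.7,
)

# Keep special tokens so the full prompt structure is visible in the printout.
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)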

Output:

<think>
First, I need to compare the two numbers 9.11 and 9.9. 

Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9. 

Since 9 is greater than 1, 9.9 is larger than 9.11.
</think>

To determine which number is larger, let's compare the two numbers:

**9.11** and **9.9**

1. **Identify the Decimal Places:**
 - Both numbers have two decimal places.
 
2. **Compare the Tens Place (Right of the Decimal Point):**
 - **9.11:** The tens place is 1.
 - **9.9:** The tens place is 9.
 
3. **Conclusion:**
 - Since 9 is greater than 1, the number with the larger tens place is 9.9.
 
**Answer:** **9.9** is larger than **9.11**.
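
Because the reply puts its reasoning inside `<think>` tags, downstream code usually wants the reasoning and the final answer separately. The sketch below shows one way to split them; `split_reasoning` is a hypothetical helper, and it assumes the text passed in is only the model's reply. The system prompt contains the same tags, so a decoded sequence that still includes the echoed prompt would match there first.

```python
import re


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" when no complete <think> block exists."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block: treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = (
    "<think>\n"
    "Since 9 is greater than 1, 9.9 is larger than 9.11.\n"
    "</think>\n\n"
    "**Answer:** **9.9** is larger than **9.11**."
)
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

To get only the reply from the example above, one option is to decode just the newly generated tokens, for instance `tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)`.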

Suggested system prompt:

Respond in the following format:
<think>
You should reason between these tags.
</think>

Answer goes here...

Always use <think> </think> tags even if they are not necessary.

Parameters

lr: 2e-5
epochs: 1
batch_size: 16
optimizer: paged_adamw_8bit
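
The card does not say which training framework produced these values. As a rough sketch, assuming a Hugging Face `Trainer`-style setup, they would map onto `transformers.TrainingArguments` keywords as follows. The mapping itself is an assumption, in particular reading `batch_size` as a per-device batch size.

```python
# Assumed mapping from the card's parameter names to TrainingArguments keywords.
training_kwargs = {
    "learning_rate": 2e-5,              # lr
    "num_train_epochs": 1,              # epochs
    "per_device_train_batch_size": 16,  # batch_size (assumed to be per device)
    "optim": "paged_adamw_8bit",        # optimizer; this option relies on bitsandbytes
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

With `transformers` installed, the dictionary could be unpacked as `TrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir="out"` is a placeholder path.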

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Benchmark	Metric	Value
Avg.	average	23.27
IFEval (0-Shot)	strict accuracy	70.93
BBH (3-Shot)	normalized accuracy	21.45
MATH Lvl 5 (4-Shot)	exact match	20.92
GPQA (0-shot)	acc_norm	1.45
MuSR (0-shot)	acc_norm	2.91
MMLU-PRO (5-shot)	accuracy	21.98
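
The reported average is consistent with a plain arithmetic mean of the six benchmark scores. A short check, using only the values listed in the table:

```python
scores = {
    "IFEval (0-Shot)": 70.93,
    "BBH (3-Shot)": 21.45,
    "MATH Lvl 5 (4-Shot)": 20.92,
    "GPQA (0-shot)": 1.45,
    "MuSR (0-shot)": 2.91,
    "MMLU-PRO (5-shot)": 21.98,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 23.27
```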

Model size: 3B params
Tensor type: F16 (Safetensors)
Base model: meta-llama/Llama-3.2-3B-Instruct

URL: https://huggingface.co/suayptalha/DeepSeek-R1-Distill-Llama-3B
