VOOZH
about
URL: https://huggingface.co/Swekerr/llama1b_dapo_math_grpo-v1
⇱ Swekerr/llama1b_dapo_math_grpo-v1 · Hugging Face
Uploaded finetuned model
Developed by:
Swekerr
License:
apache-2.0
Finetuned from model :
unsloth/llama-3.2-1b-instruct
This llama model was trained 2x faster with
Unsloth
and Huggingface's TRL library.
👁 Image
Downloads last month
5
Safetensors
Model size
1B params
Tensor type
BF16
·
Datasets used to train
Swekerr/llama1b_dapo_math_grpo-v1