VOOZH about

URL: https://huggingface.co/Swekerr/llama1b_dapo_math_grpo-v1.1

⇱ Swekerr/llama1b_dapo_math_grpo-v1.1 · Hugging Face


Uploaded finetuned model

  • Developed by: Swekerr
  • License: apache-2.0
  • Finetuned from model : unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

👁 Image

Downloads last month
3
Safetensors
Model size
1B params
Tensor type
BF16
·

Datasets used to train Swekerr/llama1b_dapo_math_grpo-v1.1