URL: https://huggingface.co/allura-forge/micro-glitter

Built with Axolotl


micro-glitter

This model is a fine-tuned version of unsloth/gemma-3-270m-it, trained on the following datasets:

  • allura-org/EU01-S2
  • allenai/tulu-3-sft-personas-instruction-following
  • ToastyPigeon/mixed-medical-reasoning-formatted
  • ToastyPigeon/steve-and-marvin
  • ToastyPigeon/kimi-stories-instruct
  • ToastyPigeon/new-story-dataset
  • allura-org/fujin-instruct-v2
  • ToastyPigeon/gutenberg-sft
  • ToastyPigeon/SpringDragon
  • ToastyPigeon/some-erotica

It achieves the following results on the evaluation set:

  • Loss: 3.7387
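
As a rough aid to reading this number, the loss can be converted to a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the card. The variable names are illustrative only.

```python
import math

# Evaluation loss reported in the card. Assumed (not stated in the card)
# to be the mean per-token cross-entropy measured in nats.
eval_loss = 3.7387

# Under that assumption, perplexity is the exponential of the loss.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the evaluation loss corresponds to a perplexity of roughly 42.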

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 69
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • total_eval_batch_size: 8
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 8
  • training_steps: 296
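
The batch-size totals and the learning-rate schedule in the list above can be reproduced with a little arithmetic. The sketch below assumes the training total is per-device batch size × number of devices × gradient accumulation steps, that the evaluation total is per-device batch size × number of devices, and that the cosine schedule uses a linear warmup followed by a half-cosine decay to zero. That is the common shape for a cosine schedule with warmup, but the card does not confirm the exact formula. `lr_at` is a hypothetical helper written for this illustration, not a library function.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 1e-05
train_batch_size = 4
eval_batch_size = 4
num_devices = 2
gradient_accumulation_steps = 8
warmup_steps = 8
training_steps = 296

# Effective batch sizes (assumed formula, consistent with the listed totals).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def lr_at(step: int) -> float:
    """Hypothetical helper: learning rate at a given optimizer step.

    Assumes linear warmup from 0 to learning_rate over warmup_steps,
    then a half-cosine decay to 0 at training_steps.
    """
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
for step in (0, 4, 8, 152, 296):
    print(step, lr_at(step))
```

With these assumptions the computed totals match the listed values of 64 and 8, the learning rate peaks at 1e-05 at step 8, and it decays to zero by step 296.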

Training results

Training Loss   Epoch    Step   Validation Loss
No log          0        0      3.8582
3.4802          0.1008   15     3.5118
3.4608          0.2017   30     3.4890
3.5272          0.3025   45     3.5189
3.559           0.4034   60     3.5753
3.5817          0.5042   75     3.6121
3.6349          0.6050   90     3.6471
3.68            0.7059   105    3.6721
3.6597          0.8067   120    3.6970
3.6462          0.9076   135    3.7068
3.7009          1.0067   150    3.7213
3.6717          1.1076   165    3.7313
3.7631          1.2084   180    3.7338
3.7535          1.3092   195    3.7346
3.668           1.4101   210    3.7375
3.679           1.5109   225    3.7383
3.6539          1.6118   240    3.7386
3.6547          1.7126   255    3.7386
3.7533          1.8134   270    3.7400
3.6983          1.9143   285    3.7387
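
The validation loss in the table is lowest early in training and rises afterward. A minimal sketch for locating that minimum, using the (step, validation loss) pairs copied from the table; the variable names are illustrative.

```python
# (step, validation loss) pairs copied from the training results table.
validation_log = [
    (0, 3.8582), (15, 3.5118), (30, 3.4890), (45, 3.5189),
    (60, 3.5753), (75, 3.6121), (90, 3.6471), (105, 3.6721),
    (120, 3.6970), (135, 3.7068), (150, 3.7213), (165, 3.7313),
    (180, 3.7338), (195, 3.7346), (210, 3.7375), (225, 3.7383),
    (240, 3.7386), (255, 3.7386), (270, 3.7400), (285, 3.7387),
]

# Pick the evaluation point with the smallest validation loss.
best_step, best_loss = min(validation_log, key=lambda pair: pair[1])

# The last logged evaluation, which matches the loss reported for the model.
final_step, final_loss = validation_log[-1]

print(f"lowest validation loss: {best_loss:.4f} at step {best_step}")
print(f"final validation loss:  {final_loss:.4f} at step {final_step}")
```

On the logged values, the lowest validation loss is at step 30 (3.4890), while the final logged evaluation at step 285 gives the 3.7387 reported for the model.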

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1

Downloads last month: 2
Model size: 0.3B params (Safetensors)
Tensor type: BF16