workspace/aibox-standalone-pool/axolotl/glitterms32-v2-ckpts

This model is a fine-tuned version of Gryphe/Codex-24B-Small-3.2 on the following datasets:

ToastyPigeon/cowriter-instruct
allura-org/EU01-S2
allenai/tulu-3-sft-personas-instruction-following
ToastyPigeon/mixed-medical-reasoning-formatted
ToastyPigeon/steve-and-marvin
ToastyPigeon/new-story-dataset
allura-org/fujin-instruct-v2
ToastyPigeon/some-rp-extended
ToastyPigeon/gutenberg-sft
ToastyPigeon/SpringDragon
ToastyPigeon/some-erotica

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 1
eval_batch_size: 1
seed: 69
distributed_type: multi-GPU
num_devices: 2
gradient_accumulation_steps: 8
total_train_batch_size: 16
total_eval_batch_size: 2
optimizer: OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
training_steps: 10
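
The relationship between the batch-size values listed above can be checked with a little arithmetic. The sketch below reproduces total_train_batch_size from the per-device batch size, device count, and gradient accumulation steps, and shows roughly what a cosine schedule does to the learning rate over the 10 training steps. The function names are illustrative, not part of any library, and the schedule assumes no warmup and a decay to zero, neither of which is stated in this card.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH_SIZE = 1
NUM_DEVICES = 2
GRADIENT_ACCUMULATION_STEPS = 8
TRAINING_STEPS = 10
PEAK_LEARNING_RATE = 1e-05


def effective_batch_size(per_device: int, devices: int, accumulation: int) -> int:
    """Number of batch rows that contribute to a single optimizer step."""
    return per_device * devices * accumulation


def cosine_lr(step: int, total_steps: int, peak_lr: float) -> float:
    """Half-cosine decay from peak_lr to 0.

    Assumption: no warmup steps and a final learning rate of zero;
    the card does not state either value.
    """
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


total_train_batch_size = effective_batch_size(
    PER_DEVICE_TRAIN_BATCH_SIZE, NUM_DEVICES, GRADIENT_ACCUMULATION_STEPS
)
# Batch rows processed over the whole run (rows may be packed sequences).
rows_seen = total_train_batch_size * TRAINING_STEPS

print(f"total_train_batch_size = {total_train_batch_size}")
print(f"batch rows seen in {TRAINING_STEPS} steps = {rows_seen}")
for step in range(TRAINING_STEPS + 1):
    print(f"step {step:2d}: lr = {cosine_lr(step, TRAINING_STEPS, PEAK_LEARNING_RATE):.3e}")
```

With the listed values the product is 1 x 2 x 8 = 16, matching total_train_batch_size, so a 10-step run covers 160 batch rows in total.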

Training results

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.7.0+cu128
Datasets 3.5.1
Tokenizers 0.21.1

Model tree for ToastyPigeon/sparkly-3.2-train

Base model: mistralai/Mistral-Small-3.1-24B-Base-2503
Finetuned: mistralai/Mistral-Small-3.2-24B-Instruct-2506
Finetuned: Gryphe/Codex-24B-Small-3.2
Adapter (2): this model

URL: https://huggingface.co/ToastyPigeon/sparkly-3.2-train
