This is the fully trained version (with fixed formatting!!).

Dataset used: Gryphe/Sonnet3.5-SlimOrcaDedupCleaned which was further filtered to remove prompts/examples that are longer than 4076 tokens (removed about 385 examples).

Prompt format is: ChatML

Trained with regular LoRA (not quantized/QLoRA) and LoRA rank was 128 and Alpha set to 32. Trained for 1 epoch using A40 for about 23 hours.

Uploaded model

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for mpasila/Viking-SlimSonnet-v1-7B

Base model

Finetuned

(29)

this model

Merges

Quantizations