
URL: https://huggingface.co/Mozilla/distilvit



distilvit

This model is a work in progress. It is a fine-tuned version of the following base models:

This model was trained on:

You can find the code used to create the model here: https://github.com/mozilla/distilvit

training results

{
 "train/loss": 0.0781,
 "train/learning_rate": 0.00003793103448275862,
 "train/epoch": 2.41,
 "train/global_step": 700,
 "eval/loss": 0.09741172194480896,
 "eval/rouge1": 60.382,
 "eval/rouge2": 38.0754,
 "eval/rougeL": 56.9132,
 "eval/rougeLsum": 56.9214,
 "eval/meteor": 0.5448683804505693,
 "eval/gen_len": 9.864678265672467,
 "eval/runtime": 343.0443,
 "eval/samples_per_second": 10.555,
 "eval/steps_per_second": 0.108,
 "train/train_runtime": 10567.9413,
 "train/train_samples_per_second": 27.414,
 "train/train_steps_per_second": 0.274,
 "train/total_flos": 9039628706135409000,
 "train/train_loss": 0.09852950266429356
}
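The throughput figures above can be combined to back out a few quantities the log does not state directly. The following is a minimal sketch, not part of the original training code: it assumes the metrics follow the usual convention where `samples_per_second` and `steps_per_second` are totals divided by runtime, and the helper name `derived_stats` is hypothetical. Because the logged rates are rounded, the results are estimates only.

```python
import json

# Subset of the metrics reported in the training results above.
RESULTS = json.loads("""
{
  "train/epoch": 2.41,
  "train/global_step": 700,
  "eval/runtime": 343.0443,
  "eval/samples_per_second": 10.555,
  "eval/steps_per_second": 0.108
}
""")


def derived_stats(r):
    """Estimate quantities implied by the logged metrics (hypothetical helper)."""
    # Optimizer steps completed per epoch at the time of this log entry.
    steps_per_epoch = r["train/global_step"] / r["train/epoch"]
    # Assumption: rate = total / runtime, so total = runtime * rate.
    eval_samples = r["eval/runtime"] * r["eval/samples_per_second"]
    eval_batches = r["eval/runtime"] * r["eval/steps_per_second"]
    return {
        "steps_per_epoch": round(steps_per_epoch),
        "eval_samples": round(eval_samples),
        "eval_batches": round(eval_batches),
    }


if __name__ == "__main__":
    for name, value in derived_stats(RESULTS).items():
        print(f"{name}: {value}")
```

Under these assumptions, the log corresponds to roughly 290 optimizer steps per epoch and an evaluation set of about 3,600 samples processed in about 37 batches.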
Model size: 0.2B params (Safetensors)
Tensor type: F32


Dataset used to train Mozilla/distilvit