A newer version of this model is available: qikp/kite-4.2-14m
Kite
🎉 You are looking at Kite 4, which is more efficient than earlier versions and uses a different dataset and the pika 4 tokenizer!
Kite is a small language model with about 8.69 million parameters.
Training
It was trained for 1 epoch on a tokenized version of qikp/small-data, a mixture of various datasets, with a batch size of 32, a learning rate of 1.5e-4, and the pika 4 tokenizer.
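As a rough illustration, the hyperparameters above can be collected into a small config object. This is only a sketch: the `TrainConfig` class, the `optimizer_steps` helper, and the example count of 100,000 are hypothetical and not taken from the actual training script. It also assumes one optimizer step per batch (no gradient accumulation) and that a final partial batch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values taken from the training description in this model card.
    dataset: str = "qikp/small-data"
    epochs: int = 1
    batch_size: int = 32
    learning_rate: float = 1.5e-4


def optimizer_steps(num_examples: int, cfg: TrainConfig) -> int:
    """Steps for a full run, assuming one step per batch and a kept partial batch."""
    return cfg.epochs * math.ceil(num_examples / cfg.batch_size)


cfg = TrainConfig()
# 100_000 is a made-up example count, not the real size of the dataset.
print(optimizer_steps(100_000, cfg))  # 3125
```

With a single epoch, the number of optimizer steps is simply the number of batches, so the step count scales directly with the size of the tokenized dataset.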
Limitations
Due to its size, the model is not suitable for production workloads.
Safetensors
Model size: 8.69M params
Tensor type: F32
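For a sense of scale, the size of the raw weights can be estimated from the parameter count and tensor type. The sketch below assumes the rounded figure of 8.69M parameters, 4 bytes per F32 value, and ignores file headers and any runtime overhead. The `weight_bytes` helper is a hypothetical name used only for illustration.

```python
PARAMS = 8_690_000   # "8.69M params", rounded figure from the model card
BYTES_PER_F32 = 4    # a 32-bit float occupies 4 bytes


def weight_bytes(params: int, bytes_per_param: int = BYTES_PER_F32) -> int:
    """Approximate size of the raw weights in bytes (no metadata or overhead)."""
    return params * bytes_per_param


size = weight_bytes(PARAMS)
print(f"{size / 1e6:.2f} MB")  # 34.76 MB
```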
