Omega 2.6B
This model is derived from phi 1.3B using layer stacking techniques to double the number of hidden layers in the model. The model was then trained for 1 epoch on data from tiny-textbooks and tiny-lessons.
Training
- Downloads last month
- 10
![]() |
VOOZH | about |
This model is derived from phi 1.3B using layer stacking techniques to double the number of hidden layers in the model. The model was then trained for 1 epoch on data from tiny-textbooks and tiny-lessons.