cosmo2-tokenizer
Tokenizer used for training cosmo2. It was trained on 1M samples drawn from the following mixture:
- FineWeb-Edu 70%
- Cosmopedia v2 15%
- StarCoderData 8%
- OpenWebMath 5%
- StackOverflow 2%
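
To make the mixture concrete, the short sketch below converts the listed percentages into approximate per-source sample counts. It assumes the percentages are fractions of the 1M samples (rather than of tokens or bytes), which the description does not state explicitly; the function name `samples_per_source` is hypothetical and used only for illustration.

```python
# Mixture weights as listed above, in percent.
MIXTURE_PERCENT = {
    "FineWeb-Edu": 70,
    "Cosmopedia v2": 15,
    "StarCoderData": 8,
    "OpenWebMath": 5,
    "StackOverflow": 2,
}

# "1M samples" from the description.
# Assumption: the percentages apply to the sample count.
TOTAL_SAMPLES = 1_000_000


def samples_per_source(total, mixture_percent):
    """Split `total` samples across sources using integer percentage weights."""
    if sum(mixture_percent.values()) != 100:
        raise ValueError("mixture percentages must sum to 100")
    # Integer arithmetic avoids floating-point rounding in the counts.
    return {name: total * pct // 100 for name, pct in mixture_percent.items()}


counts = samples_per_source(TOTAL_SAMPLES, MIXTURE_PERCENT)
for name, n in counts.items():
    print(f"{name}: {n:,}")
```

Under that assumption, FineWeb-Edu contributes about 700,000 samples and StackOverflow about 20,000.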
