Koto Small 7B (Pretrained)
Koto-Small-7B-PT is a version of MiMo-7B-Base trained on almost a billion tokens of creative writing data.
Please check out Aurore-Reveil/Koto-Small-7B-IT. It's the official RP and instruct tune!
Usage
This model is intended only for raw text completion settings, such as cowriting. Instruct-style prompting will not work. Multi-turn roleplay will not work.
It was trained at a 32k-token context length, but since not all samples were that long, we expect that in the best case you can get ~16k tokens of effective context.
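If you want to stay inside that effective window, one option is to trim the prompt from the left before generating. This is only a rough sketch: `fit_to_context` is a hypothetical helper, the `16_384` budget is our reading of "~16k", and `max_new_tokens=512` is an arbitrary default.

```python
def fit_to_context(token_ids, max_new_tokens=512, effective_context=16_384):
    """Keep the most recent tokens so prompt + generation fits the budget.

    effective_context=16_384 is an assumed value for "~16k"; tune it to taste.
    """
    if max_new_tokens >= effective_context:
        raise ValueError("max_new_tokens must be smaller than the context budget")
    budget = effective_context - max_new_tokens
    # Drop the oldest tokens; the end of the story matters most for cowriting.
    return token_ids[-budget:]


# Example with dummy token ids standing in for a tokenized story.
story = list(range(20_000))
trimmed = fit_to_context(story)
print(len(trimmed))
```

Trimming from the left keeps the most recent text, which is usually what a completion model needs to continue a story.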
We found that 1.25 temperature and 0.05 min_p worked best, but YMMV!
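For anyone curious what those two settings do, here is a minimal pure-Python sketch of temperature scaling followed by min_p filtering. It is an illustration, not the code of any particular backend: the function names are hypothetical, and applying temperature before min_p is an assumption, since samplers differ on the order.

```python
import math
import random


def min_p_filter(logits, temperature=1.25, min_p=0.05):
    """Return {token_index: probability} after temperature and min_p.

    Assumes temperature is applied first, then min_p; some backends
    use a different order.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract peak for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # min_p keeps tokens at least min_p times as likely as the top token.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}


def sample_token(logits, temperature=1.25, min_p=0.05, rng=random):
    """Draw one token index from the filtered distribution."""
    kept = min_p_filter(logits, temperature, min_p)
    indices = list(kept)
    weights = [kept[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Toy logits: two plausible tokens and one very unlikely one.
toy_logits = [5.0, 4.0, -5.0]
print(sorted(min_p_filter(toy_logits)))
print(sample_token(toy_logits))
```

The higher temperature flattens the distribution for more varied prose, while min_p cuts off the long tail of very unlikely tokens. In Hugging Face transformers, the same settings correspond to the `temperature` and `min_p` arguments of `generate` with `do_sample=True`.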
Datasets
The data used to train this model includes:
- Most of The Anarchist Library, a repository for anarchist manifestos and writing (see allura-org/the-anarchist-library)
- A random sample of public domain books from Project Gutenberg
- Furry (anthro and feral) storytelling and smut
- A small subset of known high-quality books and story data
Acknowledgements
- thank you to [unk] for drawing the art used in the model card!
- thank you very much to mango/deltavector for providing the compute used to train this model
- thanks to curse for testing, ideas
- thanks to toasty for some data, ideas
- thanks to everyone else in allura for moral support
ilya <3
Call for Help
if you would like to help build on this model (instruct/RP SFT, further annealing on higher quality data, etc)...
please join our discord or our matrix! <3
Technical Appendix
Model tree for allura-org/Koto-Small-7B-PT
Base model
XiaomiMiMo/MiMo-7B-Base