Koto Small 7B (Pretrained)

Koto-Small-7B-PT is a version of MiMo-7B-Base trained on almost a billion tokens of creative writing data.

Please check out Aurore-Reveil/Koto-Small-7B-IT, the official RP and instruct tune!

Usage

This model is intended only for raw text completion settings, such as cowriting. Instruction-style prompts will not work. Multi-turn roleplay will not work.

It was trained at a 32k-token context length, but since not all samples were that long, we expect that in the best case you can get ~16k tokens of effective context.
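
If you are cowriting a long piece, one way to stay inside that estimate is to keep only the most recent tokens of the story before each completion. The helper below is a minimal sketch, not part of the model's tooling: the name `fit_to_context` is hypothetical, and reading "~16k" as 16,384 tokens is our assumption.

```python
def fit_to_context(token_ids, max_new_tokens, effective_context=16_384):
    """Keep the most recent tokens so prompt + generation fits the budget.

    effective_context defaults to 16,384, an assumed reading of the
    "~16k" estimate; adjust it to whatever works for you in practice.
    """
    if max_new_tokens >= effective_context:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    # Tokens left for the prompt once the generation length is reserved.
    budget = effective_context - max_new_tokens
    # Slicing from the end keeps the latest text, which matters most
    # when continuing a story; shorter inputs are returned unchanged.
    return list(token_ids[-budget:])
```

Dropping from the front loses early story details, so you may want to keep a short summary or the opening paragraph pinned at the start and trim the middle instead.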

We found that 1.25 temperature and 0.05 min_p worked best, but YMMV!
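
For reference, here is a rough sketch of raw text completion with those sampler settings using Hugging Face `transformers`. Only the temperature and min_p values come from this card. The `complete` helper, `max_new_tokens=256`, and the device handling are illustrative assumptions. It also assumes a `transformers` version recent enough to accept `min_p` in `generate`, and the checkpoint may need `trust_remote_code=True` depending on your version.

```python
MODEL_ID = "allura-org/Koto-Small-7B-PT"

# Sampler settings quoted from the card; everything else is an assumption.
SAMPLING = {"do_sample": True, "temperature": 1.25, "min_p": 0.05}


def complete(prompt, max_new_tokens=256, trust_remote_code=False):
    """Continue `prompt` as raw text (no chat template, no instructions)."""
    # Imported lazily so SAMPLING can be reused without loading torch.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_ID, trust_remote_code=trust_remote_code
    )
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # the weights are published in BF16
        trust_remote_code=trust_remote_code,
    )
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **SAMPLING
    )
    # Drop the prompt tokens so only the continuation is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Pass the opening of your story as `prompt`, for example `complete("The lighthouse keeper had not spoken to anyone in")`, and the model continues from there. Do not wrap the text in a chat or instruction format.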

Datasets

The data used to train this model includes:

Most of The Anarchist Library, a repository for anarchist manifestos and writing (see allura-org/the-anarchist-library)
A random sample of public domain books from Project Gutenberg
Furry (anthro and feral) storytelling and smut
A small subset of known high-quality books and story data

Acknowledgements

thank you to [unk] for drawing the art used in the model card!
thank you very much to mango/deltavector for providing the compute used to train this model
thanks to curse for testing, ideas
thanks to toasty for some data, ideas
thanks to everyone else in allura for moral support

ilya <3

Call for Help

if you would like to help build on this model (instruct/RP SFT, further annealing on higher quality data, etc)...
please join our discord or our matrix! <3

Technical Appendix

Safetensors
Model size: 8B params
Tensor type: BF16

Model tree for allura-org/Koto-Small-7B-PT
Base model: XiaomiMiMo/MiMo-7B-Base
Finetuned (1): this model
Finetunes: 2 models
Quantizations: 3 models

URL: https://huggingface.co/allura-org/Koto-Small-7B-PT