URL: https://huggingface.co/Aurore-Reveil/Koto-Small-7B-IT

Koto Small 7B (Instruct-Tuned)

Koto-Small-7B-IT is an instruct-tuned version of Koto-Small-7B-PT, which was trained from MiMo-7B-Base on almost a billion tokens of creative-writing data. This model is meant for roleplaying and instruct use cases.

Usage

Chat template

The model was trained with ChatML formatting. A typical input looks like this:

<|im_start|>system
system prompt<|im_end|>
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
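
For frontends that do not apply the chat template automatically, the layout above can be assembled by hand. The following is a minimal sketch, not the model's official template: the helper name `format_chatml` and the message-dict shape are hypothetical, and it assumes a newline follows each role name and each `<|im_end|>` marker, as in the example above.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role": ..., "content": ...} dicts as a ChatML string.

    Hypothetical helper for illustration; the layout mirrors the example
    prompt shown in this card.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>ROLE\nCONTENT<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

prompt = format_chatml(messages)
print(prompt)
```

If the tokenizer shipped with the model already carries a chat template, using that is likely the safer choice; the sketch is only meant to make the format explicit.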

Samplers

We found that a temperature of 1.25 and a min_p of 0.05 worked best, but YMMV!
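
To illustrate what these two settings do, here is a rough pure-Python sketch of temperature scaling followed by min_p filtering. It is not taken from any particular inference backend: the function names are hypothetical, and it assumes temperature is applied before the min_p cutoff, which is an ordering that differs between backends.

```python
import math
import random


def min_p_candidates(logits, temperature=1.25, min_p=0.05):
    """Return (token_index, probability) pairs that survive min_p filtering.

    Assumption: temperature is applied first, then tokens whose probability
    is below min_p * (probability of the most likely token) are dropped.
    """
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax over the temperature-scaled logits.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    threshold = min_p * max(probs)
    return [(i, p) for i, p in enumerate(probs) if p >= threshold]


def sample_min_p(logits, temperature=1.25, min_p=0.05, rng=random):
    """Draw one token index from the filtered distribution."""
    kept = min_p_candidates(logits, temperature, min_p)
    indices = [i for i, _ in kept]
    weights = [p for _, p in kept]
    # random.choices renormalizes the weights internally.
    return rng.choices(indices, weights=weights, k=1)[0]


# Toy logits: two plausible tokens and one very unlikely token.
toy_logits = [5.0, 4.0, -10.0]
print(min_p_candidates(toy_logits))
print(sample_min_p(toy_logits))
```

A higher temperature flattens the distribution, while min_p trims the unlikely tail relative to the top token, which is why the two are often paired for creative writing.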

Datasets

datasets:
 - path: Delta-Vector/Hydrus-General-Reasoning
 - path: Delta-Vector/Hydrus-IF-Mix-Ai2
 - path: Delta-Vector/Hydrus-Army-Inst
 - path: Delta-Vector/Hydrus-AM-thinking-Science
 - path: Delta-Vector/Hydrus-AM-Thinking-Code-Filtered
 - path: Delta-Vector/Hydrus-AM-Thinking-IF-No-Think
 - path: Delta-Vector/Hydrus-Tulu-SFT-Mix-V2
 - path: Delta-Vector/Hydrus-System-Chat-2.0
 - path: Delta-Vector/Orion-Praxis-Co-Writer
 - path: Delta-Vector/Orion-Co-Writer-51K
 - path: Delta-Vector/Orion-Creative_Writing-Complexity
 - path: Delta-Vector/Orion-vanilla-backrooms-claude-sharegpt
 - path: Delta-Vector/Hydrus-AM-Thinking-Multi-Turn
 - path: PocketDoc/Dans-Failuremaxx-Adventure
 - path: PocketDoc/Dans-Logicmaxx-SAT-AP
 - path: PocketDoc/Dans-MemoryCore-CoreCurriculum-Small
 - path: PocketDoc/Dans-Taskmaxx-DataPrepper
 - path: PocketDoc/Dans-Prosemaxx-Instructwriter-Long
 - path: PocketDoc/Dans-Prosemaxx-InstructWriter-ZeroShot-2
 - path: PocketDoc/Dans-Prosemaxx-InstructWriter-ZeroShot-3
 - path: PocketDoc/Dans-Prosemaxx-InstructWriter-Continue-2
 - path: PocketDoc/Dans-Systemmaxx

Acknowledgements

  • Thank you very much to Delta-Vector/Mango for providing the compute used to train this model.
  • Fizz for the pretrain.
  • Pocketdoc/Anthracite for da cool datasets.
  • Hensen chat.
  • Thank you to the illustrator of WataNare for drawing the art used in the model card!
  • Thanks to Curse for testing and ideas.
  • Thanks to Toasty for some data and ideas.
  • Thanks to everyone else in allura!

ilya <3

Call for Help

If you would like to help build on this model (RP SFT, further annealing on higher quality data, etc)...

Please join the allura discord or the matrix! <3

Technical Appendix
