BEE-spoke-data/beecoder-220M-python
This is BEE-spoke-data/smol_llama-220M-GQA fine-tuned for code generation on:
- filtered version of stack-smol-XL
- deduped version of 'algebraic stack' from proof-pile-2
- cleaned and deduped pypi (last dataset)
This model and the base model were both trained with a context length of 2048.
examples
Example script for inference testing: here
At 220M parameters it has clear limitations, but it seems decent for single-line completion or docstring generation, and as a draft model for speculative decoding on those tasks.
The screenshot shows inference running on a laptop CPU.
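As a rough illustration of the two uses mentioned above, here is a minimal sketch built on the `transformers` library. It is not the example script linked above. The helper names (`build_prompt`, `complete`, `complete_assisted`), the prompt layout, and the token budgets are assumptions made for illustration. For the assisted variant, the larger target model is left as a required argument because this card does not name one, and the sketch assumes that target shares this model's tokenizer.

```python
MODEL_ID = "BEE-spoke-data/beecoder-220M-python"
MAX_CONTEXT = 2048  # context length stated in this card


def build_prompt(signature: str, docstring: str) -> str:
    """Return a Python function stub for the model to continue (hypothetical layout)."""
    return f'{signature}\n    """{docstring}"""\n'


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy completion with the 220M model alone; runs on CPU by default."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    # Truncate the prompt so prompt + new tokens stay within the context length.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_CONTEXT - max_new_tokens,
    )
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def complete_assisted(prompt: str, target_model_id: str, max_new_tokens: int = 64) -> str:
    """Use the 220M model as the draft model for a larger target model.

    Assumption: target_model_id names a causal LM that uses the same tokenizer.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(target_model_id)
    target = AutoModelForCausalLM.from_pretrained(target_model_id)
    draft = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    # assistant_model switches generate() to assisted (speculative) decoding.
    output_ids = target.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        assistant_model=draft,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `complete(build_prompt("def add(a, b):", "Return the sum of a and b."))` downloads the weights on first use and returns the prompt followed by the model's continuation. Greedy decoding is used here only to keep the output repeatable; sampling settings are a matter of preference.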
Safetensors: model size 0.2B params, tensor type BF16
