smol_llama-101M-GQA: python
400MB of buzz: pure Python programming nectar!
This model is the general pre-trained checkpoint BEE-spoke-data/smol_llama-101M-GQA, further trained on a deduped version of PyPI for +1 epoch. Play with the model in this demo space.
- Its architecture is the same as the base, with some new Python-related tokens added to vocab prior to training.
- It can generate basic Python code and README-style markdown, but will struggle with harder planning/reasoning tasks.
- This is an experiment to test the abilities of smol-sized models in code generation, covering both their capabilities and their limitations.
Use with care & understand that there may be some bugs still to be worked out.
Usage
Be sure to note:
- The model uses the "slow" llama2 tokenizer. Set use_fast=False when loading the tokenizer.
- Use transformers library version 4.33.3 due to a known issue in version 4.34.1 (at time of writing)
Which llama2 tokenizer the API widget uses is an age-old mystery, and may cause minor whitespace issues (widget only).
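Since the notes above depend on a specific transformers version, a quick check of the installed version can save some debugging. The following is only a minimal sketch: the helper name `check_transformers_version` and the message wording are hypothetical, and the comparison is an exact string match against the version recommended in this card.

```python
from importlib.metadata import PackageNotFoundError, version

# Version recommended in this model card (at time of writing).
RECOMMENDED = "4.33.3"


def check_transformers_version(recommended: str = RECOMMENDED) -> str:
    """Return a short message comparing the installed transformers version
    with the recommended one. Hypothetical helper, for illustration only."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return f"transformers is not installed; try: pip install transformers=={recommended}"
    if installed != recommended:
        return f"transformers {installed} found; this card recommends {recommended}"
    return f"transformers {installed} matches the recommended version"


print(check_transformers_version())
```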
To install the necessary packages and load the model:
# Install necessary packages
# pip install transformers==4.33.3 accelerate sentencepiece
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(
"BEE-spoke-data/smol_llama-101M-GQA-python",
use_fast=False,
)
model = AutoModelForCausalLM.from_pretrained(
"BEE-spoke-data/smol_llama-101M-GQA-python",
device_map="auto",
)
# The model can now be used as any other decoder
longer code-gen example
Below is a quick script that can be used as a reference/starting point for writing your own, better one :)
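As a hedged illustration, a minimal sketch of such a script might look like this. It is not the author's original script: the helper names `build_prompt` and `generate_code`, the example prompt, and all sampling parameters (`top_p`, `temperature`, `repetition_penalty`, `max_new_tokens`) are assumptions chosen for demonstration. Only the model name and `use_fast=False` come from this card.

```python
def build_prompt(signature: str, docstring: str) -> str:
    """Format a function signature and docstring as a code-completion prompt."""
    return f'{signature}\n    """{docstring}"""\n'


def generate_code(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    """Sample a completion for `prompt` and return the decoded text.

    The sampling settings below are illustrative assumptions, not tuned values.
    """
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.95,
        temperature=0.7,
        repetition_penalty=1.1,
    )
    # output_ids[0] holds the prompt tokens followed by the generated tokens
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def main() -> None:
    # Imported here so the helpers above can be reused without loading transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "BEE-spoke-data/smol_llama-101M-GQA-python"
    tokenizer = AutoTokenizer.from_pretrained(name, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

    # Hypothetical example prompt
    prompt = build_prompt(
        "def fibonacci(n: int) -> int:",
        "Return the n-th Fibonacci number.",
    )
    print(generate_code(model, tokenizer, prompt))
```

Call `main()` to download the model and print one sampled completion. Because sampling is enabled, the output differs between runs; setting `do_sample=False` gives greedy, repeatable output instead.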