smol_llama-101M-GQA: python

400MB of buzz: pure Python programming nectar! 🍯

This model is the general pre-trained checkpoint BEE-spoke-data/smol_llama-101M-GQA, further trained on a deduplicated version of pypi for +1 epoch. Play with the model in this demo space.

Its architecture is the same as the base, with some new Python-related tokens added to the vocabulary prior to training.
It can generate basic Python code and README-style markdown, but will struggle with harder planning/reasoning tasks.
This is an experiment to test smol-sized models on code generation, covering both their capabilities and their limitations.
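
The card does not say how the new tokens were added. As a rough sketch only, one common way to extend a vocabulary with transformers is `add_tokens` followed by `resize_token_embeddings`. The helper name `extend_vocab` and the example tokens are hypothetical and are not the author's actual token list.

```python
def extend_vocab(tokenizer, model, new_tokens):
    """Add tokens to the tokenizer and grow the model's embedding matrix to match.

    Returns the number of tokens that were actually added.
    """
    num_added = tokenizer.add_tokens(list(new_tokens))
    if num_added > 0:
        # New embedding rows are freshly initialized; they only become useful after training.
        model.resize_token_embeddings(len(tokenizer))
    return num_added


if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = "BEE-spoke-data/smol_llama-101M-GQA"  # base checkpoint named in this card
    tokenizer = AutoTokenizer.from_pretrained(base, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(base)
    # Hypothetical example tokens, chosen only for illustration.
    print(extend_vocab(tokenizer, model, ["__init__", "elif"]))
```

Tokens already present in the vocabulary are not added again, so the returned count can be smaller than the length of the list.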

Use with care & understand that there may be some bugs 🐛 still to be worked out.

Usage

📌 Be sure to note:

The model uses the "slow" llama2 tokenizer. Set use_fast=False when loading the tokenizer.
Use transformers library version 4.33.3; version 4.34.1 has a known issue (at time of writing).

Which llama2 tokenizer the API widget uses is an age-old mystery, and may cause minor whitespace issues (widget only).
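
To confirm that the local environment matches the pinned version noted above, a small check like the following may help. It is a minimal sketch using only the standard library; the helper names `version_matches` and `check_transformers` are made up for this example.

```python
from importlib import metadata

PINNED_VERSION = "4.33.3"  # version recommended by this model card


def version_matches(installed: str, pinned: str = PINNED_VERSION) -> bool:
    """Return True when the installed version string equals the pinned one."""
    return installed.strip() == pinned


def check_transformers() -> str:
    """Return a short human-readable status for the local transformers install."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if version_matches(installed):
        return f"transformers {installed} matches the pinned version"
    return f"transformers {installed} found; this card recommends {PINNED_VERSION}"


if __name__ == "__main__":
    print(check_transformers())
```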

To install the necessary packages and load the model:

# Install necessary packages
# pip install transformers==4.33.3 accelerate sentencepiece

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    use_fast=False,
)
model = AutoModelForCausalLM.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    device_map="auto",
)

# The model can now be used as any other decoder

longer code-gen example

Below is a quick script that can be used as a reference/starting point for writing your own, better one :)
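
The original script is not reproduced in this copy of the card. As a stand-in, here is a minimal sketch of a code-completion helper that follows the loading steps from Usage. The sampling values in `GENERATION_KWARGS` and the function names `load` and `complete` are assumptions for illustration, not the author's settings.

```python
MODEL_NAME = "BEE-spoke-data/smol_llama-101M-GQA-python"

# Illustrative sampling settings; these values are assumptions, not tuned defaults.
GENERATION_KWARGS = {
    "max_new_tokens": 128,
    "do_sample": True,
    "top_p": 0.95,
    "temperature": 0.7,
    "repetition_penalty": 1.1,
}


def load(model_name: str = MODEL_NAME):
    """Load the slow tokenizer and the model, as described in Usage."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    return tokenizer, model


def complete(prompt: str, tokenizer, model, **overrides) -> str:
    """Generate a continuation and return the decoded prompt plus completion."""
    kwargs = {**GENERATION_KWARGS, **overrides}
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **kwargs,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    tok, mdl = load()
    print(complete("def fibonacci(n: int) -> int:\n", tok, mdl))
```

Keyword overrides such as `complete(prompt, tok, mdl, max_new_tokens=64)` replace the defaults for a single call, which keeps experiments with sampling settings in one place.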

Model size: 0.1B params (Safetensors), tensor type F32

URL: https://huggingface.co/BEE-spoke-data/smol_llama-101M-GQA-python
