
URL: https://huggingface.co/BEE-spoke-data/smol_llama-101M-GQA-python



smol_llama-101M-GQA: python


400MB of buzz: pure Python programming nectar! 🍯

This model is the general pre-trained checkpoint BEE-spoke-data/smol_llama-101M-GQA, further trained on a deduplicated version of PyPI for one additional epoch. Play with the model in this demo space.

  • Its architecture is the same as the base, with some new Python-related tokens added to the vocabulary prior to training.
  • It can generate basic Python code and README-style markdown, but will struggle with harder planning/reasoning tasks.
  • This is an experiment to test the abilities of smol-sized models in code generation, covering both their capabilities and their limitations.

Use with care & understand that there may be some bugs 🐛 still to be worked out.

Usage

📌 Be sure to note:

  1. The model uses the "slow" llama2 tokenizer. Set use_fast=False when loading the tokenizer.
  2. Use transformers version 4.33.3, due to a known issue in version 4.34.1 (at time of writing).

Which llama2 tokenizer the API widget uses is an age-old mystery, and may cause minor whitespace issues (widget only).
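As a rough illustration of the version note above, the following sketch checks which transformers version is installed and classifies it against the two versions this card mentions. The function names and the "ok" / "known-bad" / "untested" labels are hypothetical and not part of any library; versions other than the two named here are simply reported as untested, since the card makes no claim about them.

```python
from importlib import metadata

RECOMMENDED = "4.33.3"  # version this card recommends
KNOWN_BAD = "4.34.1"    # version this card reports a known issue with


def check_transformers_version(installed: str) -> str:
    """Classify a transformers version string against the card's notes."""
    if installed == RECOMMENDED:
        return "ok"
    if installed == KNOWN_BAD:
        return "known-bad"
    # The card says nothing about other versions, so make no claim either way.
    return "untested"


def installed_transformers_version():
    """Return the installed transformers version, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    else:
        print(f"transformers {version}: {check_transformers_version(version)}")
```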

To install the necessary packages and load the model:

# Install necessary packages
# pip install transformers==4.33.3 accelerate sentencepiece

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    use_fast=False,
)
model = AutoModelForCausalLM.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    device_map="auto",
)

# The model can now be used as any other decoder

longer code-gen example

Below is a quick script that can be used as a reference/starting point for writing your own, better one :)
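The original script is not reproduced in this copy of the card. What follows is a minimal sketch of what such a script might look like, not the author's version. It assumes greedy decoding, a mild repetition penalty, and a hypothetical set of stop markers that cut the output where a new top-level block begins. The names `truncate_at_stop`, `generate_code`, and `STOP_SEQUENCES`, and all generation settings, are illustrative choices.

```python
MODEL_NAME = "BEE-spoke-data/smol_llama-101M-GQA-python"

# Hypothetical stop markers: trim the continuation where a new top-level
# definition starts, since a small model tends to keep generating.
STOP_SEQUENCES = ("\ndef ", "\nclass ", "\nif __name__")


def truncate_at_stop(text: str, stop_sequences=STOP_SEQUENCES) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def generate_code(prompt: str, max_new_tokens: int = 128) -> str:
    """Complete `prompt` with the model and trim the continuation."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The card asks for the "slow" tokenizer, hence use_fast=False.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,          # assumption: greedy decoding
        repetition_penalty=1.1,   # assumption: mild penalty against loops
    )

    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    continuation = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return prompt + truncate_at_stop(continuation)
```

Calling `generate_code("def add_numbers(a, b):\n    ")` downloads the model on first use and returns the prompt followed by the trimmed completion. Decoding the continuation separately from the prompt can shift leading whitespace slightly, so treat the joined result as approximate.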


Safetensors · Model size: 0.1B params · Tensor type: F32
