A newer version of this model is available: Glint-Research/Glint-1.3
Glint-0.1
Once upon a time, there was a model that could only say
couldcouldoldbloodbloodbodybody. This is its ancestor.
Glint-0.1 is where the Glint line started. 1M parameters. Big dreams. Almost no ability to realize them. We look back on this one fondly, like a blurry photo of a puppy that chewed your shoes.
What you get
| File | What it is |
|---|---|
| `tokenizer.json` | Hybrid word/char tokenizer (~2,111 tokens) |
| `pretrain.pt` | Base pretrained checkpoint |
| `model.pt` | Instruction-tuned checkpoint (SFT) |
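
If "hybrid word/char tokenizer" sounds mysterious, here is a rough sketch of the idea: known words become one token, everything else falls back to characters. It is illustrative only. The toy vocabulary, the `encode`/`decode` names, and the fallback rule are all assumptions, and the snippet does not read or reproduce the actual `tokenizer.json` format.

```python
import re

# Toy vocabulary (hypothetical): a few whole words plus single characters.
WORDS = ["the", "model", "blood", "body"]
CHARS = list("abcdefghijklmnopqrstuvwxyz .,!?")
VOCAB = {tok: i for i, tok in enumerate(WORDS + CHARS)}
INVERSE = {i: tok for tok, i in VOCAB.items()}


def encode(text: str) -> list[int]:
    """Emit one id per known word, otherwise one id per character."""
    ids = []
    # Split into runs of letters and single non-letter characters.
    for piece in re.findall(r"[a-z]+|[^a-z]", text.lower()):
        if piece in VOCAB:
            ids.append(VOCAB[piece])  # word-level token (or a known single character)
        else:
            # Character-level fallback; characters outside the toy vocab are dropped.
            ids.extend(VOCAB[c] for c in piece if c in VOCAB)
    return ids


def decode(ids: list[int]) -> str:
    """Concatenate token strings back into text."""
    return "".join(INVERSE[i] for i in ids)


known = encode("the body")   # two word tokens plus one space character
unknown = encode("cat")      # not in WORDS, so three character tokens
```

In this toy version `"the body"` costs three tokens while `"cat"` also costs three, which is the trade-off in miniature: frequent words are cheap, rare ones are spelled out.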
Specs
| Thing | Value |
|---|---|
| Architecture | Transformer Decoder |
| Parameters | ~1 Million |
| Context | 2,048 tokens |
| d_model | 160 |
| Layers | 6 |
| Heads | 4 |
| FFN | 256 |
| Vocab | ~2,111 tokens (Hybrid Char + Word) |
| Norm | RMSNorm + QK-Norm |
| Position | RoPE |
| Activation | SwiGLU |
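
To make the Norm row a bit more concrete, the sketch below shows what QK-Norm does to a single attention score, using the head size implied by the table (160 / 4 = 40). It is plain Python for readability, not the training code. The function names, the unit gain, and the choice to ignore RoPE here are assumptions; the card does not say whether the real model uses a learned gain or where the norm sits relative to RoPE.

```python
import math
import random

D_MODEL, N_HEADS = 160, 4          # from the specs table
HEAD_DIM = D_MODEL // N_HEADS      # 40 dimensions per head


def rms_norm(x, gain=None, eps=1e-6):
    """Scale x so its root-mean-square is ~1; gain is an optional per-dimension scale."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    gain = gain or [1.0] * len(x)
    return [g * v / rms for g, v in zip(gain, x)]


def attention_logit(q, k, qk_norm=True):
    """Scaled dot-product score for one query/key pair of a single head."""
    if qk_norm:
        q, k = rms_norm(q), rms_norm(k)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


random.seed(0)
# Deliberately large activations, as might show up when training drifts.
q = [random.gauss(0, 50) for _ in range(HEAD_DIM)]
k = [random.gauss(0, 50) for _ in range(HEAD_DIM)]

raw = attention_logit(q, k, qk_norm=False)
normed = attention_logit(q, k, qk_norm=True)
# With unit gain, |normed| cannot exceed sqrt(HEAD_DIM), whatever the input scale.
```

The point of the bound is that the softmax input stays in a sane range even if the query and key projections grow, which is one common explanation for why QK-Norm helps stability.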
What made this one special
- Hybrid tokenizer -- word-level where it helps, character-level where it gets confused
- QK-Norm -- RMSNorm on queries and keys so training doesn't blow up
- Loss boosting -- yelled at the model extra hard when it ignored multi-character words
- Response-start weighting -- made it actually pay attention to the first tokens of its answers
- Pretrain replay -- kept mixing in pretrain data during SFT so it wouldn't forget how to speak English
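
Loss boosting and response-start weighting both come down to per-token loss weights, so here is a minimal sketch of how the two could be combined. Everything numeric is made up: `word_boost`, `start_boost`, and `start_len` are hypothetical knobs, and masking out prompt tokens is an assumption about the SFT setup, not something the card states.

```python
def token_weights(tokens, response_start, word_boost=2.0, start_boost=3.0, start_len=2):
    """Per-token loss weights for one SFT example.

    tokens          -- target tokens as strings (prompt followed by response)
    response_start  -- index of the first response token
    All boost values are hypothetical, not the ones used for Glint-0.1.
    """
    weights = []
    for i, tok in enumerate(tokens):
        if i < response_start:
            weights.append(0.0)      # assumption: prompt tokens are not trained on
            continue
        w = 1.0
        if len(tok) > 1:
            w *= word_boost          # loss boosting for multi-character (word) tokens
        if i < response_start + start_len:
            w *= start_boost         # response-start weighting
        weights.append(w)
    return weights


def weighted_loss(per_token_nll, weights):
    """Weighted mean of per-token negative log-likelihoods."""
    total = sum(w * l for w, l in zip(weights, per_token_nll))
    return total / sum(weights)


# Prompt "hi?" followed by response "the cat", tokenized hybrid-style.
tokens = ["hi", "?", "the", " ", "c", "a", "t"]
weights = token_weights(tokens, response_start=2)
nll = [0.5, 0.5, 2.0, 1.0, 1.0, 1.0, 1.0]   # made-up per-token losses
loss = weighted_loss(nll, weights)
```

In this toy example the word token `"the"` sits at the start of the response, so it picks up both boosts and dominates the average. A real implementation would also need to guard against examples with no response tokens, where the weights sum to zero.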
Training curve
It went down. Slowly. Painfully.
Limitations
- Repeats itself. A lot.
- Knows almost nothing about the world.
- Not useful for anything real. Research only.
- Will embarrass itself if asked a direct question.
Built by CompactAI. We started somewhere.
