A newer version of this model is available: aquiffoo/neo-3-1B-A90M-Instruct
neo
neo is our first pretrained model, with 64.1 million parameters. It was designed purely as an experiment and does not yet produce coherent text or show any reasoning ability.
Model Overview
- Name: neo
- Parameters: 64.1 million
- Architecture: Dense
- Type: General-purpose LLM
- Hosted on: Hugging Face
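For a sense of scale, the parameter count can be turned into a rough weight-storage estimate. The sketch below assumes 4 bytes per parameter for F32 (the tensor type listed on this page); the F16 and INT8 rows are illustrative assumptions only, not formats this card confirms. Since 64.1 million is a rounded figure, the results are approximate.

```python
# Rough weight-storage estimate. Ignores activations, optimizer state,
# and file-format overhead, so real files will differ somewhat.
PARAMS = 64_100_000  # 64.1 million parameters (rounded, from the overview)

# Bytes per parameter. F32 matches the listed tensor type;
# F16 and INT8 are hypothetical comparisons for illustration.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "INT8": 1}


def weight_bytes(n_params: int, dtype: str) -> int:
    """Return the approximate number of bytes needed to store the weights."""
    return n_params * BYTES_PER_PARAM[dtype]


for dtype in BYTES_PER_PARAM:
    size = weight_bytes(PARAMS, dtype)
    print(f"{dtype}: {size / 1e6:.1f} MB ({size / 2**20:.1f} MiB)")
```

At F32 this comes to roughly a quarter of a gigabyte, which is small enough to load on most CPUs.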
Training Steps
step 500 | loss = 0.9147
step 1000 | loss = 0.7440
step 1500 | loss = 0.6791
step 2000 | loss = 0.6631
step 2500 | loss = 0.6439
step 3000 | loss = 0.6335
step 3500 | loss = 0.6176
step 4000 | loss = 0.5987
step 4500 | loss = 0.5979
step 5000 | loss = 0.6018
step 5500 | loss = 0.5767
step 6000 | loss = 0.5839
step 6500 | loss = 0.5754
step 7000 | loss = 0.5644
step 7500 | loss = 0.5640
step 8000 | loss = 0.5686
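The log above can be summarized with a short script. This is a minimal sketch that assumes the `step N | loss = X` line format shown here; the names `parse_log`, `points`, and `increases` are hypothetical helpers, not part of any training code for this model.

```python
import re

# Training log copied from the section above.
LOG = """\
step 500 | loss = 0.9147
step 1000 | loss = 0.7440
step 1500 | loss = 0.6791
step 2000 | loss = 0.6631
step 2500 | loss = 0.6439
step 3000 | loss = 0.6335
step 3500 | loss = 0.6176
step 4000 | loss = 0.5987
step 4500 | loss = 0.5979
step 5000 | loss = 0.6018
step 5500 | loss = 0.5767
step 6000 | loss = 0.5839
step 6500 | loss = 0.5754
step 7000 | loss = 0.5644
step 7500 | loss = 0.5640
step 8000 | loss = 0.5686
"""

PATTERN = re.compile(r"step\s+(\d+)\s*\|\s*loss\s*=\s*([0-9.]+)")


def parse_log(text: str) -> list[tuple[int, float]]:
    """Extract (step, loss) pairs from lines like 'step 500 | loss = 0.9147'."""
    return [(int(step), float(loss)) for step, loss in PATTERN.findall(text)]


points = parse_log(LOG)

# Checkpoint with the lowest logged loss.
best_step, best_loss = min(points, key=lambda p: p[1])

# Steps where the loss rose relative to the previous logged step.
increases = [step for (_, prev), (step, cur) in zip(points, points[1:]) if cur > prev]

first_loss = points[0][1]
last_step, last_loss = points[-1]
relative_drop = (first_loss - last_loss) / first_loss

print(f"lowest loss {best_loss:.4f} at step {best_step}")
print(f"loss rose at steps: {increases}")
print(f"relative drop from step {points[0][0]} to {last_step}: {relative_drop:.1%}")
```

The loss falls quickly over the first 2000 steps and then flattens, with small upticks between some logged steps, so the final checkpoint is not necessarily the one with the lowest logged loss.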
Safetensors
- Model size: 64.1M params
- Tensor type: F32
Model tree for aquiffoo/neo-64M-C1
- Quantizations: 1 model