GPT-2 ChatML GGUF (no_robots SFT)
This repository contains GGUF quantized models converted from the fine-tuned JustACluelessKid2/gpt2-chatml-fp32.
Models Available
gpt2-f32.gguf(252.5 MB) - Baseline F16-Embedding GGUFggml-model-Q8_0.gguf(136.7 MB) - High-fidelity 8-bit quantizationggml-model-IQ4_NL.gguf(84.8 MB) - Highly-optimized 4-bit non-linear quantizationggml-model-IQ4_XS.gguf(82.2 MB) - Imatrix optimized 4-bit quantizationggml-model-Q6_K.gguf(106.7 MB) - High-quality 6-bit quantizationggml-model-Q5_K_M.gguf(98.8 MB) - High-quality 5-bit quantizationggml-model-IQ3_XXS.gguf(64.8 MB) - Imatrix 3-bit quantization (Chromebook-compatible)ggml-model-IQ2_M.gguf(62.5 MB) - Imatrix optimized 2-bit quantizationggml-model-IQ2_XXS.gguf(55.5 MB) - Ultra-low 2-bit quantization
These models were calibrated using an importance matrix computed on 1,000 shuffled conversational sequences.
- Downloads last month
- 2,219
GGUF
Model size
0.1B params
Architecture
gpt2
Hardware compatibility
Log In to add your hardware
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for JustACluelessKid2/gpt2-chatml-fp32-GGUF
Base model
openai-community/gpt2 Finetuned
JustACluelessKid2/gpt2-chatml-fp32