This repository contains quantized versions of BramVanroy/fietje-2-chat.

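Available quantization types and their expected quality loss relative to the f16 base (higher perplexity is worse). The file sizes and perplexity deltas are the reference figures from llama.cpp, measured on LLaMA-v1-7B rather than on this model: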

Q3_K_M : 3.07G, +0.2496 ppl @ LLaMA-v1-7B
Q4_K_M : 3.80G, +0.0532 ppl @ LLaMA-v1-7B
Q5_K_M : 4.45G, +0.0122 ppl @ LLaMA-v1-7B
Q6_K : 5.15G, +0.0008 ppl @ LLaMA-v1-7B
Q8_0 : 6.70G, +0.0004 ppl @ LLaMA-v1-7B
F16 : 13.00G @ 7B
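
The trade-off in the list above can be made concrete with a small sketch that picks the quantization type with the lowest perplexity increase that still fits a given size budget. The helper name `pick_quant` and the budget values are hypothetical, and the numbers are the LLaMA-v1-7B reference figures copied from the list, so treat the result as a rough guide rather than a measurement of this model.

```python
from typing import Optional

# (type, size in GB, perplexity increase over f16), copied from the list above.
# These are llama.cpp's LLaMA-v1-7B reference figures, not measurements of
# this model. F16 is the baseline, so its increase is 0.0 by definition.
QUANTS = [
    ("Q3_K_M", 3.07, 0.2496),
    ("Q4_K_M", 3.80, 0.0532),
    ("Q5_K_M", 4.45, 0.0122),
    ("Q6_K", 5.15, 0.0008),
    ("Q8_0", 6.70, 0.0004),
    ("F16", 13.00, 0.0),
]


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the type with the smallest perplexity increase that fits the budget.

    Returns None when no listed type is small enough.
    """
    fitting = [q for q in QUANTS if q[1] <= budget_gb]
    if not fitting:
        return None
    return min(fitting, key=lambda q: q[2])[0]


if __name__ == "__main__":
    for budget in (3.0, 4.0, 5.0, 8.0):
        print(budget, pick_quant(budget))
```

Because the perplexity increase shrinks as the size grows, this always selects the largest type that fits; the explicit `min` just keeps that rule visible if the table is edited.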

Also available on Ollama.

Quants were made with release b2777 of llama.cpp.

Format: GGUF
Model size: 3B params
Architecture: phi2
Available precisions: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit

Collection including BramVanroy/fietje-2-chat-gguf: An open and efficient LLM for Dutch based on phi-2

URL: https://huggingface.co/BramVanroy/fietje-2-chat-gguf
