URL: https://willitrunai.com/browse/best-for/6gb
Best AI Models for 6GB VRAM — Local LLMs | WillItRunAI
Each entry lists, in order: rank, model, parameter count, quantization, VRAM, speed (tok/s), fit status, grade, rating.
1
Qwen 3.5 4B
4B
Q4_K_M
6.1 GB VRAM
56.0 tok/s
Needs offload
S
Excellent
2
Phi-4 Mini Reasoning 4B
3.8B
Q4_K_M
5.3 GB VRAM
53.2 tok/s
Tight fit
S
Excellent
3
Jina Embeddings v3
0.572B
F16
4.6 GB VRAM
8.0 tok/s
Runs great
S
Excellent
4
BGE M3
0.568B
F16
3.8 GB VRAM
8.0 tok/s
Runs great
A
Great
5
mxbai Embed Large
0.335B
F16
3.7 GB VRAM
4.7 tok/s
Runs great
A
Great
6
Snowflake Arctic Embed L
0.335B
F16
3.7 GB VRAM
4.7 tok/s
Runs great
A
Great
7
Nomic Embed Text v1.5
0.137B
F16
2.3 GB VRAM
2.0 tok/s
Runs great
A
Great
8
Qwen 3 4B
4B
Q4_K_M
6.1 GB VRAM
56.0 tok/s
Needs offload
A
Great
9
BGE Large EN v1.5
0.335B
F16
3.7 GB VRAM
4.7 tok/s
Runs great
A
Great
10
All MiniLM L6 v2
0.023B
F16
1.8 GB VRAM
2.0 tok/s
Runs great
B
Good
11
Qwen 2.5 Coder 3B
3B
Q4_K_M
5.5 GB VRAM
42.0 tok/s
Tight fit
A
Great
12
Codestral Mamba 7B
7B
Q4_K_M
6.3 GB VRAM
66.5 tok/s
Needs offload
A
Great
13
Gemma 4 E2B
5.1B
Q4_K_M
5.1 GB VRAM
71.4 tok/s
Tight fit
A
Great
14
Ministral 3 3B
3B
Q4_K_M
4.1 GB VRAM
42.0 tok/s
Runs great
A
Great
15
Qwen 3.5 2B
2B
Q4_K_M
4.4 GB VRAM
28.0 tok/s
Runs great
A
Great
16
Gemma 3 4B
4B
Q4_K_M
6.0 GB VRAM
56.0 tok/s
Needs offload
A
Great
17
Phi 4 Mini 4B
4B
Q4_K_M
5.4 GB VRAM
56.0 tok/s
Tight fit
A
Great
18
Qwen 3 1.7B
1.7B
Q4_K_M
4.2 GB VRAM
23.8 tok/s
Runs great
A
Great
19
Qwen 2.5 Coder 1.5B
1.5B
Q4_K_M
2.8 GB VRAM
21.0 tok/s
Runs great
B
Good
20
Qwen 2.5 3B
3B
Q4_K_M
5.5 GB VRAM
42.0 tok/s
Tight fit
A
Great
21
Granite 4.1 3B
3B
Q4_K_M
4.6 GB VRAM
42.0 tok/s
Runs great
A
Great
22
Falcon 7B Instruct
7B
Q4_K_M
5.9 GB VRAM
92.6 tok/s
Needs offload
A
Great
23
Granite Code 3B
3B
Q4_K_M
5.8 GB VRAM
42.0 tok/s
Needs offload
B
Good
24
Llama 3.2 3B
3B
Q4_K_M
5.0 GB VRAM
42.0 tok/s
Tight fit
B
Good
25
TinyLlama 1.1B
1.1B
Q4_K_M
2.5 GB VRAM
15.4 tok/s
Runs great
B
Good
26
DeepSeek R1 1.5B
1.5B
Q4_K_M
2.8 GB VRAM
21.0 tok/s
Runs great
B
Good
27
Qwen 2.5 Coder 0.5B
0.5B
Q4_K_M
2.0 GB VRAM
7.0 tok/s
Runs great
C
Usable
28
Qwen 2.5 1.5B
1.5B
Q4_K_M
2.8 GB VRAM
21.0 tok/s
Runs great
B
Good
29
Gemma 3 1B
1B
Q4_K_M
2.5 GB VRAM
14.0 tok/s
Runs great
C
Usable
30
SmolLM3 3B
3B
Q4_K_M
5.3 GB VRAM
42.0 tok/s
Tight fit
B
Good
31
Qwen 3 0.6B
0.6B
Q4_K_M
2.7 GB VRAM
8.4 tok/s
Runs great
C
Usable
32
Gemma 2 2B
2B
Q4_K_M
4.3 GB VRAM
28.0 tok/s
Runs great
B
Good
33
Llama 3.2 1B
1B
Q4_K_M
2.6 GB VRAM
14.0 tok/s
Runs great
C
Usable
34
Qwen 3.5 0.6B
0.6B
Q4_K_M
2.7 GB VRAM
8.4 tok/s
Runs great
C
Usable
35
Qwen 2.5 0.5B
0.5B
Q4_K_M
2.0 GB VRAM
7.0 tok/s
Runs great
C
Usable
36
Nemotron Mini 4B
4B
Q4_K_M
5.9 GB VRAM
56.0 tok/s
Needs offload
C
Usable
37
gemma 2b
2B
Q4_K_M
3.0 GB VRAM
28.0 tok/s
Runs great
C
Usable
38
gemma 2 2b it
2B
Q6_K
3.4 GB VRAM
28.0 tok/s
Runs great
C
Usable
39
Llama 3.2 3B Instruct
3B
Q5_K_M
4.0 GB VRAM
42.0 tok/s
Runs great
C
Usable
40
Qwen3.5 4B
4B
Q4_K_M
4.4 GB VRAM
56.0 tok/s
Runs great
B
Good
41
Llama 3.2 1B Instruct Q8 0
1B
Q6_K
2.4 GB VRAM
14.0 tok/s
Runs great
C
Usable
42
Qwen2.5 3B Instruct
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
43
Qwen2.5 1.5B Instruct
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
C
Usable
44
SmolVLM 500M Instruct
0.5B
Q6_K
2.0 GB VRAM
7.0 tok/s
Runs great
C
Usable
45
TinyLlama 1.1B Chat v1.0
1.1B
Q4_K_M
2.3 GB VRAM
15.4 tok/s
Runs great
C
Usable
46
Gemmasutra Mini 2B v1
2B
Q4_K_M
3.0 GB VRAM
28.0 tok/s
Runs great
C
Usable
47
gemma 3 4b it
4B
Q4_K_M
4.4 GB VRAM
56.0 tok/s
Runs great
B
Good
48
embeddinggemma 300M
0.3B
Q6_K
1.8 GB VRAM
4.2 tok/s
Runs great
D
Poor
49
DeepSeek R1 Distill Qwen 1.5B
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
C
Usable
50
gemma 3 4b it
4B
Q4_K_M
4.4 GB VRAM
56.0 tok/s
Runs great
B
Good
51
Yi Coder 1.5B Chat
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
C
Usable
52
Llama 3.2 1B Instruct
1B
Q4_K_M
2.2 GB VRAM
14.0 tok/s
Runs great
C
Usable
53
gemma 2 2b it
2B
Q4_K_M
3.0 GB VRAM
28.0 tok/s
Runs great
C
Usable
54
Llama 3.2 3B Instruct
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
55
gemma 3 1b it
1B
Q4_K_M
2.2 GB VRAM
14.0 tok/s
Runs great
C
Usable
56
Yi 1.5 6B Chat
6B
Q4_K_M
5.9 GB VRAM
84.0 tok/s
Needs offload
C
Usable
57
Ministral 3 3B Instruct 2512
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
58
stablelm 2 zephyr 1 6b
6B
Q4_K_M
5.9 GB VRAM
84.0 tok/s
Needs offload
C
Usable
59
TinyLlama 1.1B Chat v0.3
1.1B
Q4_K_M
2.3 GB VRAM
15.4 tok/s
Runs great
C
Usable
60
TinyLlama 1.1B Chat v0.6
1.1B
Q4_K_M
2.3 GB VRAM
15.4 tok/s
Runs great
C
Usable
61
HELVETE 3B
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
62
granite embedding 107m multilingual
0.107B
Q4_K_M
1.7 GB VRAM
2.0 tok/s
Runs great
D
Poor
63
Hermes 3 Llama 3.2 3B
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
64
stablelm zephyr 3b
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
65
Yi 1.5 6B Chat
6B
Q4_K_M
5.9 GB VRAM
84.0 tok/s
Needs offload
C
Usable
66
EXAONE 4.0 1.2B
1.2B
Q4_K_M
2.4 GB VRAM
16.8 tok/s
Runs great
C
Usable
67
StarCoder2 3B
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
68
stablelm 2 1 6b chat imatrix
6B
Q4_K_M
5.9 GB VRAM
84.0 tok/s
Needs offload
C
Usable
69
EXAONE 3.5 2.4B Instruct
2.4B
Q4_K_M
3.2 GB VRAM
33.6 tok/s
Runs great
C
Usable
70
Yi 1.5 6B
6B
Q4_K_M
6.1 GB VRAM
76.5 tok/s
Needs offload
C
Usable
71
Yi Coder 1.5B
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
C
Usable
72
Falcon H1 Tiny 90M Instruct
0.09B
Q4_K_M
1.7 GB VRAM
2.0 tok/s
Runs great
D
Poor
73
AI21 Jamba Reasoning 3B
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
74
Falcon3 1B Instruct abliterated
1B
Q4_K_M
2.2 GB VRAM
14.0 tok/s
Runs great
C
Usable
75
stablelm 3b 4e1t
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
76
ai21labs AI21 Jamba Reasoning 3B
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
77
stablelm 2 zephyr 1.6b
1.6B
Q4_K_M
2.7 GB VRAM
22.4 tok/s
Runs great
C
Usable
78
Falcon H1 1.5B Instruct
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
C
Usable
79
logos16v2 stablelm2 1.6b i1
1.6B
Q4_K_M
2.7 GB VRAM
22.4 tok/s
Runs great
C
Usable
80
ai21labs AI21 Jamba2 3B
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
81
TinyLlama 1.1B Chat v1.0 imatrix
1.1B
Q4_K_M
2.3 GB VRAM
15.4 tok/s
Runs great
C
Usable
82
HelpingAI2 6B
6B
Q4_K_M
5.9 GB VRAM
84.0 tok/s
Needs offload
C
Usable
83
HelpingAI2.5 5B i1
5B
Q4_K_M
5.1 GB VRAM
70.0 tok/s
Tight fit
C
Usable
84
HelpingAI 3B hindi i1
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
85
HelpingAI2 6B i1
6B
Q4_K_M
5.9 GB VRAM
84.0 tok/s
Needs offload
C
Usable
86
AI21 Jamba2 3B
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
87
HelpingAI 3B hindi
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
88
AI21 Jamba2 3B i1
3B
Q4_K_M
3.7 GB VRAM
42.0 tok/s
Runs great
C
Usable
89
StarCoder2 7B
7B
Q4_K_M
6.3 GB VRAM
63.2 tok/s
Needs offload
C
Usable
90
StarCoder2 3B
3B
Q4_K_M
3.8 GB VRAM
42.0 tok/s
Runs great
C
Usable