URL: https://willitrunai.com/browse/best-for/4gb
Best AI Models for 4GB VRAM - Local LLMs | WillItRunAI
Each entry lists, in order: rank, model, parameter count, quantization, estimated VRAM, throughput (tok/s), fit on a 4 GB card, grade, and rating.
1
BGE M3
0.568B
F16
3.6 GB VRAM
8.0 tok/s
Tight fit
A
Great
2
mxbai Embed Large
0.335B
F16
3.5 GB VRAM
4.7 tok/s
Tight fit
A
Great
3
Snowflake Arctic Embed L
0.335B
F16
3.5 GB VRAM
4.7 tok/s
Tight fit
A
Great
4
Nomic Embed Text v1.5
0.137B
F16
2.1 GB VRAM
2.0 tok/s
Runs great
A
Great
5
BGE Large EN v1.5
0.335B
F16
3.5 GB VRAM
4.7 tok/s
Tight fit
A
Great
6
All MiniLM L6 v2
0.023B
F16
1.6 GB VRAM
2.0 tok/s
Runs great
B
Good
7
Ministral 3 3B
3B
Q4_K_M
3.9 GB VRAM
42.0 tok/s
Needs offload
A
Great
8
Qwen 3.5 2B
2B
Q4_K_M
4.2 GB VRAM
28.0 tok/s
Needs offload
B
Good
9
Qwen 3 1.7B
1.7B
Q4_K_M
4.0 GB VRAM
23.8 tok/s
Needs offload
B
Good
10
Qwen 2.5 Coder 1.5B
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
B
Good
11
TinyLlama 1.1B
1.1B
Q4_K_M
2.3 GB VRAM
15.4 tok/s
Runs great
B
Good
12
DeepSeek R1 1.5B
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
B
Good
13
Qwen 2.5 Coder 0.5B
0.5B
Q4_K_M
1.8 GB VRAM
7.0 tok/s
Runs great
C
Usable
14
Qwen 2.5 1.5B
1.5B
Q4_K_M
2.6 GB VRAM
21.0 tok/s
Runs great
B
Good
15
Gemma 3 1B
1B
Q4_K_M
2.3 GB VRAM
14.0 tok/s
Runs great
B
Good
16
Qwen 3 0.6B
0.6B
Q4_K_M
2.5 GB VRAM
8.4 tok/s
Runs great
C
Usable
17
Gemma 2 2B
2B
Q4_K_M
4.1 GB VRAM
28.0 tok/s
Needs offload
C
Usable
18
Llama 3.2 1B
1B
Q4_K_M
2.4 GB VRAM
14.0 tok/s
Runs great
C
Usable
19
Qwen 3.5 0.6B
0.6B
Q4_K_M
2.5 GB VRAM
8.4 tok/s
Runs great
C
Usable
20
Qwen 2.5 0.5B
0.5B
Q4_K_M
1.8 GB VRAM
7.0 tok/s
Runs great
C
Usable
21
gemma 2b
2B
Q4_K_M
2.8 GB VRAM
28.0 tok/s
Runs great
C
Usable
22
gemma 2 2b it
2B
Q6_K
3.2 GB VRAM
28.0 tok/s
Runs great
C
Usable
23
Llama 3.2 3B Instruct
3B
Q5_K_M
3.8 GB VRAM
42.0 tok/s
Needs offload
C
Usable
24
Qwen3.5 4B
4B
Q4_K_M
4.2 GB VRAM
56.0 tok/s
Needs offload
C
Usable
25
Llama 3.2 1B Instruct Q8 0
1B
Q6_K
2.2 GB VRAM
14.0 tok/s
Runs great
C
Usable
26
Qwen2.5 3B Instruct
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
27
Qwen2.5 1.5B Instruct
1.5B
Q4_K_M
2.4 GB VRAM
21.0 tok/s
Runs great
C
Usable
28
SmolVLM 500M Instruct
0.5B
Q6_K
1.8 GB VRAM
7.0 tok/s
Runs great
C
Usable
29
TinyLlama 1.1B Chat v1.0
1.1B
Q4_K_M
2.1 GB VRAM
15.4 tok/s
Runs great
C
Usable
30
Gemmasutra Mini 2B v1
2B
Q4_K_M
2.8 GB VRAM
28.0 tok/s
Runs great
C
Usable
31
gemma 3 4b it
4B
Q4_K_M
4.2 GB VRAM
56.0 tok/s
Needs offload
C
Usable
32
embeddinggemma 300M
0.3B
Q6_K
1.6 GB VRAM
4.2 tok/s
Runs great
C
Usable
33
DeepSeek R1 Distill Qwen 1.5B
1.5B
Q4_K_M
2.4 GB VRAM
21.0 tok/s
Runs great
C
Usable
34
gemma 3 4b it
4B
Q4_K_M
4.2 GB VRAM
56.0 tok/s
Needs offload
C
Usable
35
Yi Coder 1.5B Chat
1.5B
Q4_K_M
2.4 GB VRAM
21.0 tok/s
Runs great
C
Usable
36
Llama 3.2 1B Instruct
1B
Q4_K_M
2.0 GB VRAM
14.0 tok/s
Runs great
C
Usable
37
gemma 2 2b it
2B
Q4_K_M
2.8 GB VRAM
28.0 tok/s
Runs great
C
Usable
38
Llama 3.2 3B Instruct
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
39
gemma 3 1b it
1B
Q4_K_M
2.0 GB VRAM
14.0 tok/s
Runs great
C
Usable
40
Ministral 3 3B Instruct 2512
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
41
TinyLlama 1.1B Chat v0.3
1.1B
Q4_K_M
2.1 GB VRAM
15.4 tok/s
Runs great
C
Usable
42
TinyLlama 1.1B Chat v0.6
1.1B
Q4_K_M
2.1 GB VRAM
15.4 tok/s
Runs great
C
Usable
43
HELVETE 3B
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
44
granite embedding 107m multilingual
0.107B
Q4_K_M
1.5 GB VRAM
2.0 tok/s
Runs great
D
Poor
45
Hermes 3 Llama 3.2 3B
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
46
stablelm zephyr 3b
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
47
EXAONE 4.0 1.2B
1.2B
Q4_K_M
2.2 GB VRAM
16.8 tok/s
Runs great
C
Usable
48
StarCoder2 3B
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
49
EXAONE 3.5 2.4B Instruct
2.4B
Q4_K_M
3.0 GB VRAM
33.6 tok/s
Runs great
C
Usable
50
Yi Coder 1.5B
1.5B
Q4_K_M
2.4 GB VRAM
21.0 tok/s
Runs great
C
Usable
51
Falcon H1 Tiny 90M Instruct
0.09B
Q4_K_M
1.5 GB VRAM
2.0 tok/s
Runs great
D
Poor
52
AI21 Jamba Reasoning 3B
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
53
Falcon3 1B Instruct abliterated
1B
Q4_K_M
2.0 GB VRAM
14.0 tok/s
Runs great
C
Usable
54
stablelm 3b 4e1t
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
55
ai21labs AI21 Jamba Reasoning 3B
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
56
stablelm 2 zephyr 1.6b
1.6B
Q4_K_M
2.5 GB VRAM
22.4 tok/s
Runs great
C
Usable
57
Falcon H1 1.5B Instruct
1.5B
Q4_K_M
2.4 GB VRAM
21.0 tok/s
Runs great
C
Usable
58
logos16v2 stablelm2 1.6b i1
1.6B
Q4_K_M
2.5 GB VRAM
22.4 tok/s
Runs great
C
Usable
59
ai21labs AI21 Jamba2 3B
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
60
TinyLlama 1.1B Chat v1.0 imatrix
1.1B
Q4_K_M
2.1 GB VRAM
15.4 tok/s
Runs great
C
Usable
61
HelpingAI 3B hindi i1
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
62
AI21 Jamba2 3B
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
63
HelpingAI 3B hindi
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
64
AI21 Jamba2 3B i1
3B
Q4_K_M
3.5 GB VRAM
42.0 tok/s
Tight fit
C
Usable
65
StarCoder2 3B
3B
Q4_K_M
3.6 GB VRAM
42.0 tok/s
Tight fit
C
Usable