Local LLMs and Their Max Context Windows: A Reference Table
Updated: October 17, 2025
Context window comparison of popular local LLMs: Phi 4, DeepSeek R1, Qwen3 30B A3B, and Llama 4 Scout.
When choosing a local LLM, one of the first specifications to check is its context window. The context window sets the maximum number of tokens, prompt plus generated output, that the model can work with at once, which directly affects practical use cases like long-form reasoning, document analysis, and multi-turn conversations. For hardware enthusiasts running quantized models on limited VRAM, knowing the context size helps plan memory usage (the KV cache grows with every token held in context), decide whether to split workloads across multiple GPUs, and balance performance per dollar.
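To make the memory-planning point concrete, here is a minimal sketch of how KV-cache size scales with context length. It assumes standard multi-head or grouped-query attention with an unquantized 16-bit cache; the function name and the architecture values (layer count, KV heads, head dimension) are hypothetical and chosen only for illustration, so read the real values from your model's config. Models that compress the cache or use other attention variants will not follow this formula exactly.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Approximate KV-cache size in bytes for a single sequence.

    Assumes standard attention: each layer stores one key tensor and one
    value tensor of shape (context_tokens, n_kv_heads, head_dim).
    """
    # Factor 2 covers keys and values; bytes_per_value=2 assumes fp16/bf16.
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical architecture values, for illustration only.
size = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_tokens=128_000)
print(f"{size / 1024**3:.1f} GiB")  # prints "15.6 GiB"
```

The cache size is linear in `context_tokens`, so under the same assumptions halving the context you actually use roughly halves this part of the memory budget. Model weights and activations come on top of it.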
The table below lists popular LLMs used in the local inference community, along with their parameter counts and maximum context windows.
Popular local LLMs and their max context size:
| Model name | Parameters (billions) | Max context window (thousands of tokens) |
|---|---|---|
| DeepSeek R1 | 671 | 128 |
| DeepSeek R1 Zero | 671 | 128 |
| DeepSeek V3 | 671 | 128 |
| DeepSeek V3.1 | 671 | 128 |
| gemma 3 270m | 0.3 | 32 |
| gemma 3 1b | 1 | 32 |
| gemma 3 4b | 4 | 128 |
| gemma 3 12b | 12 | 128 |
| gemma 3 27b | 27 | 128 |
| gemma 3n E2B | 6 | 32 |
| gemma 3n E4B | 8 | 128 |
| GLM 4 32B | 32 | 128 |
| GLM 4 9B | 9 | 128 |
| GLM 4.1V 9B | 9 | 128 |
| GLM 4.5 | 355 | 128 |
| GLM 4.5 Air | 106 | 128 |
| GLM Z1 32B | 32 | 128 |
| GLM Z1 9B | 9 | 128 |
| gpt oss 20b | 20 | 128 |
| gpt oss 120b | 120 | 128 |
| Kimi Dev | 72 | 128 |
| Kimi K2 | 1000 | 256 |
| Llama 3.2 1B | 1 | 128 |
| Llama 3.2 3B | 3 | 128 |
| Llama 3.2 11B | 11 | 128 |
| Llama 3.2 90B | 90 | 128 |
| Llama 3.3 70B | 70 | 128 |
| Llama 4 Scout 17B 16E | 109 | 10000 |
| Llama 4 Maverick 17B 128E | 400 | 1000 |
| Meta Llama 3.1 8B | 8 | 128 |
| Meta Llama 3.1 70B | 71 | 128 |
| Meta Llama 3.1 405B | 405 | 128 |
| Mistral Small 24B | 24 | 32 |
| Mistral Small 3.1 24B | 24 | 128 |
| Mistral Small 3.2 24B | 24 | 128 |
| Mistral Large 123B | 123 | 128 |
| Phi 4 | 14 | 32 |
| Phi 4 reasoning | 14 | 32 |
| Phi 4 reasoning plus | 14 | 64 |
| Phi 4 mini | 4 | 128 |
| QVQ 72B Preview | 73 | 32 |
| QwQ 32B | 32 | 128 |
| Qwen2.5 0.5B | 0.5 | 128 |
| Qwen2.5 1.5B | 1.5 | 128 |
| Qwen2.5 3B | 3 | 128 |
| Qwen2.5 7B | 8 | 128 |
| Qwen2.5 14B | 14 | 128 |
| Qwen2.5 Coder 0.5B | 0.5 | 128 |
| Qwen2.5 Coder 1.5B | 1.5 | 128 |
| Qwen2.5 Coder 3B | 3 | 128 |
| Qwen2.5 Coder 7B | 7 | 128 |
| Qwen2.5 Coder 14B | 14 | 128 |
| Qwen2.5 Coder 32B | 32 | 128 |
| Qwen2.5 VL 3B | 3 | 128 |
| Qwen2.5 VL 7B | 7 | 128 |
| Qwen2.5 VL 32B | 32 | 128 |
| Qwen2.5 VL 72B | 72 | 128 |
| Qwen3 0.6B | 0.6 | 32 |
| Qwen3 1.7B | 1.7 | 32 |
| Qwen3 4B | 4 | 32 |
| Qwen3 8B | 8 | 128 |
| Qwen3 14B | 14 | 128 |
| Qwen3 32B | 32 | 128 |
| Qwen3 30B A3B | 30 | 256 |
| Qwen3 235B A22B | 235 | 256 |
| Qwen3 Coder 30B A3B | 30 | 256 |
| Qwen3 Coder 480B A35B | 480 | 256 |
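
As a rough illustration of how the table can be used, the sketch below checks whether a prompt plus the planned output fits in a model's window. The dictionary copies a few rows from the table; the function name `fits_in_context` is hypothetical, and treating "thousand" as exactly 1,000 tokens is an assumption (many models define 128K as 131,072 tokens, so 1,000 is the conservative reading).

```python
# Context windows in thousands of tokens, copied from a few table rows.
CONTEXT_WINDOW_K = {
    "Mistral Small 24B": 32,
    "DeepSeek R1": 128,
    "Qwen3 30B A3B": 256,
    "Llama 4 Scout 17B 16E": 10000,
}


def fits_in_context(model, prompt_tokens, max_new_tokens, tokens_per_k=1000):
    """Return True if the prompt plus the planned output fits in the window.

    tokens_per_k=1000 is an assumption; pass 1024 if your model counts that way.
    """
    window = CONTEXT_WINDOW_K[model] * tokens_per_k
    # Prompt and generated tokens share the same window.
    return prompt_tokens + max_new_tokens <= window


print(fits_in_context("Mistral Small 24B", 30_000, 4_000))  # False: 34,000 > 32,000
print(fits_in_context("Qwen3 30B A3B", 30_000, 4_000))      # True: 34,000 <= 256,000
```

Token counts depend on each model's tokenizer, so count tokens with the tokenizer of the model you plan to run before relying on a check like this.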
