
URL: https://willitrunai.com/models/tinyllama-1.1b

TinyLlama 1.1B VRAM Requirements: GPU Compatibility

TinyLlama 1.1B

Legacy
Available on: HuggingFace, Ollama
Downloads: 2.2M
Likes: 1.6K
Released: Dec 2023
Context: 4K tokens
License: Apache 2.0
Quality: 30 (Basic)

TinyLlama 1.1B, a 1.1-billion-parameter model, requires approximately 2.8 GB of VRAM with Q4_K_M quantization once weights, KV cache, runtime overhead, and headroom are counted. For the best balance of quality and speed, we recommend hardware with at least 4 GB of VRAM.

Get started

Copy-paste commands to run TinyLlama 1.1B on your machine.

Run

ollama run tinyllama
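
Once the model is running under Ollama, it can also be queried from a script. The sketch below assumes that Ollama's local HTTP server is listening on its default address (http://localhost:11434) and that the tinyllama model has already been pulled. The helper names build_generate_request and generate are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint for one-shot completions.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "tinyllama") -> dict:
    """Build the JSON body for a single, non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, model: str = "tinyllama", timeout: float = 60.0) -> str:
    """Send the prompt to a locally running Ollama server and return its reply text."""
    body = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server running, generate("Explain quantization in one sentence.") should return the model's answer as a string. Setting "stream" to False asks for one JSON object rather than a stream of partial responses, which keeps the parsing simple.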

Quick specs

Parameters: 1.1B
Architecture: dense
Context: 4K tokens
Modality: text
Min RAM: 0.4 GB
Rec. RAM: 0.7 GB (Q4_K_M)
License: Apache 2.0
Family: TinyLlama
Capabilities: Chat

About this model

The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. Training started on 2023-09-01.
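
To put the training target in perspective, the figures above imply a sustained throughput that is easy to work out. The sketch below uses only the numbers quoted in the project description and assumes uninterrupted training for the full 90 days; the function name is illustrative.

```python
TOTAL_TOKENS = 3e12          # 3 trillion tokens, from the project description
TRAINING_DAYS = 90           # target training duration
NUM_GPUS = 16                # A100-40G GPUs
SECONDS_PER_DAY = 24 * 60 * 60


def required_tokens_per_second(total_tokens: float, days: float, gpus: int = 1) -> float:
    """Average throughput needed per GPU to process total_tokens in the given time."""
    return total_tokens / (days * SECONDS_PER_DAY * gpus)


cluster_rate = required_tokens_per_second(TOTAL_TOKENS, TRAINING_DAYS)
per_gpu_rate = required_tokens_per_second(TOTAL_TOKENS, TRAINING_DAYS, NUM_GPUS)

print(f"Whole cluster: {cluster_rate:,.0f} tokens/s")
print(f"Per GPU:       {per_gpu_rate:,.0f} tokens/s")
```

Under these assumptions the cluster has to average roughly 386,000 tokens per second, or about 24,000 tokens per second on each GPU.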


Quick picks

Best budget (grade B): Intel Arc A380 6GB, ~$139, 15 tok/s
Best overall (grade B): GTX 1650 4GB, ~$149, 15 tok/s

Best hardware

Top picks for TinyLlama 1.1B

Hardware                 Grade  VRAM
GTX 1650 4GB             B      4 GB
RTX 3050 Ti Laptop 4GB   B      4 GB
Intel Arc A370M 4GB      B      4 GB
RTX 2060 6GB             B      6 GB
RTX 4050 Laptop 6GB      B      6 GB


Quantization options

VRAM estimates by quant level

Quant    Bits  VRAM    Quality
Q2_K     2     0.4 GB  Low
Q3_K_S   3     0.5 GB  Low
NVFP4    4     0.6 GB  Medium
Q4_K_M   4     0.7 GB  Medium
Q5_K_M   5     0.8 GB  High
Q6_K     6     0.9 GB  High
Q8_0     8     1.2 GB  Very High
F16      16    2.3 GB  Maximum
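
The table above can be used to pick a quantization level for a given memory budget. The following is a minimal sketch that copies the table's values and returns the largest quant that fits. The function name is illustrative. Note that the Q4_K_M figure matches the weights-only entry in the memory breakdown, so the budget passed in should be the memory left over after KV cache and runtime overhead.

```python
# (quant name, bits, estimated size in GB, quality label), copied from the table.
QUANTS = [
    ("Q2_K", 2, 0.4, "Low"),
    ("Q3_K_S", 3, 0.5, "Low"),
    ("NVFP4", 4, 0.6, "Medium"),
    ("Q4_K_M", 4, 0.7, "Medium"),
    ("Q5_K_M", 5, 0.8, "High"),
    ("Q6_K", 6, 0.9, "High"),
    ("Q8_0", 8, 1.2, "Very High"),
    ("F16", 16, 2.3, "Maximum"),
]


def best_quant_for_budget(budget_gb: float):
    """Return the largest quant whose size estimate fits the budget, or None."""
    fitting = [quant for quant in QUANTS if quant[2] <= budget_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda quant: quant[2])


for budget in (0.3, 1.0, 4.0):
    choice = best_quant_for_budget(budget)
    label = f"{choice[0]} ({choice[3]})" if choice else "nothing fits"
    print(f"{budget} GB budget: {label}")
```

Choosing the largest quant that fits is a simple heuristic; in practice a slightly smaller quant may be preferable if it leaves room for a longer context.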

Quality benchmarks

TinyLlama 1.1B benchmark scores

Benchmark verified

Reasoning

MMLU-Pro: 1.1%
GPQA Diamond: not reported
MATH-500: 1.5%
ARC Challenge: 33.9%

General

Chatbot Arena: not reported
IFEval: 6.0%

Source: community, 2024-01-08


Memory breakdown

Reference: RTX 2060 6GB

Weights: 0.7 GB
KV Cache: 0.3 GB
Runtime: 1.2 GB
Headroom: 0.6 GB
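
Adding up the four components gives the total memory the page budgets for this model, which matches the roughly 2.8 GB quoted for Q4_K_M. The sketch below reproduces that arithmetic and checks it against a card's VRAM; the function names are illustrative.

```python
# Memory breakdown in GB for the RTX 2060 6GB reference, copied from the list above.
BREAKDOWN_GB = {
    "weights": 0.7,
    "kv_cache": 0.3,
    "runtime": 1.2,
    "headroom": 0.6,
}


def total_vram_gb(breakdown: dict) -> float:
    """Sum the components, rounded to one decimal to avoid float noise."""
    return round(sum(breakdown.values()), 1)


def fits(breakdown: dict, gpu_vram_gb: float) -> bool:
    """True if the full breakdown fits within the GPU's VRAM."""
    return total_vram_gb(breakdown) <= gpu_vram_gb


total = total_vram_gb(BREAKDOWN_GB)
print(f"Total: {total} GB")
print(f"Fits in 6 GB: {fits(BREAKDOWN_GB, 6.0)}")
print(f"Fits in 4 GB: {fits(BREAKDOWN_GB, 4.0)}")
```

Both the 6 GB reference card and the 4 GB cards listed earlier clear the 2.8 GB total, which is consistent with the 4 GB recommendation.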
