VOOZH about

URL: https://willitrunai.com/models/olmo-2-32b

⇱ OLMo 2 32B VRAM Requirements β€” GPU Compatibility


πŸ‘ Allen AI
Allen AI

OLMo 2 32B

πŸ‘ huggingface
HuggingFace
5.1KDownloads147LikesMar 2025Released4K tokensContextApache 2.0License76 StrongQuality

OLMo 2 32B (32B parameters) requires approximately 25.2 GB of VRAM with Q4_K_M quantization. For the best balance of quality and speed, we recommend hardware with at least 29 GB of VRAM.

Get started

β€” copy & paste to run locally

Copy-paste commands to run OLMo 2 32B on your machine.

Run

lms load OLMo-2-0325-32B-Instruct && lms server start

Quick specs

Parameters32B
Architecturedense
Context4K tokens
Modalitytext
Min RAM12.5 GB
Rec. RAM19.5 GB (Q4_K_M)
LicenseApache 2.0
FamilyOLMo
βœ“ Chat

About this model

OLMo 2 32B is Allen AI's fully open 32B-parameter language model, the largest in the OLMo 2 family. Trained on 6T tokens from the Dolma dataset, post-trained with TΓΌlu 3 SFT, DPO, and RLVR. First fully open model to outperform GPT-3.5 and GPT-4o mini on academic benchmarks.

  • β€’First fully open model to outperform GPT-3.5 and GPT-4o mini
  • β€’Fully open: weights, data, code, and training recipes
  • β€’Post-trained with SFT, DPO, and Reinforcement Learning from Verifiable Rewards
  • β€’Trained on 6T tokens from the Dolma dataset

Related models

Your hardware

Detecting...

Quick picks

Best budgetA
Mac mini M4 64GB~$1,099 β€” 9 tok/s
πŸ‘ NVIDIA
Best overallS
NVIDIA A100 40GB~$10,000 β€” 72 tok/s

Best hardware

Top picks for OLMo 2 32B

NVIDIA A100 40GBS
40 GB
RTX PRO 5000 Blackwell 48GBS
48 GB
MacBook Pro M4 Max 64GBS
64 GB
RTX 5090 32GBS
32 GB
RTX 6000 Ada 48GBA
48 GB

Run this model

OLMo 2 32B on NVIDIA A100 40GBOLMo 2 32B on RTX PRO 5000 Blackwell 48GBOLMo 2 32B on MacBook Pro M4 Max 64GB

Quantization options

VRAM estimates by quant level

No hardware detected β€” fit column shows raw VRAM estimates

QuantBitsVRAMQualityFit
Q2_K
2
12.5 GB
Lowβ€”
Q3_K_S
3
15.7 GB
Lowβ€”
NVFP4
4
17.9 GB
Mediumβ€”
Q4_K_M
4
19.5 GB
Mediumβ€”
Q5_K_M
5
23.0 GB
Highβ€”
Q6_K
6
26.2 GB
Highβ€”
Q8_0
8
34.2 GB
Very Highβ€”
F16
16
65.6 GB
Maximumβ€”

Quality benchmarks

OLMo 2 32B benchmark scores

Benchmark verified

General

Chatbot Arenaβ€”
IFEval85.6%

Source: official Β· 2025-03-25

Hardware compatibility

Fit estimates across all hardware

Open calculator

Computing compatibility...

Memory breakdown

Reference: RTX 2060 6GB

Weights19.5 GB
KV Cache3.9 GB
Runtime1.2 GB
Headroom0.6 GB

Frequently asked questions

FAQ β€” OLMo 2 32B

See also

Quantization GuideScoring MethodologyVRAM Calculator