Nemotron Nano 9B v2

Frontier

HuggingFace · Ollama

Released: Jun 2025 · Context: 131K tokens · License: NVIDIA Open Model · Quality: 70 (Good)

Nemotron Nano 9B v2 (9B parameters) requires approximately 5.5 GB of VRAM with Q4_K_M quantization, per the estimates in the quantization table. For the best balance of quality and speed, we recommend hardware with at least 12 GB of VRAM.

Get started

Copy-paste commands to run Nemotron Nano 9B v2 on your machine.

Run

ollama run nemotron-nano:9b-v2
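
Once the model has been pulled, it can also be queried from a script instead of the interactive prompt. The following is a minimal sketch, assuming an Ollama server is running on its default local address (`http://localhost:11434`) and that the model tag matches the run command above. The helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address and port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag copied from the run command above.
MODEL_TAG = "nemotron-nano:9b-v2"


def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": MODEL_TAG, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text.

    Requires a running Ollama server with the model already pulled.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize what Q4_K_M quantization means.")` would return the model's reply as a string. Setting `"stream": False` keeps the sketch simple by returning a single JSON object rather than a stream of chunks.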

Quick specs

Parameters: 9B
Architecture: dense
Context: 131K tokens
Modality: text
Min RAM: 3.5 GB
Rec. RAM: 5.5 GB (Q4_K_M)
License: NVIDIA Open Model
Family: Nemotron

✓ Code ✓ Chat ✓ Reasoning

About this model

Nemotron Nano 9B v2 is an updated version of NVIDIA's compact reasoning model with improved instruction following, coding, and math capabilities.

• Improved reasoning and coding over v1
• Switchable thinking mode for detailed step-by-step reasoning
• Fits comfortably on 8 GB VRAM GPUs at Q4_K_M

Quick picks

Best budget (A): Intel Arc B580 12GB, ~$249, 43 tok/s

Best overall (S): NVIDIA RTX 4070 Ti Super 16GB, ~$799, 105 tok/s

Best hardware

Top picks for Nemotron Nano 9B v2

• NVIDIA RTX 4070 Ti Super 16GB (S), 16 GB
• NVIDIA RTX 4080 Super 16GB (S), 16 GB
• NVIDIA RTX 5070 Ti 16GB (S), 16 GB
• NVIDIA RTX 5080 16GB (S), 16 GB
• NVIDIA RTX 5080 Laptop 16GB (S), 16 GB

Run this model

• Nemotron Nano 9B v2 on RTX 4070 Ti Super 16GB
• Nemotron Nano 9B v2 on RTX 4080 Super 16GB
• Nemotron Nano 9B v2 on RTX 5070 Ti 16GB

Quantization options

VRAM estimates by quant level

Quant	Bits	VRAM	Quality
Q2_K	2	3.5 GB	Low
Q3_K_S	3	4.4 GB	Low
NVFP4	4	5.0 GB	Medium
Q4_K_M	4	5.5 GB	Medium
Q5_K_M	5	6.5 GB	High
Q6_K	6	7.4 GB	High
Q8_0	8	9.6 GB	Very High
F16	16	18.5 GB	Maximum
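
To check which quant level fits a given GPU, the VRAM estimates in the table can be compared against the card's memory. The sketch below copies the table values and picks the largest quant that fits. The 1 GB headroom default is an illustrative assumption, not a figure from this page, and the function name `largest_fitting_quant` is hypothetical. Real memory use also depends on context length and runtime overhead.

```python
from typing import Optional

# VRAM estimates in GB, copied from the quantization table above.
QUANT_VRAM_GB = {
    "Q2_K": 3.5,
    "Q3_K_S": 4.4,
    "NVFP4": 5.0,
    "Q4_K_M": 5.5,
    "Q5_K_M": 6.5,
    "Q6_K": 7.4,
    "Q8_0": 9.6,
    "F16": 18.5,
}


def largest_fitting_quant(gpu_vram_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the quant with the largest VRAM estimate that still fits.

    headroom_gb is an assumed safety margin, not a measured value.
    Returns None when no quant level fits the remaining budget.
    """
    budget = gpu_vram_gb - headroom_gb
    fitting = {name: gb for name, gb in QUANT_VRAM_GB.items() if gb <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


for vram in (8, 12, 16, 24):
    print(f"{vram} GB GPU -> {largest_fitting_quant(vram)}")
```

Under these assumptions an 8 GB card selects Q5_K_M, 12 GB and 16 GB cards select Q8_0, and a 24 GB card has room for F16.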

Quality benchmarks