
URL: https://willitrunai.com/models/nemotron-70b

Nemotron 70B VRAM Requirements: GPU Compatibility


NVIDIA

Nemotron 70B

Current
HuggingFace, Ollama

Downloads: 103
Likes: 568
Released: Oct 2024
Context: 131K tokens
License: NVIDIA Open Model
Quality: 52 (Good)

Nemotron 70B (70B parameters) requires approximately 49.1 GB of VRAM with Q4_K_M quantization: 42.7 GB for the weights plus KV cache, runtime, and headroom. For the best balance of quality and speed, we recommend hardware with at least 57 GB of VRAM.

Get started

Copy-paste commands to run Nemotron 70B on your machine.

Run

ollama run nemotron
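
Once the model has been pulled, it can also be queried from a script. The sketch below assumes a local Ollama server listening on its default address (http://localhost:11434) and uses the `nemotron` tag from the command above. The helper names `build_request` and `generate` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "nemotron") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "nemotron", timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Explain KV caching in two sentences.")` returns the completion as a string. The generous timeout is a guess that allows for slow first-token latency on a 70B model.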

Quick specs

Parameters: 70B
Architecture: dense
Context: 131K tokens
Modality: text
Min RAM: 27.3 GB
Rec. RAM: 42.7 GB (Q4_K_M)
License: NVIDIA Open Model
Family: Nemotron
✓ Chat ✓ Reasoning

About this model

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM generated responses to user queries.

  • Please sign up to get free and immediate access to NVIDIA NeMo Framework container. If you don't have an NVIDIA NGC account, you will be...
  • If you don't have an NVIDIA NGC API key, sign into NVIDIA NGC, selecting organization/team: ea-bignlp/ga-participants and click Generate API key....
  • On your machine, docker login to nvcr.io using


Quick picks

Best budget (A): MacBook Pro M4 Max 96GB, ~$2,499, 17 tok/s
Best overall (A): NVIDIA H100 80GB, ~$40,000, 72 tok/s

Best hardware

Top picks for Nemotron 70B

  • NVIDIA H100 80GB (A): 80 GB
  • NVIDIA H800 80GB (A): 80 GB
  • NVIDIA GH200 96GB (A): 96 GB
  • NVIDIA H20 96GB (A): 96 GB
  • NVIDIA A100 80GB (A): 80 GB


Quantization options

VRAM estimates by quant level

| Quant  | Bits | VRAM     | Quality   |
|--------|------|----------|-----------|
| Q2_K   | 2    | 27.3 GB  | Low       |
| Q3_K_S | 3    | 34.3 GB  | Low       |
| NVFP4  | 4    | 39.2 GB  | Medium    |
| Q4_K_M | 4    | 42.7 GB  | Medium    |
| Q5_K_M | 5    | 50.4 GB  | High      |
| Q6_K   | 6    | 57.4 GB  | High      |
| Q8_0   | 8    | 74.9 GB  | Very High |
| F16    | 16   | 143.5 GB | Maximum   |
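
The estimates above can be turned into a quick check of which quant level a given GPU could hold. The sketch below is only an approximation. It assumes the listed figures are weight sizes (the Q4_K_M value matches the 42.7 GB weights figure in the memory breakdown) and that the 6.4 GB of KV cache, runtime, and headroom from that breakdown applies unchanged to every quant level. The function name `best_quant` is hypothetical.

```python
# VRAM estimates in GB, copied from the quantization table on this page.
QUANT_WEIGHTS_GB = {
    "Q2_K": 27.3, "Q3_K_S": 34.3, "NVFP4": 39.2, "Q4_K_M": 42.7,
    "Q5_K_M": 50.4, "Q6_K": 57.4, "Q8_0": 74.9, "F16": 143.5,
}

# Assumption: KV cache (4.9) + runtime (0.9) + headroom (0.6) from the
# page's Q4_K_M memory breakdown, reused as a flat overhead for all quants.
OVERHEAD_GB = 6.4


def best_quant(vram_gb, overhead_gb=OVERHEAD_GB):
    """Return the largest quant whose weights plus overhead fit, or None."""
    fitting = [
        (size, name)
        for name, size in QUANT_WEIGHTS_GB.items()
        if size + overhead_gb <= vram_gb
    ]
    return max(fitting)[1] if fitting else None


print(best_quant(80))  # 80 GB card such as the H100 listed above
print(best_quant(96))  # 96 GB card such as the GH200 listed above
print(best_quant(24))  # too small for any listed quant
```

Largest-that-fits is used here as a stand-in for best quality, which follows the ordering of the Quality column. Longer contexts need a larger KV cache, so the flat overhead will understate memory use in that case.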

Quality benchmarks

Nemotron 70B benchmark scores

Benchmark verified

Reasoning

MMLU-Pro: 85.2%
GPQA Diamond: 1.1%
MATH-500: 42.7%
ARC Challenge: n/a

General

Chatbot Arena: n/a
IFEval: 73.8%

Source: community, 2024-10-16


Memory breakdown

Reference: RTX 2060 6GB

Weights: 42.7 GB
KV Cache: 4.9 GB
Runtime: 0.9 GB
Headroom: 0.6 GB
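
As a worked check, the four components above sum to the 49.1 GB quoted for Q4_K_M at the top of the page. The short sketch below reproduces that sum and compares it against a card's VRAM. The helper names are illustrative only.

```python
# Memory breakdown in GB for Q4_K_M, copied from the figures above.
BREAKDOWN_GB = {"weights": 42.7, "kv_cache": 4.9, "runtime": 0.9, "headroom": 0.6}


def total_vram_gb(breakdown):
    """Sum the components, rounded to one decimal like the page's figures."""
    return round(sum(breakdown.values()), 1)


def fits(breakdown, gpu_vram_gb):
    """True if the whole breakdown fits in the given amount of VRAM."""
    return total_vram_gb(breakdown) <= gpu_vram_gb


print(total_vram_gb(BREAKDOWN_GB))  # 49.1
print(fits(BREAKDOWN_GB, 6))        # False: the 6 GB reference card
print(fits(BREAKDOWN_GB, 80))       # True: an 80 GB card
```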
