NVIDIA
Nemotron Cascade 2 30B A3B
Nemotron Cascade 2 30B A3B (30B parameters) requires approximately 23.0 GB of VRAM with Q4_K_M quantization. As a Mixture of Experts model, it activates only 3B parameters per token, which lowers per-token compute, but all 30B weights still have to fit in memory (the F16 estimate of 61.5 GB corresponds to roughly 2 bytes per parameter). For the best balance of quality and speed, we recommend hardware with at least 27 GB of VRAM.
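As a rough illustration of how parameter count and bit width translate into weight memory, here is a minimal sketch. It assumes 1 GB = 10^9 bytes and counts weights only; the function name `weight_memory_gb` is hypothetical, and real quantized files carry extra metadata and runtime overhead (KV cache, activations) that this does not model, so the page's own estimates come out higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring all overhead."""
    # params (in billions) * bits per weight / 8 bits per byte = GB
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    # All 30B parameters are stored, even though only 3B are active per token.
    for bits in (2, 4, 8, 16):
        print(f"{bits:>2}-bit: ~{weight_memory_gb(30, bits):.1f} GB of weights")
```

The 16-bit case gives 60 GB for weights alone, which is in line with the 61.5 GB F16 estimate listed on this page.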
Get started
Copy-paste commands to run Nemotron Cascade 2 30B A3B on your machine.
Run
ollama run nemotron-cascade-2
Quick specs
About this model
- MoE: 30B total / 3B active, runs on a single RTX 4090
- Mamba-2 + Transformer hybrid architecture
- Gold medal at IMO 2025, IOI 2025, ICPC 2025
- 92.4% on AIME 2025, 87.2% on LiveCodeBench v6
- 262K context window
- Thinking + instruct dual mode
Quantization options
VRAM estimates by quant level
| Quant | Bits | VRAM | Quality |
|---|---|---|---|
| Q2_K | 2 | 11.7 GB | Low |
| Q3_K_S | 3 | 14.7 GB | Low |
| NVFP4 | 4 | 16.8 GB | Medium |
| Q4_K_M | 4 | 18.3 GB | Medium |
| Q5_K_M | 5 | 21.6 GB | High |
| Q6_K | 6 | 24.6 GB | High |
| Q8_0 | 8 | 32.1 GB | Very High |
| F16 | 16 | 61.5 GB | Maximum |
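To make the table easier to apply, the following sketch picks the largest quantization whose estimate fits a given VRAM budget. The VRAM figures are copied from the table above; the function name `best_quant_for` and the `headroom_gb` parameter are hypothetical, and the headroom value is an assumption you would tune for context length and runtime overhead, not a number taken from this page.

```python
from typing import Optional

# Raw VRAM estimates in GB, copied from the quantization table on this page.
QUANT_VRAM_GB = [
    ("Q2_K", 11.7),
    ("Q3_K_S", 14.7),
    ("NVFP4", 16.8),
    ("Q4_K_M", 18.3),
    ("Q5_K_M", 21.6),
    ("Q6_K", 24.6),
    ("Q8_0", 32.1),
    ("F16", 61.5),
]


def best_quant_for(vram_gb: float, headroom_gb: float = 0.0) -> Optional[str]:
    """Return the largest quant whose estimate fits the budget, or None."""
    # headroom_gb is an assumed safety margin, not a figure from the page.
    budget = vram_gb - headroom_gb
    fitting = [(name, need) for name, need in QUANT_VRAM_GB if need <= budget]
    if not fitting:
        return None
    # Larger estimated footprint corresponds to higher quality in the table.
    return max(fitting, key=lambda item: item[1])[0]


if __name__ == "__main__":
    for card_gb in (6, 16, 24, 48):
        print(f"{card_gb} GB -> {best_quant_for(card_gb)}")
```

Because these are raw estimates, a quant that only just fits may still run out of memory with long contexts, so leaving some headroom is a reasonable default.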
Quality benchmarks
Nemotron Cascade 2 30B A3B benchmark scores
Categories: Coding, Reasoning
Source: official · 2026-03-19
Memory breakdown
Reference: RTX 2060 6GB
