URL: https://willitrunai.com/models/mpt-30b-instruct

MPT-30B-Instruct VRAM Requirements: GPU Compatibility


MosaicML

MPT-30B-Instruct

Legacy
HuggingFace
Released: May 2023
Context: 8K tokens
License: Apache 2.0
Quality: 50 (Good)

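MPT-30B-Instruct (30B parameters) needs approximately 46.8 GB of VRAM in total with Q5_K_M quantization: 21.6 GB for the weights, with the remainder going to the KV cache, runtime overhead, and headroom (see Memory breakdown). For the best balance of quality and speed, we recommend hardware with at least 54 GB of VRAM.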

Get started

Copy-paste commands to run MPT-30B-Instruct on your machine.

Run

docker run --rm -it ghcr.io/ggerganov/llama.cpp:full \
  --hf-repo "mosaicml/mpt-30b-instruct" \
  --hf-file "mpt-30b-instruct-Q5_K_M.gguf" \
  -c 4096 -ngl 99

Quick specs

Parameters: 30B
Architecture: dense
Context: 8K tokens
Modality: text
Min RAM: 11.7 GB
Rec. RAM: 21.6 GB (Q5_K_M)
License: Apache 2.0
Family: MPT
Capabilities: ✓ Chat, ✓ Reasoning

About this model

MPT-30B-Instruct is MosaicML's 30B-parameter instruction-tuned model, aimed at reasoning and text generation. It supports an 8K-token context using ALiBi (Attention with Linear Biases) positional encoding and includes inference optimizations.
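To make the ALiBi mention concrete, here is a minimal sketch of how ALiBi biases are usually computed: each attention head gets a fixed slope, and the attention score for a query/key pair is penalized in proportion to their distance. The function names `alibi_slopes` and `alibi_bias` are illustrative, not taken from MosaicML's code. The sketch assumes a power-of-two head count and folds the causal mask into the same matrix.

```python
def alibi_slopes(n_heads):
    """Per-head slopes 2^(-8*k/n_heads) for k = 1..n_heads (power-of-two head counts only)."""
    if n_heads <= 0 or n_heads & (n_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * k / n_heads) for k in range(1, n_heads + 1)]


def alibi_bias(slope, seq_len):
    """Bias added to one head's attention scores.

    Keys at or before the query get -slope * distance; later keys are
    masked with -inf (assumption: causal attention, mask folded in here).
    """
    return [
        [-slope * (q - k) if k <= q else float("-inf") for k in range(seq_len)]
        for q in range(seq_len)
    ]


# Small illustrative sizes, not the real model dimensions.
slopes = alibi_slopes(8)
bias = alibi_bias(slopes[0], 4)
```

Because the bias depends only on relative distance, no learned position embeddings are needed, which is what lets ALiBi models handle longer contexts than they were trained on.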

Quick picks

Best budget (grade A): MacBook Pro M4 Max 96GB, ~$2,499, 28 tok/s
Best overall (grade A): NVIDIA H100 80GB, ~$40,000, 133 tok/s

Best hardware

Top picks for MPT-30B-Instruct

NVIDIA H100 80GB (grade A): 80 GB
NVIDIA H800 80GB (grade A): 80 GB
NVIDIA A100 80GB (grade A): 80 GB
NVIDIA H100 PCIe 80GB (grade A): 80 GB
NVIDIA A800 80GB (grade A): 80 GB

Run this model

MPT-30B-Instruct on NVIDIA H100 80GB
MPT-30B-Instruct on NVIDIA H800 80GB
MPT-30B-Instruct on NVIDIA A100 80GB

Quantization options

VRAM estimates by quant level

Quant     Bits   VRAM      Quality
Q2_K      2      11.7 GB   Low
Q3_K_S    3      14.7 GB   Low
NVFP4     4      16.8 GB   Medium
Q4_K_M    4      18.3 GB   Medium
Q5_K_M    5      21.6 GB   High
Q6_K      6      24.6 GB   High
Q8_0      8      32.1 GB   Very High
F16       16     61.5 GB   Maximum
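As a rough illustration of how these estimates can be used, the sketch below picks the largest quantization level whose weight footprint fits a given VRAM budget. The figures are copied from the table above. The helper name `best_quant_for` and the `reserve_gb` parameter are hypothetical, and treating "largest that fits" as "best quality" is a simplifying assumption. The table values cover weights only, so KV cache and runtime overhead have to be reserved separately.

```python
# (name, bits, weights VRAM in GB, quality label), copied from the table above.
QUANTS = [
    ("Q2_K", 2, 11.7, "Low"),
    ("Q3_K_S", 3, 14.7, "Low"),
    ("NVFP4", 4, 16.8, "Medium"),
    ("Q4_K_M", 4, 18.3, "Medium"),
    ("Q5_K_M", 5, 21.6, "High"),
    ("Q6_K", 6, 24.6, "High"),
    ("Q8_0", 8, 32.1, "Very High"),
    ("F16", 16, 61.5, "Maximum"),
]


def best_quant_for(vram_gb, reserve_gb=0.0):
    """Return the name of the largest quant whose weights fit, or None.

    reserve_gb is a hypothetical allowance for KV cache and runtime
    overhead, subtracted from the budget before comparing.
    """
    budget = vram_gb - reserve_gb
    fitting = [q for q in QUANTS if q[2] <= budget]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[2])[0]


choice_24gb = best_quant_for(24)
choice_24gb_reserved = best_quant_for(24, reserve_gb=5)
```

Reserving memory matters in practice: at long context lengths the KV cache can be as large as the weights themselves, so a quant that fits on paper may still run out of memory.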

Memory breakdown

Reference: RTX 2060 6GB

Weights: 21.6 GB
KV Cache: 23.4 GB
Runtime: 1.2 GB
Headroom: 0.6 GB
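The four components above add up to the total VRAM figure quoted for Q5_K_M. A minimal sketch of that arithmetic, with a simple fit check against a card's memory (the `fits_in` helper is illustrative, not part of any tool):

```python
# Memory components in GB for Q5_K_M, taken from the breakdown above.
breakdown_gb = {
    "weights": 21.6,
    "kv_cache": 23.4,
    "runtime": 1.2,
    "headroom": 0.6,
}

# Round to one decimal to avoid floating-point noise in the sum.
total_gb = round(sum(breakdown_gb.values()), 1)


def fits_in(vram_gb):
    """True if the full estimated footprint fits in the given VRAM."""
    return total_gb <= vram_gb


print(f"Total: {total_gb} GB")
print(f"Fits in 6 GB: {fits_in(6)}")
print(f"Fits in 80 GB: {fits_in(80)}")
```

The total falls well outside the 6 GB of the reference RTX 2060 and within the 80 GB cards listed under Best hardware.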

See also

Quantization Guide
Scoring Methodology
VRAM Calculator