URL: https://willitrunai.com/models/internlm-chat-7b

InternLM Chat 7B VRAM Requirements: GPU Compatibility


InternLM Chat 7B

Legacy
Source: HuggingFace
Downloads: 64.0K
Likes: 101
Released: Jul 2023
Context: 8K tokens
License: Apache 2.0
Quality: 50 (Good)

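InternLM Chat 7B (7B parameters) requires approximately 13.9 GB of VRAM in total with Q4_K_M quantization: about 4.3 GB for the weights, with the remainder going to the KV cache, runtime overhead, and headroom. For the best balance of quality and speed, we recommend hardware with at least 16 GB of VRAM.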

Get started

Copy-paste commands to run InternLM Chat 7B on your machine.

Run

docker run --rm -it ghcr.io/ggerganov/llama.cpp:full \
  --hf-repo "InternLM/InternLM-Chat-7B" \
  --hf-file "InternLM-Chat-7B-Q4_K_M.gguf" \
  -c 4096 -ngl 99

Quick specs

Parameters: 7B
Architecture: dense
Context: 8K tokens
Modality: text
Min RAM: 2.7 GB
Rec. RAM: 4.3 GB (Q4_K_M)
License: Apache 2.0
Family: InternLM
Capabilities: Chat, Reasoning

About this model

InternLM has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

  • It leverages trillions of high-quality tokens for training to establish a powerful knowledge base
  • It supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities
  • It provides a versatile toolset for users to flexibly build their own workflows

Quick picks

Best budget (rated A): RX 7600 XT 16GB, ~$329, 39 tok/s
Best overall (rated A): RX 7900 XT 20GB, ~$899, 98 tok/s

Best hardware

Top picks for InternLM Chat 7B

RX 7900 XT 20GB: rated A, 20 GB
RTX A4500 20GB: rated A, 20 GB
RTX 3090 24GB: rated A, 24 GB
RTX 3090 Ti 24GB: rated A, 24 GB
RTX 4090 24GB: rated A, 24 GB

Run this model

InternLM Chat 7B on RX 7900 XT 20GB
InternLM Chat 7B on RTX A4500 20GB
InternLM Chat 7B on RTX 3090 24GB

Quantization options

VRAM estimates by quant level

Quant     Bits   VRAM      Quality
Q2_K      2      2.7 GB    Low
Q3_K_S    3      3.4 GB    Low
NVFP4     4      3.9 GB    Medium
Q4_K_M    4      4.3 GB    Medium
Q5_K_M    5      5.0 GB    High
Q6_K      6      5.7 GB    High
Q8_0      8      7.5 GB    Very High
F16       16     14.3 GB   Maximum
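
As a rough illustration of how these estimates can be used, the sketch below picks the largest quantization level whose VRAM estimate fits a given budget. The figures are copied from the list above; the function name best_quant and the overhead_gb parameter are hypothetical, and the sketch assumes the per-quant figures cover the weights only (the Q4_K_M value matches the 4.3 GB weights figure in the memory breakdown), so KV cache and runtime memory have to be reserved separately.

```python
# VRAM estimates in GB per quantization level, copied from the list above.
QUANTS = [
    ("Q2_K", 2, 2.7, "Low"),
    ("Q3_K_S", 3, 3.4, "Low"),
    ("NVFP4", 4, 3.9, "Medium"),
    ("Q4_K_M", 4, 4.3, "Medium"),
    ("Q5_K_M", 5, 5.0, "High"),
    ("Q6_K", 6, 5.7, "High"),
    ("Q8_0", 8, 7.5, "Very High"),
    ("F16", 16, 14.3, "Maximum"),
]


def best_quant(budget_gb, overhead_gb=0.0):
    """Return the name of the largest quant that fits, or None if none fits.

    overhead_gb is a hypothetical knob for memory reserved outside the
    weights (KV cache, runtime, headroom); it is an assumption of this
    sketch, not something the page defines.
    """
    usable = budget_gb - overhead_gb
    fitting = [q for q in QUANTS if q[2] <= usable]
    if not fitting:
        return None
    # Larger estimates correspond to higher quality tiers in the list above.
    return max(fitting, key=lambda q: q[2])[0]


print(best_quant(6.0))
print(best_quant(16.0, overhead_gb=9.6))
```

The second call reserves 9.6 GB, which is the sum of the KV cache, runtime, and headroom figures from the memory breakdown; treat it as an example budget rather than a measured requirement.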

Memory breakdown

Reference: RTX 2060 6GB

Weights: 4.3 GB
KV Cache: 7.8 GB
Runtime: 1.2 GB
Headroom: 0.6 GB
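
The four components above add up to the roughly 13.9 GB quoted for Q4_K_M. A minimal sketch of that arithmetic follows; the names BREAKDOWN_GB, total_vram_gb, and fits are illustrative and not part of any tool referenced on this page.

```python
# Memory components in GB, copied from the breakdown above.
BREAKDOWN_GB = {
    "weights": 4.3,
    "kv_cache": 7.8,
    "runtime": 1.2,
    "headroom": 0.6,
}


def total_vram_gb(breakdown):
    """Sum the components, rounded to one decimal like the page's figures."""
    return round(sum(breakdown.values()), 1)


def fits(breakdown, gpu_vram_gb):
    """True if the summed estimate does not exceed the GPU's VRAM."""
    return total_vram_gb(breakdown) <= gpu_vram_gb


print(total_vram_gb(BREAKDOWN_GB))  # 13.9
print(fits(BREAKDOWN_GB, 16))       # True
print(fits(BREAKDOWN_GB, 6))        # False
```

By this estimate the configuration fits within the recommended 16 GB but not on a 6 GB card such as the reference RTX 2060.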

See also

Quantization Guide
Scoring Methodology
VRAM Calculator