InternLM Chat 7B

Legacy

Downloads: 64.0K · Likes: 101 · Released: Jul 2023 · Context: 8K tokens · License: Apache 2.0 · Quality: 50 (Good)

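InternLM Chat 7B (7B parameters) needs approximately 13.9 GB of VRAM in total with Q4_K_M quantization: about 4.3 GB for the weights, with the rest going to the KV cache, runtime overhead, and headroom. For a good balance of quality and speed, we recommend hardware with at least 16 GB of VRAM.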

Get started

Copy-paste commands to run InternLM Chat 7B on your machine.

Run

docker run --rm -it ghcr.io/ggerganov/llama.cpp:full \
 --hf-repo "InternLM/InternLM-Chat-7B" \
 --hf-file "InternLM-Chat-7B-Q4_K_M.gguf" \
 -c 4096 -ngl 99

Quick specs

Parameters: 7B
Architecture: dense
Context: 8K tokens
Modality: text
Min RAM: 2.7 GB
Rec. RAM: 4.3 GB (Q4_K_M)
License: Apache 2.0
Family: InternLM
Capabilities: ✓ Chat ✓ Reasoning

About this model

InternLM has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

•It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
•It supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities.
•It provides a versatile toolset for users to flexibly build their own workflows.

Quick picks

Best budget (A): RX 7600 XT 16GB, ~$329, 39 tok/s

Best overall (A): RX 7900 XT 20GB, ~$899, 98 tok/s

Best hardware

Top picks for InternLM Chat 7B

GPU	Vendor	VRAM	Rating
RX 7900 XT 20GB	AMD	20 GB	A
RTX A4500 20GB	NVIDIA	20 GB	A
RTX 3090 24GB	NVIDIA	24 GB	A
RTX 3090 Ti 24GB	NVIDIA	24 GB	A
RTX 4090 24GB	NVIDIA	24 GB	A

Run this model

InternLM Chat 7B on RX 7900 XT 20GB
InternLM Chat 7B on RTX A4500 20GB
InternLM Chat 7B on RTX 3090 24GB

Quantization options

VRAM estimates by quant level

Quant	Bits	VRAM	Quality
Q2_K	2	2.7 GB	Low
Q3_K_S	3	3.4 GB	Low
NVFP4	4	3.9 GB	Medium
Q4_K_M	4	4.3 GB	Medium
Q5_K_M	5	5.0 GB	High
Q6_K	6	5.7 GB	High
Q8_0	8	7.5 GB	Very High
F16	16	14.3 GB	Maximum
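The table can be used to pick a quantization level for a given memory budget. The sketch below is illustrative only: the function name `largest_quant_that_fits` and the `overhead_gb` parameter are hypothetical, and the figures are copied from the table, which appears to list weights-only estimates (the Q4_K_M value matches the 4.3 GB "Weights" entry in the memory breakdown). KV cache and runtime overhead come on top of these numbers.

```python
# Weights-only VRAM estimates in GB, copied from the table above.
QUANT_VRAM_GB = {
    "Q2_K": 2.7, "Q3_K_S": 3.4, "NVFP4": 3.9, "Q4_K_M": 4.3,
    "Q5_K_M": 5.0, "Q6_K": 5.7, "Q8_0": 7.5, "F16": 14.3,
}


def largest_quant_that_fits(budget_gb, overhead_gb=0.0):
    """Return the largest quant whose estimate plus overhead fits, or None.

    overhead_gb is a caller-supplied allowance (an assumption, not a value
    from the table) for KV cache, runtime, and headroom.
    """
    fitting = {
        name: size
        for name, size in QUANT_VRAM_GB.items()
        if size + overhead_gb <= budget_gb
    }
    if not fitting:
        return None
    # The largest file that still fits is the highest-quality option here.
    return max(fitting, key=fitting.get)


print(largest_quant_that_fits(6.0))                   # weights only
print(largest_quant_that_fits(6.0, overhead_gb=1.2))  # with runtime overhead
```

With a 6 GB budget the weights-only view allows Q6_K, while reserving 1.2 GB for runtime drops the choice to Q4_K_M; any realistic KV cache allowance tightens it further.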

Memory breakdown

Reference: RTX 2060 6GB

Weights: 4.3 GB
KV Cache: 7.8 GB
Runtime: 1.2 GB
Headroom: 0.6 GB
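The four components of the breakdown add up to the 13.9 GB total quoted at the top of the page. A minimal sketch of that arithmetic follows; the helper names `total_vram_gb` and `fits` are hypothetical, and the simple "sum and compare" check is an assumption about how a fit estimate might be computed, not the site's actual method.

```python
# Memory breakdown in GB for Q4_K_M, copied from the figures above.
BREAKDOWN_GB = {
    "weights": 4.3,
    "kv_cache": 7.8,
    "runtime": 1.2,
    "headroom": 0.6,
}


def total_vram_gb(breakdown):
    """Sum all components, rounded to one decimal to match the page's precision."""
    return round(sum(breakdown.values()), 1)


def fits(breakdown, gpu_vram_gb):
    """True if the summed estimate is within the GPU's VRAM."""
    return total_vram_gb(breakdown) <= gpu_vram_gb


total = total_vram_gb(BREAKDOWN_GB)
print(f"Total: {total} GB")
print("Fits in 6 GB (RTX 2060):", fits(BREAKDOWN_GB, 6.0))
print("Fits in 16 GB:", fits(BREAKDOWN_GB, 16.0))
```

Under these numbers the full estimate does not fit the 6 GB reference card, which is consistent with the 16 GB recommendation; the KV cache is the largest single component, so a shorter context is the most direct way to shrink the total.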

URL: https://willitrunai.com/models/internlm-chat-7b
