Falcon 7B Instruct

Legacy

165.8KDownloads1.0KLikesApr 2023Released8K tokensContextApache 2.0License40 BasicQuality

Falcon 7B Instruct (7B parameters) requires approximately 5.9 GB of VRAM with Q4_K_M quantization. For the best balance of quality and speed, we recommend hardware with at least 7 GB of VRAM.

Get started

— copy & paste to run locally

Copy-paste commands to run Falcon 7B Instruct on your machine.

Run

lms load falcon-7b-instruct && lms server start

Quick specs

Parameters7B

Architecturedense

Context8K tokens

Modalitytext

Min RAM2.7 GB

Rec. RAM4.3 GB (Q4_K_M)

LicenseApache 2.0

FamilyFalcon

✓ Chat✓ Reasoning

About this model

Falcon-7B-Instruct is a 7B parameters causal decoder-only model built by TII based on Falcon-7B and finetuned on a mixture of chat/instruct datasets. It is made available under the Apache 2.0 license.

•You are looking for a ready-to-use chat/instruct model based on Falcon-7B
•Falcon-7B is a strong base model, outperforming comparable open-source models: (e.g., MPT-7B, StableLM, RedPajama etc.), thanks to being trained...
•It features an architecture optimized for inference: , with FlashAttention (Dao et al., 2022) and multiquery (Shazeer et al., 2019)

Related models

Your hardware

Detecting...

Quick picks

👁 Intel

Best budgetA

Intel Arc A580 8GB~$179 — 65 tok/s

👁 NVIDIA

Best overallA

RTX 3070 Ti 8GB~$599 — 98 tok/s

Best hardware

Top picks for Falcon 7B Instruct

👁 NVIDIA

RTX 3070 Ti 8GBA

8 GB

👁 NVIDIA

RTX 3070 8GBA

8 GB

👁 NVIDIA

RTX 3060 Ti 8GBA

8 GB

👁 NVIDIA

RTX 3080 Ti 12GBA

12 GB

👁 NVIDIA

RTX 3080 12GBA

12 GB

Run this model

Falcon 7B Instruct on RTX 3070 Ti 8GB Falcon 7B Instruct on RTX 3070 8GB Falcon 7B Instruct on RTX 3060 Ti 8GB

Quantization options

VRAM estimates by quant level

No hardware detected — fit column shows raw VRAM estimates

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	2.7 GB	Low	—
Q3_K_S	3	3.4 GB	Low	—
NVFP4	4	3.9 GB	Medium	—
Q4_K_M	4	4.3 GB	Medium	—
Q5_K_M	5	5.0 GB	High	—
Q6_K	6	5.7 GB	High	—
Q8_0	8	7.5 GB	Very High	—
F16	16	14.3 GB	Maximum	—

Quality benchmarks