qwen3-4b-uzbek-v2-bnb-4bit

bitsandbytes nf4 4-bit quant (~3.4 gb) of inspirebek/qwen3-4b-uzbek-v2. nvidia gpu only; easiest hf-native 4-bit load.

usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

bnb = BitsAndBytesConfig(
 load_in_4bit=True,
 bnb_4bit_quant_type="nf4",
 bnb_4bit_compute_dtype=torch.bfloat16,
 bnb_4bit_use_double_quant=True,
)
tok = AutoTokenizer.from_pretrained("inspirebek/qwen3-4b-uzbek-v2-bnb-4bit")
model = AutoModelForCausalLM.from_pretrained(
 "inspirebek/qwen3-4b-uzbek-v2-bnb-4bit",
 quantization_config=bnb,
 device_map="auto",
)

quantization

method: bitsandbytes nf4 (4-bit normalfloat)
double quantization: enabled
compute dtype: bfloat16

datasets

stage a — fluency (continued pretraining):

stage b — instruct (sft):

saillab/alpaca_uzbek_taco · CC-BY-NC-4.0
behbudiy/alpaca-cleaned-uz · CC-BY-4.0
UAzimov/uzbek-instruct-llm · Apache-2.0
CohereLabs/aya_collection_language_split · Apache-2.0
med-alex/qa_mt_ru_to_uzn · unspecified
med-alex/qa_mt_tr_to_uzn · unspecified

⚠️ licensing note: saillab/alpaca_uzbek_taco is cc-by-nc-4.0, which restricts commercial use of derivative models. downstream users who need a fully permissive license should retrain without that subset.

sibling formats

Downloads last month: 17

Safetensors

Model size

5B params

Tensor type

F32

F16

Model tree for inspirebek/qwen3-4b-uzbek-v2-bnb-4bit

Base model

Qwen/Qwen3-4B-Base

Finetuned

Qwen/Qwen3-4B

Adapter

inspirebek/qwen3-4b-uzbek-v2

Quantized

(3)

this model

Datasets used to train inspirebek/qwen3-4b-uzbek-v2-bnb-4bit

Collection including inspirebek/qwen3-4b-uzbek-v2-bnb-4bit

all distribution formats of the uzbek fine-tune of qwen3-4b: merged bf16, lora adapter, bnb nf4, awq, and the gguf suite. • 5 items • Updated Apr 20 • 1

URL: https://huggingface.co/inspirebek/qwen3-4b-uzbek-v2-bnb-4bit

⇱ inspirebek/qwen3-4b-uzbek-v2-bnb-4bit · Hugging Face