
URL: https://huggingface.co/isola-tropicale/Angelic_Eclipse_12B-AWQ-4bit

isola-tropicale/Angelic_Eclipse_12B-AWQ-4bit


AWQ 4-bit quantization of SicariusSicarii's Angelic_Eclipse_12B.

Quantized on a single Nvidia RTX 4090.

Recipe:

from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

# Calibration dataset and placeholder paths.
dataset = "gsm8k"
model_id = "/path/to/model/"
SAVE_DIR = "/save/dir/"
MAX_SEQUENCE_LENGTH = 2048
NUM_CALIBRATION_SAMPLES = 64

# Note: this tokenizer is loaded but not passed to oneshot() below.
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
)

# Apply AWQ to every Linear layer except lm_head, using the W4A16_ASYM scheme.
recipe = [
    AWQModifier(
        targets=["Linear"],
        scheme="W4A16_ASYM",
        ignore=["lm_head"],
    )
]

# Calibrate on the dataset and write the quantized model to SAVE_DIR.
oneshot(
    model=model_id,
    dataset=dataset,
    dataset_config_name="main",
    recipe=recipe,
    output_dir=SAVE_DIR,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)
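
The scheme name `W4A16_ASYM` in the recipe points to 4-bit weights, 16-bit activations, and asymmetric quantization, meaning each group of weights is stored with a zero point as well as a scale. As a rough illustration only, the pure-Python sketch below applies plain asymmetric 4-bit rounding to one small group of weights. It is not llmcompressor's implementation, and it leaves out the activation-aware scaling that AWQ adds on top; the function names `quantize_group_asym` and `dequantize_group` and the sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_group_asym(
    weights: List[float], bits: int = 4
) -> Tuple[List[int], float, int]:
    """Asymmetric uniform quantization of one weight group (illustrative only).

    Returns (codes, scale, zero_point) with every code in [0, 2**bits - 1].
    """
    qmax = (1 << bits) - 1  # 15 for 4-bit codes
    # Extend the range to include 0.0 so that zero maps to an exact code.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / qmax
    if scale == 0.0:
        scale = 1.0  # all-zero group: any positive scale works
    # The zero point is the integer code that represents the real value 0.0.
    zero_point = max(0, min(qmax, round(-w_min / scale)))
    codes = [max(0, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Map integer codes back to approximate real-valued weights."""
    return [(q - zero_point) * scale for q in codes]


# Hypothetical weights for a single group (real groups are much larger).
group = [-0.30, -0.10, 0.0, 0.05, 0.20, 0.45]
codes, scale, zero_point = quantize_group_asym(group)
restored = dequantize_group(codes, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(group, restored))
print(codes, scale, zero_point, max_error)
```

Because the range is not forced to be symmetric around zero, all 16 codes are spent on the interval the weights actually occupy, which is the usual argument for an asymmetric scheme when a group's values are skewed.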
Safetensors: model size 3B params; tensor types I64, I32, BF16
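
The I32 tensor type listed for the checkpoint may reflect 4-bit weight codes packed into 32-bit integers, a common storage layout for 4-bit models. The card does not state this, so treat it as an assumption. The sketch below shows the general idea with eight 4-bit codes per 32-bit word; the helper names `pack_int4` and `unpack_int4` and the lowest-nibble-first bit order are illustrative choices, not a documented format for this model.

```python
from typing import List


def pack_int4(codes: List[int]) -> int:
    """Pack eight 4-bit codes into one 32-bit word, lowest nibble first."""
    if len(codes) != 8:
        raise ValueError("expected exactly eight 4-bit codes")
    word = 0
    for i, code in enumerate(codes):
        if not 0 <= code <= 15:
            raise ValueError("each code must fit in 4 bits")
        word |= code << (4 * i)
    return word  # unsigned 32-bit pattern held in a Python int


def unpack_int4(word: int) -> List[int]:
    """Recover the eight 4-bit codes from a packed 32-bit word."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]


packed = pack_int4([1, 2, 3, 4, 5, 6, 7, 8])
print(hex(packed))
print(unpack_int4(packed))
```

Under this kind of layout, the number of stored integer elements is roughly one eighth of the number of quantized weights, so a parameter count derived from tensor shapes can come out well below the original model's nominal size.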