
AWQ 4-bit quantization of SicariusSicariiStuff's Angelic_Eclipse_12B.

Quantized on a single NVIDIA RTX 4090.

Recipe:

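from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

# Calibration data and paths (placeholders; adjust to your setup).
dataset = "gsm8k"
model_id = "/path/to/model/"
SAVE_DIR = "/save/dir/"
MAX_SEQUENCE_LENGTH = 2048
NUM_CALIBRATION_SAMPLES = 64

# Loaded here but not passed to oneshot below, which receives the model path.
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
)

# Quantize every Linear layer to 4-bit asymmetric weights with 16-bit
# activations, leaving the output head (lm_head) unquantized.
recipe = [
    AWQModifier(
        targets=["Linear"],
        scheme="W4A16_ASYM",
        ignore=["lm_head"],
    )
]

# Run one-shot calibration on gsm8k ("main" config) and save the result.
oneshot(
    model=model_id,
    dataset=dataset,
    dataset_config_name="main",
    recipe=recipe,
    output_dir=SAVE_DIR,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)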
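
For readers unfamiliar with the "W4A16_ASYM" scheme named in the recipe, the sketch below illustrates the asymmetric 4-bit weight format only: each group of weights gets its own scale and zero point, and values are stored as codes in 0..15. It is plain round-to-nearest in pure Python, not AWQ itself (AWQ additionally rescales channels using calibration activations before quantizing), and it is not how llmcompressor implements the scheme. The function names and the group size of 128 are assumptions made for illustration.

```python
def quantize_group_asym(weights, num_bits=4):
    """Round-to-nearest asymmetric quantization of one weight group.

    Returns (codes, scale, zero_point) with codes in [0, 2**num_bits - 1].
    """
    qmax = (1 << num_bits) - 1                 # 15 for 4 bits
    # Extend the range to include 0.0 so the zero point is a valid code.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / qmax
    if scale == 0.0:                           # all-zero group
        scale = 1.0
    zero_point = round(-w_min / scale)
    codes = [
        min(qmax, max(0, round(w / scale) + zero_point))
        for w in weights
    ]
    return codes, scale, zero_point


def dequantize_group(codes, scale, zero_point):
    """Map integer codes back to approximate floating-point weights."""
    return [(q - zero_point) * scale for q in codes]


def quantize_row(row, group_size=128):
    """Quantize one weight row group by group (group_size is an assumption)."""
    return [
        quantize_group_asym(row[i:i + group_size])
        for i in range(0, len(row), group_size)
    ]


if __name__ == "__main__":
    group = [-0.33, -0.1, 0.0, 0.25, 0.5, 1.17]
    codes, scale, zp = quantize_group_asym(group)
    restored = dequantize_group(codes, scale, zp)
    for w, q, r in zip(group, codes, restored):
        print(f"{w:+.3f} -> code {q:2d} -> {r:+.3f}")
```

Because the zero point shifts the code range, a group whose weights are mostly positive or mostly negative still uses all 16 levels, which is the practical difference from a symmetric scheme.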


Safetensors

Model size: 3B params
Tensor type: I64, I32, BF16
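
One plausible reading of the I32 tensor type and of a parameter count well below 12B is that the 4-bit weight codes are packed several to a 32-bit word, so each stored integer counts as one parameter while holding eight weights. The sketch below shows that packing idea in pure Python. The function names are hypothetical, and the nibble order (lowest bits first) and unsigned representation are assumptions; the actual on-disk layout written by the quantization tooling may differ.

```python
def pack_int4(codes):
    """Pack 4-bit codes (0..15) into 32-bit words, eight codes per word.

    The first code of each group of eight goes into the lowest four bits.
    """
    if len(codes) % 8 != 0:
        raise ValueError("number of codes must be a multiple of 8")
    words = []
    for i in range(0, len(codes), 8):
        word = 0
        for j, q in enumerate(codes[i:i + 8]):
            if not 0 <= q <= 15:
                raise ValueError(f"code {q} does not fit in 4 bits")
            word |= q << (4 * j)
        words.append(word)
    return words


def unpack_int4(words):
    """Inverse of pack_int4: recover eight 4-bit codes from each word."""
    return [(w >> (4 * j)) & 0xF for w in words for j in range(8)]


if __name__ == "__main__":
    codes = [1, 2, 3, 4, 5, 6, 7, 8]
    packed = pack_int4(codes)
    print([hex(w) for w in packed])
    print(unpack_int4(packed) == codes)
```

Under this layout, a weight matrix with N quantized entries occupies N / 8 integers, with the per-group scales and any unquantized layers stored separately in higher precision.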

Model tree for isola-tropicale/Angelic_Eclipse_12B-AWQ-4bit

Base model: mistralai/Mistral-Nemo-Base-2407
Finetuned: mistralai/Mistral-Nemo-Instruct-2407
Finetuned: SicariusSicariiStuff/Angelic_Eclipse_12B
Quantized (17): this model

URL: https://huggingface.co/isola-tropicale/Angelic_Eclipse_12B-AWQ-4bit
