Model Card
- Base model: meta-llama/Llama-2-13b-hf
- Quantization method: BlockLDLQ with GuidedQuant Hessian
- Target bit-width: 2
- Backend kernel: QTIP kernel (HYB variant)
- Calibration data: RedPajama (1024 sentences / 4096 tokens)
- Calibration objective: Next-token prediction
- num_groups (for GuidedQuant Hessian): 4
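To put the 2-bit target in perspective, here is a rough back-of-the-envelope estimate of weight storage. It assumes a nominal 13 billion parameters (a round figure taken from the "13b" in the base model name, not an exact count) and that every weight is stored at the target bit-width. In practice some tensors, such as embeddings and normalization weights, typically stay in higher precision, so the real checkpoint will be larger than this estimate.

```python
# Rough weight-storage estimate. All inputs are illustrative assumptions,
# not measured values from this checkpoint.
NOMINAL_PARAMS = 13_000_000_000  # assumed round figure for a "13b" model
TARGET_BITS = 2                  # target bit-width from the model card
FP16_BITS = 16                   # half-precision baseline for comparison


def weight_bytes(num_params: int, bits_per_weight: int) -> float:
    """Bytes needed to store num_params weights at the given bit-width."""
    return num_params * bits_per_weight / 8


def to_gb(num_bytes: float) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_bytes / 1e9


quantized_gb = to_gb(weight_bytes(NOMINAL_PARAMS, TARGET_BITS))
fp16_gb = to_gb(weight_bytes(NOMINAL_PARAMS, FP16_BITS))
compression_ratio = fp16_gb / quantized_gb

print(f"2-bit weights: ~{quantized_gb:.2f} GB")
print(f"FP16 weights:  ~{fp16_gb:.2f} GB")
print(f"Nominal compression: {compression_ratio:.0f}x")
```

The ratio is simply 16 / 2; it ignores any codebook, scale, or metadata overhead the quantization format may add.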
How to run
- Follow the instructions in https://github.com/snu-mllab/GuidedQuant and https://github.com/Cornell-RelaxML/qtip
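As a first step before following those instructions, the checkpoint files can be fetched locally. The sketch below uses `snapshot_download` from the `huggingface_hub` package; the helper name `download_checkpoint` and the default target directory are hypothetical choices for illustration. It only downloads files. Loading and running the quantized model still requires the code and kernels from the repositories linked above.

```python
# Minimal download sketch, assuming the `huggingface_hub` package is installed.
# `download_checkpoint` and its default directory are illustrative names,
# not part of GuidedQuant or QTIP.
REPO_ID = "jusjinuk/Llama-2-13b-hf-2bit-GuidedQuant-QTIP"


def download_checkpoint(local_dir: str = "./Llama-2-13b-hf-2bit-GuidedQuant-QTIP") -> str:
    """Download all files of the model repository and return the local path."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoint()` requires network access and enough disk space for the checkpoint.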
Safetensors
- Model size: 2B params
- Tensor type: F32 · F16 · I16
Model tree for jusjinuk/Llama-2-13b-hf-2bit-GuidedQuant-QTIP
Base model
meta-llama/Llama-2-13b-hf