Model Card
- Base model: Qwen/Qwen3-32B
- Quantization method: SqueezeLLM
- Target bit-width: 3
- Backend kernel: Any-Precision-LLM kernel (ap-gemv)
- Calibration data: RedPajama (1024 sentences / 4096 tokens)
- Calibration objective: Next-token prediction
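To give a rough sense of what a 3-bit target means for storage, here is a back-of-the-envelope sketch. It assumes a nominal parameter count of 32 billion (read off the "32B" in the model name, not an exact figure) and counts only the quantized weight bits. It ignores SqueezeLLM's lookup tables and any tensors that may be kept at higher precision, so the real checkpoint will be somewhat larger. The function and constant names are illustrative.

```python
def weight_storage_gib(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in GiB (no codebooks or metadata)."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 2**30


# Assumption: nominal parameter count taken from the "32B" in the model name.
NOMINAL_PARAMS = 32e9

fp16_gib = weight_storage_gib(NOMINAL_PARAMS, 16)
q3_gib = weight_storage_gib(NOMINAL_PARAMS, 3)

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f" 3-bit weights: {q3_gib:.1f} GiB")
print(f"reduction:      {fp16_gib / q3_gib:.2f}x")
```

The reduction factor is simply 16/3 under these assumptions; the overhead ignored here pushes the practical ratio lower.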
How to run
- Follow the instructions at https://github.com/snu-mllab/GuidedQuant.
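The GuidedQuant repository remains the authority on how to load and run the model. As a preliminary step only, the following is a minimal sketch for fetching the checkpoint files, assuming the `huggingface_hub` package is installed and that the repository ID is `jusjinuk/Qwen3-32B-3bit-SqueezeLLM`. The helper name `download_checkpoint` and the default target directory are hypothetical, and nothing here performs inference or sets up the ap-gemv kernel.

```python
from pathlib import Path

# Assumption: repository ID of this model on the Hugging Face Hub.
REPO_ID = "jusjinuk/Qwen3-32B-3bit-SqueezeLLM"


def download_checkpoint(local_dir: str = "./Qwen3-32B-3bit-SqueezeLLM") -> Path:
    """Download all files of the model repository and return the local path."""
    # Imported inside the function so this module can be loaded (and the
    # constant inspected) on machines where huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
    return Path(path)


# Example usage (requires network access and enough disk space):
#   checkpoint_dir = download_checkpoint()
#   print(checkpoint_dir)
```

After the files are on disk, continue with the environment setup and inference commands described in the GuidedQuant repository.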
Model tree for jusjinuk/Qwen3-32B-3bit-SqueezeLLM
Base model
Qwen/Qwen3-32B