URL: https://dev.to/t/quantization
Quantization - DEV Community
How to Pick a GGUF Quant Level for Your VRAM Budget
Patrick Hughes
Jun 11
#localllm #gguf #quantization #gpu
3 min read
Gemma 4 QAT on a 1080 Ti: What 'Quantization-Aware' Actually Buys — and Fitting the 12B on 8 GB at 16k
byeongsoo kang
Jun 11
#llm #machinelearning #gemma #quantization
5 min read
Quantization formats compared: GGUF vs GPTQ vs AWQ vs NF4
Tech_Nuggets
Jun 11
#llm #quantization #mlops #tutorial
7 min read
INT8 Q/DQ Calibration on Blackwell: 1.8× the TRT 10 + FP16 Baseline
soy
Jun 10
#tensorrt #quantization #gpu #machinelearning
7 min read
GGUF Quantization Explained: Q4_K_M vs Q5_K_M vs Q8 — Which to Pick (2026)
Patrick Hughes
May 13
#llamacpp #gguf #quantization #localai
4 min read
1-bit, 545 megabytes, zero API keys — local AI that beats GPT-5.4
Vilius
May 9
#ai #llm #local #quantization
2 reactions, 1 comment
2 min read
Why your quantized LLM loses its MTP heads and how to keep them
Alan West
May 27
#machinelearning #llm #python #quantization
1 reaction
5 min read
KVQuant: Run 70B LLMs on 8GB RAM with KV Cache Quantization
Aman Sachan
Apr 30
#python #llm #quantization
1 min read
KVQuant: Run 70B LLMs on 8GB RAM with 4-bit KV Cache Quantization
Aman Sachan
Apr 30
#python #llm #quantization #optimization
1 min read
Traditional Quantization vs 1.58-Bit Ternary Models: A Practical Comparison
Alan West
Apr 18
#machinelearning #llm #quantization #ai
1 comment
5 min read
The Best Result This Week Was a Failed Prediction — Phase-3a Doesn't Transfer
MxGuru
May 20
#quantization #hsaq #methodology #granite
1 min read
Two Localizers, Both Wrong: Bounding a Quantization Cost That Wouldn't Close
MxGuru
May 20
#quantization #hsaq #methodology #granite
1 min read
When the Sensitivity Metric Lies: A Drift-Inversion Smoking Gun in Mixed-Precision LLM Quantization
MxGuru
May 20
#quantization #hsaq #awq #granite
8 min read
GIMP's Posterization: Simple Quantization vs. Median Cut for Better Visuals
Denis Lavrentyev
Apr 13
#gimp #posterization #quantization #mediancut
8 min read
Q4 KV Cache Fit 32K Context into 8GB VRAM — Only Math Broke
plasmon
Apr 8
#llm #quantization #vram #localllm
8 min read