Qwen3.5-35B-Optimized-HauhauCS
Join the Discord for updates, roadmaps, projects, or just to chat.
Optimized Qwen3.5-35B-A3B by HauhauCS.
Access
This is currently a Closed Beta release designed to lower (V)RAM requirements by up to 50% without sacrificing real-world capabilities.
Downloads
| File | Type | Size |
|---|---|---|
| Qwen3.5-35B-Optimized-HauhauCS-40-Q8_K_P.gguf | Q8_K_P | 25 GB |
| Qwen3.5-35B-Optimized-HauhauCS-40-Q4_K_P.gguf | Q4_K_P | 14 GB |
| mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf | mmproj (F16) | 858 MB |
What are K_P quants?
K_P quants use model-specific importance analysis to selectively preserve quality where it matters most. They are fully compatible with llama.cpp, LM Studio, and any GGUF runtime.
Specs
- 35B-A3B MoE (35B total, ~3B active per forward pass)
- 262K context
- Multimodal (vision support via mmproj)
- Based on Qwen3.5-35B-A3B
Usage
Works with llama.cpp, LM Studio, Jan, koboldcpp, etc.
llama-cli -m Qwen3.5-35B-Optimized-HauhauCS-40-Q8_K_P.gguf --mmproj mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf -ngl 99
Note: K_P quants may show as "?" in LM Studio's quant column. This is a display issue only; the model loads and runs fine.
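The same files can also be served over HTTP. The following is a minimal sketch, assuming a llama.cpp build that ships `llama-server` with `--mmproj` support. The choice of the Q4_K_P file, the 8192-token context size, and port 8080 are illustrative assumptions rather than recommendations from the model author, and `server_cmd` is a hypothetical helper name. The script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Assumed local file names; adjust to wherever the GGUF files were downloaded.
MODEL="Qwen3.5-35B-Optimized-HauhauCS-40-Q4_K_P.gguf"
MMPROJ="mmproj-Qwen3.5-35B-Optimized-HauhauCS-40-f16.gguf"

# Assumption: a reduced context window to keep KV-cache memory modest.
# The model card lists up to 262K context; 8192 is only an example value.
CTX=8192

# Hypothetical helper: prints the llama-server invocation instead of running it.
server_cmd() {
  printf 'llama-server -m %s --mmproj %s -ngl 99 -c %s --port 8080\n' \
    "$MODEL" "$MMPROJ" "$CTX"
}

server_cmd
```

To launch the server, copy the printed line into a terminal. Lowering `-ngl` offloads fewer layers to the GPU if the model does not fit in VRAM.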
GGUF
- Model size: 22B params
- Architecture: qwen35moe
