Voozh

This page tracks the Qwen3.6-35B-A3B release date and the full Qwen 3.6 family timeline, including GGUF quantizations and client integrations. Updated daily as the timeline develops.

Current status (April 23, 2026) — RELEASED

Channel	Status	Date
Alibaba Cloud API (Qwen 3.6 Plus Preview)	✅ Live	March 30, 2026
Open-weight Qwen3.6-35B-A3B on Hugging Face	✅ Live	April 16, 2026
Open-weight Qwen3.6-27B dense on Hugging Face	✅ Live	April 22, 2026
GGUF quantizations (unsloth, ggml-org, bartowski)	✅ Live	Within 24-48h of HF
vLLM (≥0.19.0)	✅ Supported	April 17, 2026
SGLang (≥0.5.10)	✅ Supported	April 17, 2026
LM Studio	✅ Supported	Rolling
Jan	✅ Supported	Rolling
Ollama library	⏳ In progress	Pending mmproj vision file support

Actual release timeline

The family rolled out in three waves:

March 30-31, 2026 — Qwen 3.6 Plus API preview on Alibaba Cloud + free access via OpenRouter
April 16, 2026 — Qwen3.6-35B-A3B open weights under Apache 2.0
April 22, 2026 — Qwen3.6-27B dense open weights, with surprising flagship-level coding benchmarks

The API-to-open-weight gap was 17 days for the 35B-A3B (longer than the 11 days of Qwen 3.5 → Qwen 3.5 OW), and a further 6 days for the 27B dense variant.

Download Qwen 3.6 now

Since open weights shipped, the fastest paths to run Qwen 3.6 locally:

Qwen3.6-35B-A3B MoE (~21 GB Q4, best for fast chat):

# Unsloth GGUF
huggingface-cli download unsloth/Qwen3.6-35B-A3B-GGUF Qwen3.6-35B-A3B-Q4_K_M.gguf

# Or via vLLM
pip install "vllm>=0.19.0"
vllm serve Qwen/Qwen3.6-35B-A3B --max-model-len 262144

Qwen3.6-27B dense (~16.8 GB Q4, best for coding — fits 16 GB GPUs):

# Unsloth GGUF
huggingface-cli download unsloth/Qwen3.6-27B-GGUF Qwen3.6-27B-UD-Q4_K_XL.gguf

# Or via vLLM
vllm serve Qwen/Qwen3.6-27B --max-model-len 262144 --reasoning-parser qwen3

See the dedicated pages for exact quantization tables and buyer advice:

Monitor future releases

Qwen HF organization — official weights
Unsloth Qwen 3.6 docs — recommended GGUFs
Alibaba QwenLM GitHub — source + training details

Key differences vs Qwen 3.5 35B-A3B

1M-token native context (vs 262K in Qwen 3.5). Long documents, multi-file codebases, and agentic workflows benefit the most.
Same 35B total / 3B active parameters — VRAM and tokens/sec are nearly identical at short context.
KV cache grows significantly at full 1M context — plan for 20-40 GB of extra VRAM if you push context past 256K.

Qwen 3.6 VRAM & Hardware Requirements (35B-A3B) — exact Q4/Q5/Q6/Q8/FP16 numbers + which GPU to buy
Qwen3.6-27B VRAM & Hardware Requirements — the dense 27B sibling
Qwen 3.5 35B-A3B VRAM Requirements — current-generation sibling
Qwen 3 / 3.5 Family GPU Requirements — original family overview

URL: https://willitrunai.com/blog/qwen3-6-35b-a3b-release-date

⇱ Qwen3.6-35B-A3B Release Date — Open-Weight & GGUF Timeline (April 2026) | Will It Run AI Blog

Current status (April 23, 2026) — RELEASED

Actual release timeline

Download Qwen 3.6 now

Monitor future releases

Key differences vs Qwen 3.5 35B-A3B

Related pages

Frequently Asked Questions