VOOZH about

URL: https://willitrunai.com/blog/qwen3-6-35b-a3b-release-date

⇱ Qwen3.6-35B-A3B Release Date — Open-Weight & GGUF Timeline (April 2026) | Will It Run AI Blog


This page tracks the Qwen3.6-35B-A3B release date and the full Qwen 3.6 family timeline, including GGUF quantizations and client integrations. Updated daily as the timeline develops.

Current status (April 23, 2026) — RELEASED

ChannelStatusDate
Alibaba Cloud API (Qwen 3.6 Plus Preview)✅ LiveMarch 30, 2026
Open-weight Qwen3.6-35B-A3B on Hugging Face✅ LiveApril 16, 2026
Open-weight Qwen3.6-27B dense on Hugging Face✅ LiveApril 22, 2026
GGUF quantizations (unsloth, ggml-org, bartowski)✅ LiveWithin 24-48h of HF
vLLM (≥0.19.0)✅ SupportedApril 17, 2026
SGLang (≥0.5.10)✅ SupportedApril 17, 2026
LM Studio✅ SupportedRolling
Jan✅ SupportedRolling
Ollama library⏳ In progressPending mmproj vision file support

Actual release timeline

The family rolled out in three waves:

  1. March 30-31, 2026 — Qwen 3.6 Plus API preview on Alibaba Cloud + free access via OpenRouter
  2. April 16, 2026 — Qwen3.6-35B-A3B open weights under Apache 2.0
  3. April 22, 2026 — Qwen3.6-27B dense open weights, with surprising flagship-level coding benchmarks

The API-to-open-weight gap was 17 days for the 35B-A3B (longer than the 11 days of Qwen 3.5 → Qwen 3.5 OW), and a further 6 days for the 27B dense variant.

Download Qwen 3.6 now

Since open weights shipped, the fastest paths to run Qwen 3.6 locally:

Qwen3.6-35B-A3B MoE (~21 GB Q4, best for fast chat):

# Unsloth GGUF
huggingface-cli download unsloth/Qwen3.6-35B-A3B-GGUF Qwen3.6-35B-A3B-Q4_K_M.gguf

# Or via vLLM
pip install "vllm>=0.19.0"
vllm serve Qwen/Qwen3.6-35B-A3B --max-model-len 262144

Qwen3.6-27B dense (~16.8 GB Q4, best for coding — fits 16 GB GPUs):

# Unsloth GGUF
huggingface-cli download unsloth/Qwen3.6-27B-GGUF Qwen3.6-27B-UD-Q4_K_XL.gguf

# Or via vLLM
vllm serve Qwen/Qwen3.6-27B --max-model-len 262144 --reasoning-parser qwen3

See the dedicated pages for exact quantization tables and buyer advice:

Monitor future releases

Key differences vs Qwen 3.5 35B-A3B

  • 1M-token native context (vs 262K in Qwen 3.5). Long documents, multi-file codebases, and agentic workflows benefit the most.
  • Same 35B total / 3B active parameters — VRAM and tokens/sec are nearly identical at short context.
  • KV cache grows significantly at full 1M context — plan for 20-40 GB of extra VRAM if you push context past 256K.

Related pages

Frequently Asked Questions