URL: https://apxml.com/models/qwen35-35b-a3b

Qwen3.5-35B-A3B: Specifications and GPU VRAM Requirements


Qwen3.5-35B-A3B

Total Parameters

35B

Context Length

262K

Modality

Multimodal

Architecture

Mixture of Experts (MoE)

License

Apache 2.0

Release Date

24 Feb 2026

Knowledge Cutoff

-

Technical Specifications

Attention

Attention Structure

Grouped-Query Attention

Attention Heads

16

Key-Value Heads

2

Attention Head Dimension

256

Position Embedding

RoPE

RoPE Theta

10,000,000

Sliding Window Attention

No

Sliding Window Size

-

Normalization

RMS Normalization

Activation Function

SwiGLU

Dimensions

Hidden Dimension Size

2,048

Number of Layers

40

FFN Intermediate Size (Dense)

512

Multi-Token Prediction Heads

1
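
The attention and dimension values above are enough for a rough KV cache estimate. The sketch below assumes every one of the 40 layers keeps a standard key/value cache, which is likely an overestimate: this page describes Qwen 3.5 as a hybrid architecture with Gated Delta Networks, so only some layers may use Grouped-Query Attention. The function name `kv_cache_bytes`, the 2-byte (FP16/BF16) cache precision, and reading "262K" as 262,144 tokens are assumptions made for illustration.

```python
# Values copied from the spec sheet above.
NUM_LAYERS = 40   # Number of Layers
KV_HEADS = 2      # Key-Value Heads
HEAD_DIM = 256    # Attention Head Dimension


def kv_cache_bytes(tokens: int, bytes_per_value: int = 2,
                   attn_layers: int = NUM_LAYERS) -> int:
    """Upper-bound KV cache size in bytes for one sequence.

    Assumes each of `attn_layers` layers stores one key and one value
    vector per KV head per token (hence the leading factor of 2).
    """
    per_token = 2 * attn_layers * KV_HEADS * HEAD_DIM * bytes_per_value
    return per_token * tokens


if __name__ == "__main__":
    print(kv_cache_bytes(1))                  # bytes per token
    print(kv_cache_bytes(262_144) / 2**30)    # GiB at the assumed full context
```

Under these assumptions the cache costs 81,920 bytes per token, or 20 GiB at 262,144 tokens. Passing a smaller `attn_layers` value models the case where only a subset of layers uses full attention; the real count is not given on this page.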

Tokenizer

Vocabulary Size

248,320

Mixture of Experts

Active Parameters

3.0B

Number of Experts

256

Active Experts

9

Shared Experts

-

FFN Intermediate Size (per Expert)

512

Dense Layers Before MoE

-
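
As a sanity check on the Mixture of Experts figures, the expert parameter count can be approximated from the listed dimensions. This is a rough sketch, not the model's actual layout: it assumes each expert is a SwiGLU feed-forward block with three weight matrices (gate, up, down) and no biases, and that all 40 layers contain a full set of 256 experts. The helper names `expert_params` and `moe_params` are hypothetical.

```python
# Values copied from the spec sheet above.
HIDDEN = 2_048        # Hidden Dimension Size
EXPERT_FFN = 512      # FFN Intermediate Size (per Expert)
NUM_EXPERTS = 256     # Number of Experts
ACTIVE_EXPERTS = 9    # Active Experts
NUM_LAYERS = 40       # Number of Layers


def expert_params(hidden: int = HIDDEN, inter: int = EXPERT_FFN) -> int:
    """Parameters in one expert, assuming a bias-free SwiGLU FFN
    with gate, up, and down projections."""
    return 3 * hidden * inter


def moe_params(experts: int, layers: int = NUM_LAYERS) -> int:
    """Expert parameters summed over `experts` experts in each layer."""
    return experts * expert_params() * layers


if __name__ == "__main__":
    total = moe_params(NUM_EXPERTS)
    active = moe_params(ACTIVE_EXPERTS)
    print(f"per expert:       {expert_params():,}")
    print(f"all experts:      {total / 1e9:.1f}B")
    print(f"active per token: {active / 1e9:.1f}B")
    print(f"active fraction:  {ACTIVE_EXPERTS / NUM_EXPERTS:.2%}")
```

With these assumptions, all experts together come to roughly 32.2B parameters and the 9 active experts to roughly 1.1B per token. The remainder of the 35B total and 3B active figures would then come from attention, embeddings, and other non-expert weights.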

Qwen3.5-35B-A3B

Qwen3.5-35B-A3B is Alibaba Cloud's efficient multimodal foundation model, released in February 2026. It has 35B total parameters, of which about 3B are activated per token through a Mixture-of-Experts architecture with 256 experts, so per-token inference compute is far lower than for a dense 35B model. Reported benchmark scores are 85.3% on MMLU-Pro, 84.2% on GPQA Diamond, 69.2% on SWE-bench Verified, and 40.5% on Terminal-Bench 2.0. Qwen3.5-Flash is the hosted API version. The model offers unified vision-language capabilities and a 262K-token native context window (extensible to 1M), and it targets multimodal reasoning, coding, and multilingual tasks.

About Qwen 3.5

Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It combines a unified vision-language foundation for multimodal learning, an efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), reinforcement learning scaled across million-agent environments, and linguistic coverage of 201 languages. The models are available with open weights under the Apache 2.0 license.

Evaluation Benchmarks

Rank

#101

Category | Benchmark | Score | Rank
General Text | Text Arena | 1396 | 54
Web Development | WebDev Arena | 1249 | 89

Rankings

Overall Rank

#101

Coding Rank

#104

Model Integrity

Total Score

B+

72 / 100

GPU Requirements

Required VRAM depends on the quantization method chosen for the model weights and on the context size, which the calculator varies from 1K to 256K tokens.
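
For a first-order estimate of weight memory at different quantization levels, the following sketch multiplies the 35B total parameter count by the bits stored per weight. It is a lower bound under stated assumptions: it ignores the KV cache, activations, quantization metadata such as scales, and framework overhead. It also assumes all 35B parameters stay resident in VRAM, since any expert may be selected for a given token even though only about 3B are active at once. The format labels and the helper `weight_gb` are illustrative, not the calculator's actual options.

```python
TOTAL_PARAMS = 35e9  # total parameters listed on this page

# Illustrative bit widths per weight (assumed, not the calculator's list).
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4}


def weight_gb(bits: int, params: float = TOTAL_PARAMS) -> float:
    """Weight memory in GB (1 GB = 1e9 bytes), excluding all overhead."""
    return params * bits / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {weight_gb(bits):.1f} GB")
```

Under these assumptions the weights alone need about 70 GB at 16 bits, 35 GB at 8 bits, and 17.5 GB at 4 bits; actual requirements will be higher once context and runtime overhead are included.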