Total Parameters
754B
Context Length
200K
Modality
Text
Architecture
Mixture of Experts (MoE)
License
MIT
Release Date
7 Apr 2026
Knowledge Cutoff
-
Attention
Attention Structure
Multi-Layer Attention
Attention Heads
64
Key-Value Heads
64
Attention Head Dimension
64
Position Embedding
RoPE
RoPE Theta
1,000,000
Sliding Window Attention
No
Sliding Window Size
-
Normalization
RMS Normalization
Activation Function
SwiGLU
Dimensions
Hidden Dimension Size
6,144
Number of Layers
78
FFN Intermediate Size (Dense)
2,048
Multi-Token Prediction Heads
1
Tokenizer
Vocabulary Size
154,880
Mixture of Experts
Total Expert Parameters
40.0B
Number of Experts
257
Active Experts
9
Shared Experts
1
FFN Intermediate Size (per Expert)
2,048
Dense Layers Before MoE
3
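As a rough sanity check on the figures listed above, the expert parameter counts can be estimated from the dimensions alone. The sketch below assumes that each expert is a bias-free SwiGLU feed-forward block with three weight matrices (gate, up, and down projections) and that the 257 experts are 256 routed plus 1 shared; neither assumption is confirmed by this page, and the variable names are illustrative only.

```python
# Back-of-the-envelope parameter arithmetic from the listed dimensions.
# Assumption (not stated on this page): each expert is a bias-free SwiGLU FFN
# with gate, up, and down projections, i.e. 3 * hidden * intermediate weights.

HIDDEN = 6_144          # Hidden Dimension Size
LAYERS = 78             # Number of Layers
DENSE_LAYERS = 3        # Dense Layers Before MoE
EXPERT_FFN = 2_048      # FFN Intermediate Size (per Expert)
ROUTED_EXPERTS = 256    # 257 listed experts minus 1 shared expert
SHARED_EXPERTS = 1
ACTIVE_ROUTED = 8       # 9 active experts minus the shared expert
VOCAB = 154_880         # Vocabulary Size


def swiglu_params(hidden: int, intermediate: int) -> int:
    """Weights in one bias-free SwiGLU FFN (gate + up + down projections)."""
    return 3 * hidden * intermediate


moe_layers = LAYERS - DENSE_LAYERS
per_expert = swiglu_params(HIDDEN, EXPERT_FFN)

routed_total = moe_layers * ROUTED_EXPERTS * per_expert
shared_total = moe_layers * SHARED_EXPERTS * per_expert
active_expert = moe_layers * (ACTIVE_ROUTED + SHARED_EXPERTS) * per_expert
embedding = VOCAB * HIDDEN  # one embedding matrix, untied output head not counted

print(f"MoE layers:                  {moe_layers}")
print(f"Parameters per expert:       {per_expert / 1e6:.1f}M")
print(f"All routed experts:          {routed_total / 1e9:.1f}B")
print(f"All shared experts:          {shared_total / 1e9:.2f}B")
print(f"Expert params used per token: {active_expert / 1e9:.1f}B")
print(f"Token embedding matrix:      {embedding / 1e9:.2f}B")
```

Under these assumptions the routed experts account for the large majority of the weights, while only a small share of them is touched for any single token. The estimate ignores attention, router, normalization, and multi-token-prediction weights, so it is not an official breakdown of the model.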
GLM-5.1 is Z.ai's flagship model for long-horizon agentic coding tasks. It is built on the GlmMoeDSA architecture, which has 754B total parameters across 78 layers and combines Gated DeltaNet linear attention with standard attention and sparse MoE feed-forward networks. Each MoE layer has 256 routed experts plus 1 shared expert, with 8 routed experts and the shared expert active per token, so only a fraction of the weights is used for each token at inference time. Reported benchmark results are 58.4% on SWE-Bench Pro (described as state of the art), 63.5% on Terminal-Bench 2.0, 95.3% on AIME 2026, and 86.2% on GPQA-Diamond. The model is designed for up to 8 hours of sustained autonomous execution, breaking complex engineering tasks into iterative experiment-analyze-optimize loops. It supports a 200K context window and up to 128K output tokens. It is available via API as glm-5.1 on Z.ai and BigModel.cn, and was released on April 7, 2026 under the MIT license.
GLM-5.1 is Z.ai's next-generation flagship model for agentic engineering, built on a novel hybrid MoE architecture (GlmMoeDSA) combining Gated DeltaNet linear attention layers with standard attention and sparse MoE feed-forward networks. It achieves state-of-the-art performance on SWE-Bench Pro (58.4%) and is designed for long-horizon autonomous tasks, capable of sustained execution for up to 8 hours. With 754B total parameters and a 200K context window, GLM-5.1 delivers strong performance across coding, reasoning, tool use, and agentic benchmarks. Released open-source under the MIT License.
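To illustrate what "8 routed experts plus 1 shared expert active per token" means in practice, here is a minimal sketch of a generic top-k MoE layer in plain Python. It is not GLM-5.1's actual router: this page does not say how router scores are computed or normalized, so the softmax over the selected logits, the scalar toy experts, and the names `route` and `moe_output` are all assumptions made for illustration.

```python
import math
import random

NUM_ROUTED = 256  # routed experts per MoE layer (from the description)
TOP_K = 8         # routed experts active per token (from the description)


def route(logits, top_k=TOP_K):
    """Pick the top_k experts by router logit and softmax-normalize their weights.

    Assumption: softmax over the selected logits. The real model may use a
    different gating function (for example sigmoid scores with a bias term).
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    peak = max(logits[i] for i in top)                  # subtract max for stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_output(x, logits, routed_experts, shared_expert):
    """Combine the always-on shared expert with the weighted top-k routed experts."""
    out = shared_expert(x)
    for idx, weight in route(logits):
        out += weight * routed_experts[idx](x)
    return out


# Toy setup: every "expert" is a scalar function instead of a SwiGLU FFN.
random.seed(0)
routed_experts = [lambda x, s=i: x * (1.0 + s / NUM_ROUTED) for i in range(NUM_ROUTED)]
shared_expert = lambda x: 0.5 * x
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_ROUTED)]

selected = route(logits)
y = moe_output(1.0, logits, routed_experts, shared_expert)
print("selected experts:", [i for i, _ in selected])
print("output for x = 1.0:", y)
```

Only 9 of the 257 expert functions are evaluated for this token, which is the source of the inference efficiency the description refers to: compute per token scales with the active experts, not with the total.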
Rank
#5
| Benchmark | Score | Rank |
|---|---|---|
| Web Development (WebDev Arena) | 1532 | ⭐ 7 |
| General Text (Text Arena) | 1475 | ⭐ 7 |
Overall Rank
#5
Coding Rank
#18
Total Score
B
68 / 100
©2025 ApX Machine Learning