Parameters: 9B
Context Length: 128K
Modality: Text
Architecture: Dense
License: MIT License
Release Date: 30 Jun 2024
Knowledge Cutoff: Dec 2023

Attention
Attention Structure: Grouped-Query Attention
Attention Heads: 32
Key-Value Heads: 2
Attention Head Dimension: 128
Position Embedding: Rotary Position Embedding (RoPE)
RoPE Theta: -
Sliding Window Attention: No
Sliding Window Size: -
Normalization: RMS Normalization
Activation Function: SwiGLU

Dimensions
Hidden Dimension Size: 4,096
Number of Layers: 40
FFN Intermediate Size (Dense): 13,696
Multi-Token Prediction Heads: -

Tokenizer
Vocabulary Size: 151,552
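As a rough cross-check, the listed dimensions can be combined into an approximate parameter count. The sketch below is a back-of-the-envelope estimate, not an official breakdown: it assumes three weight matrices per SwiGLU feed-forward block, separate (untied) input and output embeddings, and it ignores biases and normalization weights. The function name `estimate_parameters` is illustrative.

```python
# Values taken from the specification list above.
HIDDEN = 4096      # hidden dimension size
LAYERS = 40        # number of transformer layers
HEADS = 32         # attention (query) heads
KV_HEADS = 2       # key-value heads
HEAD_DIM = 128     # attention head dimension
FFN = 13696        # FFN intermediate size
VOCAB = 151552     # vocabulary size


def estimate_parameters(tied_embeddings: bool = False) -> int:
    """Approximate weight count, ignoring biases and norm parameters."""
    q_proj = HIDDEN * HEADS * HEAD_DIM          # query projection
    kv_proj = HIDDEN * 2 * KV_HEADS * HEAD_DIM  # key and value projections
    o_proj = HEADS * HEAD_DIM * HIDDEN          # attention output projection
    ffn = 3 * HIDDEN * FFN                      # assumed: gate, up, down matrices
    per_layer = q_proj + kv_proj + o_proj + ffn
    # Assumption: input and output embeddings are separate unless tied.
    embeddings = VOCAB * HIDDEN * (1 if tied_embeddings else 2)
    return LAYERS * per_layer + embeddings


total = estimate_parameters()
print(f"{total:,} parameters (~{total / 1e9:.1f}B)")
```

Under these assumptions the estimate comes to roughly 9.4 billion weights, which is consistent with the 9B label.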
The GLM-4-9B-Chat model is a conversational large language model developed by the Knowledge Engineering Group (KEG) at Tsinghua University in collaboration with Z.ai. It is the chat variant of the fourth-generation General Language Model (GLM) series, tuned for human-preference alignment and multi-turn dialogue. The model is trained on a corpus of 10 trillion tokens and supports 26 languages.
Architecturally, GLM-4-9B-Chat is a dense transformer with 40 layers and a hidden dimension of 4,096. It uses Grouped Query Attention (GQA): the 32 query heads share two key-value heads, which shrinks the key-value cache and reduces memory bandwidth during inference. Position information is encoded with Rotary Position Embeddings (RoPE), which supports long-context use, and the feed-forward networks use SwiGLU activations in place of the traditional ReLU. Layers are normalized with RMSNorm.
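To make the memory effect of two key-value heads concrete, the following sketch estimates the key-value cache size at the full context length. It is an illustration under stated assumptions: cache entries are stored in 16-bit precision (2 bytes per value), the context is taken as 128,000 tokens, and the helper name `kv_cache_bytes` is hypothetical. The comparison case uses 32 key-value heads, which is what standard multi-head attention would need with the same 32 query heads.

```python
def kv_cache_bytes(
    tokens: int,
    layers: int = 40,
    kv_heads: int = 2,
    head_dim: int = 128,
    bytes_per_value: int = 2,  # assumption: fp16/bf16 cache
) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # The leading factor of 2 accounts for storing both keys and values.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


CONTEXT = 128_000  # tokens, per the stated context length

gqa = kv_cache_bytes(CONTEXT)               # 2 key-value heads (GQA)
mha = kv_cache_bytes(CONTEXT, kv_heads=32)  # 32 key-value heads (MHA baseline)

GIB = 1024 ** 3
print(f"GQA cache: {gqa / GIB:.2f} GiB")
print(f"MHA cache: {mha / GIB:.2f} GiB")
print(f"Reduction: {mha // gqa}x")
```

With these assumptions the cache is about 4.9 GiB per sequence with two key-value heads, versus about 78 GiB with 32, a 16-fold reduction.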
GLM-4-9B-Chat supports a context window of up to 128,000 tokens, which lets it stay coherent over long documents and extended conversation histories. Beyond standard text generation, the model supports tool use, including web browsing, Python code execution, and custom function calling. These features let the model interact with external environments to solve mathematical problems and retrieve information in real time, making it suitable for AI assistants and agentic systems.
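Custom function calling generally means the application, not the model, executes the requested function and returns the result to the conversation. The sketch below shows one way the application side could look. It is a generic illustration only: the JSON shape (`name` and `arguments` keys), the `tool` decorator, and the `dispatch` helper are all hypothetical and do not reproduce GLM-4's actual tool-call format.

```python
import json
from typing import Any, Callable, Dict

# Registry of functions the application exposes to the model (hypothetical).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    return a + b


def dispatch(model_output: str) -> str:
    """Run the tool named in a JSON tool call; return a JSON result string."""
    try:
        call = json.loads(model_output)
        name, args = call["name"], call.get("arguments", {})
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        # Plain text or malformed JSON: treat as an ordinary reply.
        return json.dumps({"error": "not a tool call"})
    fn = TOOLS.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps({"result": fn(**args)})
    except TypeError as exc:
        # Wrong or missing arguments are reported back instead of raising.
        return json.dumps({"error": str(exc)})


print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
```

Returning errors as JSON rather than raising lets the application pass the failure back to the model as a normal tool result, so the model can retry with corrected arguments.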
General Language Models from Z.ai
No evaluation benchmarks for GLM-4-9B-Chat available.
Overall Rank
-
Coding Rank
-
Total Score
B
68 / 100
©2025 ApX Machine Learning