Parameters
-
Context Length
1M
Modality
Multimodal
Architecture
Dense
License
Proprietary
Release Date
1 Jun 2026
Knowledge Cutoff
-
Attention
Attention Structure
Grouped-Query Attention
Attention Heads
64
Key-Value Heads
4
Attention Head Dimension
128
Position Embedding
Rotary Position Embedding (RoPE)
RoPE Theta
5,000,000
Sliding Window Attention
No
Sliding Window Size
-
Normalization
RMS Normalization
Activation Function
SwiGLU
Dimensions
Hidden Dimension Size
6,144
Number of Layers
60
FFN Intermediate Size (Dense)
12,288
Multi-Token Prediction Heads
1
Tokenizer
Vocabulary Size
200,064
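The attention figures listed above (60 layers, 4 key-value heads, head dimension 128) are enough for a rough estimate of key-value cache size. The sketch below is a back-of-the-envelope calculation, not the vendor's method. It assumes a conventional cache that stores one key and one value vector per KV head, per layer, per token, at 2 bytes per element (FP16/BF16), and it reads "1M" as 1,000,000 tokens. The function name and both of those choices are assumptions for illustration, and a sparse attention scheme may store or read the cache differently.

```python
def kv_cache_bytes_per_token(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    bytes_per_element: int = 2,  # assumption: FP16/BF16 cache entries
) -> int:
    """Bytes of KV cache for a single token across all layers."""
    # Factor of 2: one key vector and one value vector per KV head.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_element


# Values taken from the specification list.
NUM_LAYERS = 60
NUM_KV_HEADS = 4
HEAD_DIM = 128

# Assumption: "1M" context is read as 1,000,000 tokens.
CONTEXT_TOKENS = 1_000_000

per_token = kv_cache_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM)
total_gb = per_token * CONTEXT_TOKENS / 1e9

print(f"KV cache per token: {per_token} bytes")
print(f"KV cache at full context: {total_gb:.2f} GB")
```

Under these assumptions the cache costs 122,880 bytes per token, or roughly 123 GB for a full one-million-token context. Caching all 64 attention heads instead of 4 key-value heads would multiply that figure by 16.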
MiniMax's flagship multimodal model, released June 1, 2026. It is built on the MiniMax Sparse Attention (MSA) architecture, which replaces traditional full attention with a KV-block selection pattern and reduces compute cost to 1/20th of the previous generation. The model is optimized for long-horizon agentic workflows, complex software engineering, and video understanding. It has a 1M-token context window, accepts text, image, and video inputs, and is priced at $0.30 per million input tokens and $1.20 per million output tokens.
MiniMax's flagship M3 model family, released June 1, 2026, is powered by MiniMax Sparse Attention (MSA) architecture, offering 1M context capabilities at exceptionally low compute cost and optimized for long-horizon agentic workflows.
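The listed prices ($0.30 per million input tokens, $1.20 per million output tokens) translate directly into a per-request cost estimate. The following is a minimal sketch of that arithmetic. The function name and the example token counts are hypothetical, and it assumes flat per-token pricing with no caching discounts, tiers, or separate rates for image and video inputs.

```python
# Prices from the model description, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 0.30
OUTPUT_PRICE_PER_MILLION = 1.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under flat per-token pricing."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: a 200,000-token prompt with a 10,000-token reply.
cost = estimate_cost_usd(200_000, 10_000)
print(f"Estimated cost: ${cost:.4f}")
```

For the hypothetical request above, the input side contributes $0.06 and the output side $0.012. Output tokens cost four times as much as input tokens, but a long prompt can still account for most of the bill.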
Rank
#14
| Benchmark | Score | Rank |
|---|---|---|
| Web Development: WebDev Arena | 1521 | ⭐ 10 |
| General Text: Text Arena | 1451 | 25 |
Overall Rank
#14
Coding Rank
#23
©2025 ApX Machine Learning