Parameters: 8B
Context Length: 128K
Modality: Multimodal
Architecture: Dense
License: Apache 2.0
Release Date: 2 Apr 2026
Knowledge Cutoff: -

Attention
Attention Structure: Grouped-Query Attention
Attention Heads: 8
Key-Value Heads: 2
Attention Head Dimension: 256
Position Embedding: RoPE
RoPE Theta: 10,000
Sliding Window Attention: Yes
Sliding Window Size: 512
Normalization: RMS Normalization
Activation Function: GELU

Dimensions
Hidden Dimension Size: 10,240
Number of Layers: 42
FFN Intermediate Size (Dense): 10,240
Multi-Token Prediction Heads: -

Tokenizer
Vocabulary Size: 262,144
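The attention figures listed above (42 layers, 2 key-value heads, head dimension 256, sliding window 512) are enough for a rough key-value cache estimate. With grouped-query attention, the 8 query heads share 2 key-value heads, so only the 2 key-value heads contribute to the cache. The sketch below is an approximation under stated assumptions, not the page's calculator: `AttentionSpec`, `kv_cache_bytes`, and the `sliding_layers` parameter are hypothetical names, 16-bit cache values are assumed, and the page does not say how many layers use the sliding window, so that count is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionSpec:
    """Attention figures copied from the spec listing above."""
    num_layers: int = 42
    num_kv_heads: int = 2
    head_dim: int = 256
    sliding_window: int = 512


def kv_cache_bytes(spec: AttentionSpec,
                   context_tokens: int,
                   bytes_per_value: int = 2,
                   sliding_layers: int = 0) -> int:
    """Estimate KV cache size in bytes for one sequence.

    bytes_per_value=2 assumes a 16-bit cache (an assumption).
    sliding_layers is hypothetical: the number of layers that cache at most
    `sliding_window` tokens. The default 0 treats every layer as global
    attention, which gives an upper bound.
    """
    if not 0 <= sliding_layers <= spec.num_layers:
        raise ValueError("sliding_layers must be between 0 and num_layers")
    # One key vector and one value vector per KV head, per token, per layer.
    per_token_per_layer = 2 * spec.num_kv_heads * spec.head_dim * bytes_per_value
    global_layers = spec.num_layers - sliding_layers
    sliding_tokens = min(context_tokens, spec.sliding_window)
    cached_positions = (global_layers * context_tokens
                        + sliding_layers * sliding_tokens)
    return per_token_per_layer * cached_positions


if __name__ == "__main__":
    spec = AttentionSpec()
    for tokens in (1_024, 131_072):  # 131,072 assumes "128K" means 128 * 1024
        mib = kv_cache_bytes(spec, tokens) / 2**20
        print(f"{tokens:>7} tokens, all layers global: {mib:,.0f} MiB")
```

Under these assumptions the upper bound is 84 MiB at 1,024 tokens and about 10.5 GiB at 131,072 tokens. Any layers that actually use the 512-token window would lower the long-context figure substantially.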
Gemma 4 E4B is an edge-optimized model with 4.5B effective parameters (8B with Per-Layer Embeddings), intended for mobile and edge deployments. It supports multimodal input (text, image, audio) with a 128K context window and delivers improved performance over E2B while maintaining efficient on-device execution. It also features a thinking mode and native function calling.
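The two parameter counts quoted above (4.5B effective, 8B with Per-Layer Embeddings) translate into weight storage by simple arithmetic: parameters times bits per weight, divided by 8. The following is a minimal sketch of that calculation; `weight_bytes` and `memory_table` are hypothetical helper names, the bit widths are illustrative choices, and the result ignores quantization metadata (scales, zero points), activations, and the KV cache.

```python
def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Raw storage for the weights alone, in bytes."""
    return num_params * bits_per_weight // 8


# Parameter counts as quoted in the description above.
PARAM_COUNTS = {
    "effective (4.5B)": 4_500_000_000,
    "with PLE (8B)": 8_000_000_000,
}

# Illustrative bit widths (assumed, not taken from the page).
BIT_WIDTHS = (16, 8, 4)


def memory_table() -> dict:
    """Map (parameter count label, bit width) to weight size in decimal GB."""
    return {
        (label, bits): weight_bytes(count, bits) / 1e9
        for label, count in PARAM_COUNTS.items()
        for bits in BIT_WIDTHS
    }


if __name__ == "__main__":
    for (label, bits), gigabytes in memory_table().items():
        print(f"{label:<18} {bits:>2}-bit: {gigabytes:.2f} GB")
```

Which of the two counts matters for a given device depends on whether the Per-Layer Embedding weights are kept in accelerator memory, which the page does not specify.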
Gemma 4 is Google DeepMind's most advanced open model family, built from Gemini 3 research and technology. The family includes both Dense and Mixture-of-Experts (MoE) architectures, and its multimodal models handle text, images, and audio (on smaller variants), with context windows up to 256K tokens. Designed for frontier-level performance across reasoning, coding, and agentic workflows, Gemma 4 delivers unprecedented intelligence-per-parameter from mobile devices to enterprise servers. It is released under the Apache 2.0 license.
No evaluation benchmarks are available for Gemma 4 E4B.
Overall Rank: -
Coding Rank: -
Total Score: B, 68 / 100
©2025 ApX Machine Learning