URL: https://huggingface.co/google/gemma-4-26B-A4B-it/discussions/43


[Showcase] Running Gemma-4-26B-A4B-it on 8GB RAM Smartphone

#43
by InfiniteVoid - opened

Day 3 of building a custom Vulkan external-MoE inference path on top of llama.cpp/GGUF.

Gemma 4 26B A4B Q4_0 running locally on a phone.

Phone: Poco X4 GT / Xiaomi 22041216G / xaga, MT6895, Android 14, 8 GB RAM.
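For readers wondering why an external-MoE path is needed at all, here is a rough back-of-the-envelope sketch of the weight footprint. It assumes the "26B" and "A4B" in the model name mean roughly 26 billion total and 4 billion active parameters, and it pretends every tensor is stored as Q4_0 (18 bytes per block of 32 weights in GGUF). Real GGUF files mix quantization types, so treat the numbers as an estimate, not a measurement from this run.

```python
# Q4_0 in GGUF packs 32 weights into an 18-byte block
# (one fp16 scale + 32 four-bit values), i.e. 4.5 bits per weight.
Q4_0_BLOCK_WEIGHTS = 32
Q4_0_BLOCK_BYTES = 18


def q4_0_size_gb(n_params: float) -> float:
    """Approximate size in GB (10^9 bytes) if every weight were Q4_0."""
    return n_params / Q4_0_BLOCK_WEIGHTS * Q4_0_BLOCK_BYTES / 1e9


TOTAL_PARAMS = 26e9   # assumption: read off the "26B" in the model name
ACTIVE_PARAMS = 4e9   # assumption: read off the "A4B" (active per token)
PHONE_RAM_GB = 8.0    # from the post

total_gb = q4_0_size_gb(TOTAL_PARAMS)
active_gb = q4_0_size_gb(ACTIVE_PARAMS)

print(f"all weights at Q4_0:    ~{total_gb:.1f} GB")
print(f"active weights at Q4_0: ~{active_gb:.1f} GB")
print(f"fits in {PHONE_RAM_GB:.0f} GB RAM: {total_gb < PHONE_RAM_GB}")
```

Under these assumptions the full weight set is well above the phone's RAM while the per-token active subset is not, which is consistent with keeping most expert weights outside RAM.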

Specs:

context: 512
thinking: on, budget 16
prompt: 0.81 tok/s
generation: 0.38 tok/s
test run: 412 generated tokens
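To put the throughput numbers above in wall-clock terms, a small sketch using only the figures from the post. The prompt length is not stated, so only per-token latency is shown for the prompt phase.

```python
GEN_TOK_PER_S = 0.38      # generation speed from the post
PROMPT_TOK_PER_S = 0.81   # prompt processing speed from the post
GENERATED_TOKENS = 412    # length of the test run from the post

# Total time spent generating the test run.
gen_seconds = GENERATED_TOKENS / GEN_TOK_PER_S

# Average latency per token in each phase.
gen_s_per_tok = 1.0 / GEN_TOK_PER_S
prompt_s_per_tok = 1.0 / PROMPT_TOK_PER_S

print(f"generation: {gen_seconds:.0f} s (~{gen_seconds / 60:.1f} min)")
print(f"per generated token: {gen_s_per_tok:.2f} s")
print(f"per prompt token:    {prompt_s_per_tok:.2f} s")
```

So the 412-token test run corresponds to roughly 18 minutes of generation.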

Fully local, no cloud, running through llama.cpp.

[Image: phone]
