![]() |
VOOZH | about |
Apr. 3, 2026 / Featured
The new Gemma 4 models from Google DeepMind have landed, and for local LLM users this is one of the more practical releases in a while. The lineup gives us two interesting mid-size targets: a 26B MoE model (A4B) and a 31B dense model. Both support up to 256K context, tool calling, and personal agent-style...
Nov. 3, 2025 / Hardware Insights
When one of YouTube’s biggest creators decides to build a personal AI supercomputer, the local LLM scene takes notice. PewDiePie’s journey into AI hardware has produced a multi-GPU, 424GB VRAM workstation that many enthusiasts dream of. While his budget is far beyond the average builder, his component choices and setup offer a valuable blueprint for...
Oct. 17, 2025 / LLM Benchmarks
I tested the RTX 4090 with five quantized models to measure real-world inference performance for local LLM workloads. This is the second article in my GPU benchmark series, following my recent RTX 5090 tests. I ran these benchmarks to provide concrete performance data across different model sizes and context lengths using llama.cpp. Testing Environment My...