I don't know about you, but the words "mini PC" don't usually conjure up something with enough power to beat desktop workstations at AI or productivity tasks. Nor do they make me think of something that can fit almost any locally hosted large language model (LLM) you could care to run, in VRAM, no less.
At least, it didn't, until I got my hands on a stupendously stupidly speedy shoebox with AMD's top-of-the-line Strix Halo APU inside. I've been trying to force it to its knees using a succession of larger LLMs, and it hasn't blinked once. What's more, it beats every desktop GPU that I have on hand, and every consumer-level one that I don't yet own. It's glorious overkill for most desktop uses, but it comes into its own while running local AI, and I mostly plan on using it for that.
How overpowered is "over" anyway?
AMD's Strix Halo Ryzen AI Max+ 395 APU is one spicy chip
Now, there are about a dozen different mini PCs and tablets that use a Strix Halo APU. This model from Bosgame has the top-end Ryzen AI Max+ 395 and is offered with 96GB or 128GB of LPDDR5X-8000 memory. Of course, I opted for the 128GB version, because the RAM is soldered on and can't be upgraded afterward. Yes, you read that correctly: this small PC has more RAM than most desktops, and because it's an APU, that RAM is unified and can be used as system memory or VRAM.
And that's the crucial point for running local AI models. They love VRAM, lots and lots of it, and run much faster when the entire model can be kept in memory. Models are partially categorized by how many billions of parameters they have, with 70B models typically needing at least two RTX 3090s to run comfortably. Those are graphics cards that cost $1,500 each at launch, and they're being outpaced by a mini PC. Sure, it's a mini PC that costs slightly more than one of those GPUs, but it can do plenty more than just run AI models, and it doesn't need any other hardware except a keyboard, mouse, and display.
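To put rough numbers on why parameter count drives memory needs, here is a minimal Python sketch. It assumes the weights take about (parameters × bits per weight ÷ 8) bytes, and it pads that with a flat 20% allowance for context cache and runtime buffers. Both the helper names and the 20% figure are illustrative assumptions, not measurements from this machine, and real usage varies with the runtime, quantization format, and context length.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    # 1 billion parameters at 8 bits each is roughly 1 GB.
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         memory_gb: float, overhead: float = 1.2) -> bool:
    """Rough check: do the weights plus an assumed overhead fit in memory_gb?

    The 1.2 overhead factor is a placeholder assumption for context cache
    and runtime buffers, not a measured value.
    """
    needed = weight_footprint_gb(params_billions, bits_per_weight) * overhead
    return needed <= memory_gb


if __name__ == "__main__":
    # Memory budgets: one RTX 3090 (24GB), two of them (48GB), and the
    # 128GB total from the article (the share usable by the GPU is lower).
    for label, budget in [("1x RTX 3090", 24), ("2x RTX 3090", 48), ("128GB unified", 128)]:
        for bits in (4, 8, 16):
            size = weight_footprint_gb(70, bits)
            print(f"70B @ {bits}-bit (~{size:.0f} GB weights) on {label}: fits={fits(70, bits, budget)}")
```

Under these assumptions, a 70B model quantized to 4 bits overflows a single 24GB card but fits across two, which lines up with the two-3090 rule of thumb above.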
Bosgame M5 AI Mini Desktop
- RAM: 128GB LPDDR5X-8000
- CPU Speed: Base 3GHz, Boost up to 5.1GHz
- Brand: Bosgame
- CPU: AMD Ryzen AI Max+ 395, 16 cores, 32 threads
- GPU: Integrated Radeon 8060S with 40 RDNA 3.5 GPU cores
- Connectivity: Wi-Fi 7, Bluetooth 5.2
- Ports: 2 x USB4 Type-C, 3 x USB 3.2 Gen2, 2 x USB 2.0, 1 x HDMI 2.1, 1 x DisplayPort 1.4, 2 x 2.5Gbps LAN, 1 x microSDXC card reader, 2 x 3.5mm audio
- Storage: 2 x M.2 slots
Most desktops run out of VRAM before this mini PC will
128GB of unified RAM means lots of space for LLMs
My desktop PC currently has 16GB of VRAM thanks to an Nvidia discrete graphics card, and it chokes when I try to run any LLM with more than 15B parameters. In truth, even if I had a motherboard that could support every GPU I have here, I'd be hard-pressed to reach 128GB of VRAM in total. The setup would be janky with a capital J, and it would probably need several PSUs chained together to supply enough PCIe power.
But the Strix Halo? It's happily chewing on OpenAI's gpt-oss 120B model, which has a downloaded size of 59.03GB according to LM Studio. It's not even breaking a sweat at 21.48 tokens per second. That's 15 times the token generation speed I'd get if I ran the model on the Ryzen 9 7900X CPU in my desktop. Even Mistral Large Instruct doesn't faze it, and that's 73.22GB in size.
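For a feel of what that gap means in practice, here is a small Python sketch of the arithmetic. The 21.48 tokens per second and the 15x ratio come from the figures above; the CPU rate is only the value implied by that ratio (about 1.43 tokens per second), and the 500-token reply length is a hypothetical chosen for illustration.

```python
def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time to generate a given number of tokens at a steady rate."""
    return tokens / tokens_per_second


apu_tps = 21.48               # measured on the Strix Halo, per the article
speedup = 15                  # APU vs. desktop CPU, per the article
cpu_tps = apu_tps / speedup   # implied CPU rate, not a separate measurement

reply_tokens = 500            # hypothetical reply length for illustration

apu_seconds = seconds_for(reply_tokens, apu_tps)
cpu_seconds = seconds_for(reply_tokens, cpu_tps)

print(f"Implied CPU rate: {cpu_tps:.2f} tokens/s")
print(f"{reply_tokens}-token reply on the APU: {apu_seconds:.0f} s")
print(f"{reply_tokens}-token reply on the CPU: {cpu_seconds / 60:.1f} min")
```

In other words, a reply that takes well under half a minute on the mini PC would take several minutes on the desktop CPU, assuming both rates hold steady for the whole generation.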
At no point did Strix Halo even look like it was under stress, and the cooling fan in the mini PC stayed at barely noticeable levels. I can't overstate how much simpler this makes everything when I don't have to look for workarounds or tweaks every time I want to run models for analysis or training. The next step is to wipe Windows and install Proxmox so I can load up on Linux VMs and containers, but I'm going to enjoy some gaming first.
But that's not all you can do on it
Strix Halo is a beast at everything
Strix Halo gives around the same graphical power as a desktop RTX 4070, which is a ton for this form factor. What it doesn't give is Nvidia's CUDA or Tensor cores, and ray tracing is best kept off, but that's about all you have to worry about. Load all the shaders you want in Call of Duty, add those 4K texture packs to everything, and generally go about your day. With 128GB of very fast unified memory, worrying about system constraints is a thing of the past. Sure, some systems might be better at specific tasks like fully ray-traced gaming, and that's why I've got a gaming PC with an Nvidia GPU inside. For everything else? This mini PC is superb.
I never would have thought a mini PC would outperform my gaming desktop
I'm genuinely blown away by the power of Strix Halo, and the rest of the package that Bosgame has designed around it. This is a true desktop-killer, and it's incredibly portable. I literally don't have a single system that can outperform this box for AI workloads, and it's no slouch for gaming, productivity, or anything else.
