AMD Ryzen AI Max+ (Strix Halo) Gets Two New SKUs for Local LLM Systems
AMD has expanded its Ryzen AI Max+ Strix Halo family with two new processors, the Ryzen AI Max+ 392 and Ryzen AI Max+ 388. While these are officially new models, from a local LLM inference perspective they are best understood as cost-optimized variants of the existing Max+ 395 rather than a new performance tier.
Both new chips keep the same Strix Halo package. This means the same RDNA 3.5 integrated GPU with 40 compute units, the same 256-bit LPDDR5X memory interface, and support for up to 8000 MT/s memory. At 8000 MT/s across the full 256-bit bus (32 bytes per transfer), theoretical peak memory bandwidth works out to 256 GB/s, identical to the flagship Max+ 395.
For local LLM workloads, this matters far more than CPU core count.
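The 256 GB/s figure follows directly from the bus width and the transfer rate. A minimal sketch of that arithmetic is below; the function name is illustrative, and the result is a theoretical peak, not a measured number.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8            # 256-bit bus -> 32 bytes
    bytes_per_second = bytes_per_transfer * transfer_rate_mt_s * 1e6
    return bytes_per_second / 1e9


# Strix Halo: 256-bit LPDDR5X at 8000 MT/s
print(peak_bandwidth_gb_s(256, 8000))  # 256.0
```

Because the bus width and memory speed are the same on the 388, 392, and 395, the same calculation applies to all three.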
Inference Performance Should Match the Max+ 395
For quantized LLM inference, especially 4-bit models, performance on Strix Halo is dominated by memory bandwidth and available unified memory capacity. Since the 392 and 388 share the same GPU, same bus width, and same memory speed as the 395, inference throughput should be effectively the same across all three models when running GPU-accelerated backends.
In practical terms, a system running a 70B q4 model, or a larger MoE setup that still fits within unified memory, should see no meaningful drop in tokens per second compared to a Max+ 395 system, assuming similar power limits.
This makes both new SKUs essentially a Max+ 395 with a less powerful CPU attached.
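To see why bandwidth sets the pace, consider a rough ceiling on decode speed. Generating each token requires reading roughly all active weights from memory once, so tokens per second cannot exceed bandwidth divided by weight size. The sketch below rests on that simplification. The function names are illustrative, it assumes exactly 4 bits per weight (real q4 formats usually store somewhat more), and it ignores KV cache traffic and compute limits, so real throughput will land below this bound.

```python
def weight_size_gb(active_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights read per token, in GB."""
    return active_params_billion * bits_per_weight / 8


def decode_ceiling_tokens_per_s(bandwidth_gb_s: float,
                                active_params_billion: float,
                                bits_per_weight: float) -> float:
    """Upper bound on decode speed if every active weight is read once per token."""
    return bandwidth_gb_s / weight_size_gb(active_params_billion, bits_per_weight)


# Dense 70B model at an assumed 4 bits per weight on a 256 GB/s system
ceiling = decode_ceiling_tokens_per_s(256, 70, 4)
print(f"{ceiling:.1f} tokens/s ceiling")  # about 7.3
```

For MoE models, only the active parameters per token count toward the reads, which is why they can decode faster than a dense model of the same total size. None of the inputs depend on CPU core count, so the bound is identical for the 388, 392, and 395.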
CPU Differences and Why They Matter Less for LLM Users
The Ryzen AI Max+ 392 drops to 12 cores and 24 threads using two CCDs, while the 388 goes further down to 8 cores and 16 threads on a single CCD. Boost clocks still reach up to 5.0 GHz, and the NPU remains rated at 50 TOPS.
For local LLM inference, the CPU mostly handles orchestration, data prep, and occasional CPU-side layers. Even the 8-core 388 is already well beyond what is required for these tasks. Unless you are compiling kernels, running heavy parallel preprocessing, or mixing inference with CPU-bound workloads, the reduced core count should not be a bottleneck.
Unified Memory Is Still the Real Selling Point
Like the rest of the Max+ lineup, the 392 and 388 support up to 128 GB of unified LPDDR5X memory. This remains the defining advantage over discrete GPUs at similar power levels and price ranges.
A single-node system with 128 GB of unified memory and 256 GB/s bandwidth can comfortably host models that would otherwise require multi-GPU setups with NVLink or PCIe juggling. For price-conscious local LLM enthusiasts, this simplifies builds, reduces power draw, and cuts platform complexity.
There is strong interest in seeing future OEM designs with 96 GB and especially 128 GB configurations become more common, along with clearer pricing. Higher-capacity SKUs are what make Strix Halo compelling for serious local inference.
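A quick way to judge whether a given memory configuration is enough is to compare the quantized weight size against capacity, leaving some headroom. The sketch below is a back-of-the-envelope check. The function names are illustrative, and the 16 GB `reserved_gb` default is an assumed allowance for the OS, KV cache, and runtime buffers, not a measured requirement.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate quantized weight size in GB."""
    return params_billion * bits_per_weight / 8


def fits_in_unified_memory(params_billion: float,
                           bits_per_weight: float,
                           capacity_gb: float,
                           reserved_gb: float = 16.0) -> bool:
    """True if weights plus an assumed headroom fit within the memory capacity."""
    return weights_gb(params_billion, bits_per_weight) + reserved_gb <= capacity_gb


# Illustrative parameter counts at an assumed 4 bits per weight
for params in (70, 120, 200):
    for capacity in (96, 128):
        ok = fits_in_unified_memory(params, 4, capacity)
        print(f"{params}B on {capacity} GB: {'fits' if ok else 'does not fit'}")
```

Under these assumptions a 200B-parameter model at 4 bits fits on a 128 GB system but not on a 96 GB one, which illustrates why the higher-capacity configurations matter.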
Pricing Expectations and Why These SKUs Matter
Currently, the most affordable Ryzen AI Max+ 395 systems with 128 GB of memory tend to land between $2000 and $2500, depending on OEM and form factor. That price is workable but still high for many builders focused on performance per dollar.
The expectation is that systems built around the Ryzen AI Max+ 392 and 388 will come in cheaper, since AMD is clearly trading CPU silicon for cost savings while keeping the expensive parts of the package unchanged. If OEMs pass that reduction on, these new chips could become the most attractive entry point into high-bandwidth, large-memory local LLM machines.
Bottom Line for Local LLM Enthusiasts
From an inference standpoint, the Ryzen AI Max+ 392 and 388 should behave like the Max+ 395. Same GPU, same bandwidth, same unified memory limits. The only real difference is CPU headroom, which most local LLM users will not fully utilize anyway.
If pricing lands where it should, these new Strix Halo SKUs have the potential to offer better performance per dollar than the current flagship systems, without giving up what actually matters for running large quantized models locally.
