AMD Ryzen AI Max+ "Strix Halo" Performance With ROCm 7.0
AMD has been promoting its Llama.cpp support more lately, and given the software's popularity, I was curious to see how it performs with ROCm 7.0 HIP on this Strix Halo Framework Desktop.
For the Llama.cpp testing, I ran benchmarks using just the 16 CPU cores with the BLAS back-end and then tested both the Vulkan and ROCm HIP back-ends on the Radeon 8060S Graphics. For text generation with Qwen 3 8B, Vulkan delivered slightly better performance than the ROCm HIP back-end on the AMD Ryzen AI Max+ 395.
For prompt processing (PP512) there was a huge gain from using Vulkan on Strix Halo compared to the ROCm HIP back-end.
Even for PP1024 and PP2048, the Vulkan back-end was showing much better performance with Llama.cpp than going the HIP route.
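For anyone wanting to try a similar prompt processing comparison, here is a minimal Python sketch that assembles llama-bench command lines for the PP512, PP1024, and PP2048 sizes. It is not necessarily how the numbers in this article were gathered. It assumes Llama.cpp was built twice, once per GPU back-end, since the back-end is chosen at build time; the build directories and the model file name are hypothetical placeholders.

```python
import shlex

# Hypothetical paths: one llama-bench binary per back-end build.
BACKEND_BINARIES = {
    "vulkan": "build-vulkan/bin/llama-bench",
    "hip": "build-hip/bin/llama-bench",
}


def build_bench_command(binary, model, prompt_sizes, gpu_layers=99):
    """Return an argv list for a prompt-processing-only llama-bench run."""
    sizes = ",".join(str(n) for n in prompt_sizes)
    return [
        binary,
        "-m", model,               # GGUF model file
        "-p", sizes,               # prompt sizes, comma-separated
        "-n", "0",                 # skip the text generation test
        "-ngl", str(gpu_layers),   # offload layers to the GPU
        "-o", "json",              # machine-readable output
    ]


def commands_for_backends(model, prompt_sizes):
    """Build one command per back-end for the same model and prompt sizes."""
    return {
        name: build_bench_command(path, model, prompt_sizes)
        for name, path in BACKEND_BINARIES.items()
    }


if __name__ == "__main__":
    # The model path is a placeholder, not the file used in this article.
    commands = commands_for_backends("models/qwen3-8b.gguf", [512, 1024, 2048])
    for name, argv in commands.items():
        print(name, shlex.join(argv))
```

The script only prints the commands; each one could be passed to subprocess.run and the JSON output compared across back-ends.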
With the new GPT-OSS 20B model on Strix Halo, Vulkan with the RADV driver again came out ahead of ROCm 7.0 HIP on the Radeon 8060S Graphics of the Ryzen AI Max+ 395.
For prompt processing with GPT-OSS 20B, the Vulkan performance continued to show a huge advantage over ROCm HIP.
