AMD Ryzen AI Max+ "Strix Halo" Performance With ROCm 7.0

Written by Michael Larabel in Display Drivers on 22 September 2025 at 10:30 AM EDT. Page 3 of 5. 34 Comments.

AMD has been promoting its Llama.cpp support more heavily of late, and given the software's popularity I was curious to see how well it works with ROCm 7.0 HIP on this Strix Halo Framework Desktop.

Llama.cpp benchmark with settings of Backend Comparison (Model: Qwen3-8B-Q8_0, Test: Text Generation 128). Vulkan was the fastest.

For Llama.cpp I ran the benchmarks three ways: on just the 16 CPU cores with the BLAS back-end, and on the Radeon 8060S Graphics with both the Vulkan and ROCm HIP back-ends. For text generation with Qwen 3 8B, Vulkan delivered slightly better performance than the ROCm HIP back-end on the AMD Ryzen AI Max+ 395.
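
For readers who want to try a similar comparison, here is a minimal sketch of how the same test sizes (PP512, PP1024, PP2048 and TG128) could be requested from Llama.cpp's own `llama-bench` tool. This is not necessarily the invocation used for the results in this article. The helper name `llama_bench_command`, the model path, and the choice of 99 GPU layers are illustrative assumptions. In Llama.cpp the back-end is normally chosen when building (for example with the `GGML_VULKAN`, `GGML_HIP` or `GGML_BLAS` CMake options), so the same command would be run once against each build.

```python
import shlex


def llama_bench_command(model_path, prompt_sizes=(512, 1024, 2048),
                        gen_tokens=128, gpu_layers=None,
                        binary="llama-bench"):
    """Build an argument list for llama-bench (hypothetical helper).

    model_path   -- path to a GGUF model file (assumed to exist locally)
    prompt_sizes -- prompt-processing sizes, passed as a comma-separated -p list
    gen_tokens   -- number of tokens for the text-generation test (-n)
    gpu_layers   -- layers to offload with -ngl; None leaves the default
    """
    cmd = [
        binary,
        "-m", model_path,
        "-p", ",".join(str(n) for n in prompt_sizes),
        "-n", str(gen_tokens),
        "-o", "json",  # machine-readable output for later comparison
    ]
    if gpu_layers is not None:
        cmd += ["-ngl", str(gpu_layers)]
    return cmd


if __name__ == "__main__":
    # The file name below is a placeholder, not a path from the article.
    print(shlex.join(llama_bench_command("Qwen3-8B-Q8_0.gguf", gpu_layers=99)))
```

Returning a list rather than a single string means the result can be handed directly to `subprocess.run` without shell quoting concerns.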

Llama.cpp benchmark with settings of Backend Comparison (Model: Qwen3-8B-Q8_0, Test: Prompt Processing 512). Vulkan was the fastest.

For prompt processing (PP512), Vulkan held a huge lead over the ROCm HIP back-end on Strix Halo.

Llama.cpp benchmark with settings of Backend Comparison (Model: Qwen3-8B-Q8_0, Test: Prompt Processing 1024). Vulkan was the fastest.

Llama.cpp benchmark with settings of Backend Comparison (Model: Qwen3-8B-Q8_0, Test: Prompt Processing 2048). Vulkan was the fastest.

At PP1024 and PP2048 as well, the Vulkan back-end showed much better Llama.cpp performance than the HIP route.

Llama.cpp benchmark with settings of Backend Comparison (Model: gpt-oss-20b-Q8_0, Test: Text Generation 128). Vulkan was the fastest.

With the new GPT-OSS 20B model on Strix Halo, Vulkan on the RADV driver again came out ahead of ROCm 7.0 HIP for text generation on the Radeon 8060S Graphics of the Ryzen AI Max+ 395.

Llama.cpp benchmark with settings of Backend Comparison (Model: gpt-oss-20b-Q8_0, Test: Prompt Processing 512). Vulkan was the fastest.

Llama.cpp benchmark with settings of Backend Comparison (Model: gpt-oss-20b-Q8_0, Test: Prompt Processing 1024). Vulkan was the fastest.

Llama.cpp benchmark with settings of Backend Comparison (Model: gpt-oss-20b-Q8_0, Test: Prompt Processing 2048). Vulkan was the fastest.

For prompt processing with GPT-OSS 20B, the Vulkan performance continued to show a huge advantage over ROCm HIP.
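
To put a number on how large such an advantage is on your own hardware, a small sketch like the following could be used to turn tokens-per-second readings from two back-ends into a per-test ratio. The function name `backend_ratio` and the dictionary layout (test name mapped to tokens per second) are assumptions for illustration, and no figures from the charts are embedded here.

```python
def backend_ratio(vulkan_tps, hip_tps):
    """Return the Vulkan / HIP throughput ratio for each shared test.

    Both arguments map a test name (e.g. "PP512", "TG128") to a
    tokens-per-second reading. Tests missing from either side, or with a
    non-positive HIP reading, are skipped rather than guessed at.
    """
    ratios = {}
    for test, vulkan_value in vulkan_tps.items():
        hip_value = hip_tps.get(test)
        if hip_value is None or hip_value <= 0:
            continue
        ratios[test] = vulkan_value / hip_value
    return ratios
```

A ratio above 1.0 means Vulkan was faster for that test, and a ratio of 2.0 would mean twice the throughput of HIP.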

URL: https://www.phoronix.com/review/amd-rocm-7-strix-halo/3
