AMD Radeon AI PRO R9700 Linux Performance For Single & Dual GPU Benchmarks

Written by Michael Larabel in Graphics Cards on 27 October 2025 at 09:00 AM EDT. Page 2 of 4. 24 Comments.

For this vLLM testing, the rocm/vllm-dev Docker container was used on the AMD side for the latest ROCm support, while for NVIDIA the latest vLLM 0.10 series release was used, with that support being upstream and ready to go straight from PyPI. AMD is working toward such out-of-the-box support for vLLM and other AI frameworks, but in the case of vLLM it hasn't yet crossed that milestone, with the vLLM Docker containers still being recommended.

vLLM benchmark with settings of Test: DeepSeek-R1-Distill-Qwen-7B Throughput. 2 x Radeon AI PRO R9700 was the fastest.

Kicking things off with the DeepSeek-R1-Distill-Qwen 7B model, the performance was very admirable for the Radeon AI PRO R9700. A single Radeon AI PRO R9700 was not too far behind the NVIDIA RTX 6000 Ada Generation, while the two-card configuration scaled nicely and provided a clear performance advantage over the RTX 6000 Ada Generation. The Radeon AI PRO R9700 was also substantially faster with this model than the prior-generation Radeon PRO W7900.


Assuming these Radeon AI PRO R9700 graphics cards retail for anywhere close to the $1299 USD list price, they should sell like hot cakes. The NVIDIA RTX 6000 Ada Generation continues selling for $5300+, while two Radeon AI PRO R9700 graphics cards can outperform it in models like DeepSeek R1 Distill Qwen 7B for around half the price. It would have been nice to see how the RTX PRO 6000 Blackwell performs, but alas I haven't received any RTX PRO Blackwell review samples yet. AMD would undoubtedly still lead on value there, though, when comparing current prices.
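The "around half the price" figure comes down to simple arithmetic. Below is a minimal sketch using only the two prices quoted above (the $1299 list price and the $5300+ street price); the variable names are illustrative, and actual retail pricing may differ.

```python
# Prices quoted in the article, in USD. The RTX 6000 Ada figure is the
# "$5300+" street price, so it should be read as a lower bound.
R9700_LIST_PRICE = 1299
RTX_6000_ADA_PRICE = 5300

# Cost of the dual-GPU Radeon configuration at list price.
dual_r9700_cost = 2 * R9700_LIST_PRICE

# Fraction of the NVIDIA card's price that the two Radeon cards cost.
cost_ratio = dual_r9700_cost / RTX_6000_ADA_PRICE

print(f"Two R9700 cards: ${dual_r9700_cost}")
print(f"Share of the RTX 6000 Ada price: {cost_ratio:.0%}")
```

At those prices, two cards come to $2598, or roughly 49% of the RTX 6000 Ada Generation price.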


The Radeon AI PRO R9700 outperformed the prior-generation Radeon PRO W7900 with vLLM by wide margins, even with its reduced memory bandwidth and lower AI accelerator count compared to that RDNA3 graphics card. On a performance-per-Watt basis for the single graphics cards, the Radeon AI PRO R9700 competed very well with the RTX 6000 Ada Generation graphics card.

vLLM benchmark with settings of Test: DeepSeek-R1-Distill-Qwen-7B Latency. 2 x Radeon AI PRO R9700 was the fastest.

For token latency with DeepSeek-R1-Distill-Qwen 7B, the dual Radeon AI PRO R9700 configuration delivered similar latency to the RTX 6000 Ada Generation graphics card, which on a performance-per-dollar basis still puts the Radeon AI PRO R9700 graphics cards well ahead of the NVIDIA comparison card.

Qwen 14B FP8 dynamic throughput

vLLM benchmark with settings of Test: DeepSeek-R1-Distill-Qwen-14B-FP8-dynamic Throughput. RTX 6000 Ada Gen was the fastest.

vLLM benchmark with settings of Test: DeepSeek-R1-Distill-Qwen-14B-FP8-dynamic Latency. RTX 6000 Ada Gen was the fastest.

For DeepSeek-R1-Distill-Qwen 14B FP8 dynamic, the dual Radeon AI PRO R9700 configuration was just behind the single RTX 6000 Ada Generation graphics card, but if the ~$1299 price holds, AMD still easily wins on value.
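To put a rough number on "winning on value" while trailing slightly in raw performance, the sketch below works only with relative performance, since the measured figures live in the graphs rather than in the text. It assumes the $1299 list price and the $5300 price cited earlier in the article. `perf_per_dollar_advantage` is a hypothetical helper, and the 0.95 input is an illustrative placeholder for "just behind", not a measured result.

```python
def perf_per_dollar_advantage(relative_perf: float,
                              amd_cost: float = 2 * 1299,
                              nvidia_cost: float = 5300) -> float:
    """Return AMD's performance-per-dollar as a multiple of NVIDIA's.

    relative_perf is the dual-R9700 result divided by the RTX 6000 Ada
    result (1.0 means equal performance). Costs default to the prices
    quoted in the article.
    """
    return relative_perf * nvidia_cost / amd_cost


# Break-even: the relative performance at which both options deliver
# the same performance per dollar.
break_even = (2 * 1299) / 5300

# Placeholder input: "just behind" modeled as 95% of the NVIDIA result.
advantage = perf_per_dollar_advantage(0.95)

print(f"Break-even relative performance: {break_even:.2f}")
print(f"Perf-per-dollar advantage at 0.95x: {advantage:.2f}x")
```

Under these price assumptions, the dual-card setup would have to drop below roughly half of the RTX 6000 Ada Generation's performance before it lost on performance per dollar.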


URL: https://www.phoronix.com/review/amd-radeon-ai-pro-r9700/2
