URL: https://www.phoronix.com/review/amd-epyc-9575f-ai-server/3

AMD EPYC 9575F CPUs For GPU/AI Servers Show Leading Performance In Benchmarks

Written by Michael Larabel in Processors on 11 September 2025 at 08:30 AM EDT. Page 3 of 4.

Beyond what AMD already explored in its public blog post on vLLM latency-constrained throughput, I was also curious to push these competing Intel Xeon and AMD EPYC servers through some more neutral, upstream tests. For that I simply relied on upstream vLLM 0.10 installed via pip and ran some basic benchmarks using "vllm bench" in different scenarios to evaluate latency as well as throughput with different models.
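For readers wanting to try something similar, below is a minimal Python sketch that only assembles a "vllm bench" command line rather than running it. The exact options used for this article are not listed here, so everything in it is an assumption: the helper name build_bench_command is hypothetical, the Hugging Face model ID is a guess at the QwQ 32B variant, and a tensor-parallel size of 8 is assumed simply because the servers have eight H100 GPUs.

```python
import shlex


def build_bench_command(mode, model, tensor_parallel_size=8):
    """Assemble an argv list for a 'vllm bench' run (hypothetical helper).

    Assumption: only the 'latency' and 'throughput' sub-commands are covered,
    and only the model and tensor-parallel options are set. The settings
    actually used for the article's results are not known.
    """
    if mode not in ("latency", "throughput"):
        raise ValueError(f"unsupported vllm bench mode: {mode!r}")
    return [
        "vllm", "bench", mode,
        "--model", model,
        "--tensor-parallel-size", str(tensor_parallel_size),
    ]


if __name__ == "__main__":
    # Assumed model ID; swap in whichever checkpoint is being evaluated.
    cmd = build_bench_command("latency", "Qwen/QwQ-32B")
    # Print the command for inspection; pass cmd to subprocess.run() to execute it.
    print(shlex.join(cmd))
```

Installing vLLM itself would be a matter of "pip install vllm" beforehand; the sketch deliberately leaves input/output lengths and batch sizes at whatever the installed vLLM version defaults to.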

With Qwen's QwQ 32B on vLLM, the AMD EPYC 9575F server achieved lower latency than the Intel Xeon Platinum server using the same eight NVIDIA H100 GPU configuration.

It is also worth pointing out that for latency at the highest percentiles, the Xeon Platinum server showed much higher run-to-run variance than the AMD EPYC 9575F Turin server. To stay below a 2.5% standard deviation between runs, the EPYC 9575F server needed only the typical three runs, while the Xeon server was going up to around 11 runs before bailing out.
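The run-count behavior described above can be illustrated with a small Python sketch: repeat a benchmark until the relative standard deviation of the results drops under a limit, or give up after a maximum number of runs. This is not the actual benchmarking harness logic, just an illustration under stated assumptions: the function name run_until_stable is hypothetical, the minimum of 3 runs and the 2.5% limit come from the text, and the cap of 11 runs is taken from the "around 11" figure.

```python
import statistics


def run_until_stable(run_once, min_runs=3, max_runs=11, max_rel_stddev=0.025):
    """Repeat run_once() until results are stable or max_runs is reached.

    Returns (results, relative_stddev, converged). Assumes run_once()
    returns a positive number such as a latency in milliseconds.
    """
    if min_runs < 2 or max_runs < min_runs:
        raise ValueError("need at least 2 runs and max_runs >= min_runs")
    results = []
    rel = float("inf")
    while len(results) < max_runs:
        results.append(run_once())
        if len(results) >= min_runs:
            # Sample standard deviation relative to the mean of all runs so far.
            rel = statistics.stdev(results) / statistics.fmean(results)
            if rel <= max_rel_stddev:
                return results, rel, True
    # Ran out of attempts without getting under the limit: bail out.
    return results, rel, False


if __name__ == "__main__":
    # Made-up numbers purely to exercise the loop, not measured data.
    steady = iter([10.0, 10.1, 9.9])
    print(run_until_stable(lambda: next(steady)))
    noisy = iter([10.0, 20.0] * 6)
    print(run_until_stable(lambda: next(noisy)))
```

With the steady made-up samples the loop stops after three runs, while the noisy ones exhaust all 11 attempts without converging, mirroring the difference in behavior between the two servers.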

When running Qwen 2.5 72B on the same eight NVIDIA H100 GPU servers, the AMD EPYC 9575F server continued achieving consistently lower latency with vLLM.

The AMD EPYC 9575F server delivered higher throughput as well in these upstream vLLM benchmarks.