AMD EPYC 9575F CPUs For GPU/AI Servers Show Leading Performance In Benchmarks

Written by Michael Larabel in Processors on 11 September 2025 at 08:30 AM EDT. Page 4 of 4. 10 Comments.

vLLM benchmark, Test: Hermes-3-Llama-3.2-8B Latency. EPYC 9575F 2P was the fastest.

For smaller models too, the AMD EPYC 9575F still showed an advantage over the Intel Xeon Platinum alternative at a similar price point.

vLLM benchmark, Test: deepseek-moe-16b-chat Latency. EPYC 9575F 2P was the fastest.

When running the DeepSeek MoE 16B chat mixture-of-experts model, the AMD EPYC 9575F latency advantage over the Intel Xeon Platinum H100 server was clear-cut.

Blender benchmark, Blend File: Classroom, Compute: NVIDIA CUDA. EPYC 9575F 2P was the fastest.

Blender benchmark, Blend File: Classroom, Compute: NVIDIA OptiX. EPYC 9575F 2P was the fastest.

Blender benchmark, Blend File: Pabellon Barcelona, Compute: NVIDIA OptiX. EPYC 9575F 2P was the fastest.

NAMD benchmark, Input: STMV with 1,066,628 Atoms. EPYC 9575F 2P was the fastest.

While everyone likes to talk about AI performance these days, the high-frequency AMD EPYC 9575F also proved advantageous for those evaluating host CPU options for GPU-accelerated servers running other workloads, such as CUDA/OptiX rendering on the H100 or the NAMD molecular dynamics software. Still, most of the time I had available to poke at these remote servers was spent looking at the vLLM performance.

The AMD EPYC 9575F, with its sixty-four Zen 5 cores, boost clocks of up to 5.0GHz, and twelve channels of DDR5-6000/DDR5-6400 memory, is a leading option for GPU/AI servers. Compared to the Intel Xeon Platinum 8592+ server with the same eight NVIDIA H100 GPU configuration, the AMD EPYC 9575F delivered consistently better performance as the host processor for these Supermicro AI servers. Particularly in latency-constrained inference serving, AMD EPYC Turin-HF performed exceptionally well against the dual-socket Intel Xeon Platinum server with the same 64-core / 128-thread count per processor. Yes, the Xeon Platinum 8592+ is based on Emerald Rapids rather than Granite Rapids, due to the still-limited availability of GNR servers. As mentioned, I am looking forward to revisiting this comparison once I have access to a suitable Granite Rapids server.
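To put the twelve memory channels in perspective, here is a rough back-of-the-envelope sketch of the theoretical peak memory bandwidth they imply. It assumes the standard 64-bit (8-byte) data width per DDR5 channel, which is not stated in this article, and the function name is purely illustrative. These are paper figures, not measured results from the servers tested.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_sec, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s (decimal units).

    Assumes each DDR5 channel moves 8 bytes (64 data bits) per transfer.
    """
    # MT/s * bytes gives MB/s; dividing by 1000 converts to GB/s.
    return channels * mega_transfers_per_sec * bytes_per_transfer / 1000


for speed in (6000, 6400):
    per_socket = peak_bandwidth_gbs(12, speed)
    print(f"DDR5-{speed}: {per_socket:.1f} GB/s per socket, "
          f"{2 * per_socket:.1f} GB/s across a 2P server")
```

Sustained bandwidth in practice will be lower than these theoretical peaks and depends on DIMM population and the workload's access pattern.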

As this round of testing shows, the latency-constrained AI performance showcased by AMD earlier this summer panned out and was completely reproducible in my testing. vLLM delivered higher throughput and lower latency with the AMD EPYC 9575F server processors than with the competing Intel Xeon CPUs. This shows the importance of host CPU selection for GPU/AI servers, and it is yet another area where the AMD EPYC 9005 series is delivering leading performance.

Thanks to AMD for providing gratis access to these two servers for exploring host CPU performance in AI/GPU workloads. Beyond that, those curious about AMD EPYC Turin performance in other areas can see my dozens of other articles for plenty of other EPYC 9005 CPU benchmarks, including of the EPYC 9575F.


URL: https://www.phoronix.com/review/amd-epyc-9575f-ai-server/4
