Compare
Compare local AI hardware with workload-aware output.
Operating mode: Balanced. Tuned for general local use; keeps the ranking neutral across personal and serving workflows.
AMD Instinct MI325X 256GB wins for coding in balanced mode
Based on model fit, speed, and quality across top recommendations.
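AMD Instinct MI325X 256GB (Winner)

| Grade | Model | Runtime | Quantization | Status | Memory used | Decode speed | Context |
|---|---|---|---|---|---|---|---|
| S | DeepSeek V4 Flash | llama.cpp | NVFP4 | Runs well | 185.8 GB / 256.0 GB | 94.4 tok/s | 872K |
| S | Devstral 2 123B Instruct | llama.cpp | Q4_K_M | Runs well | 106.9 GB / 256.0 GB | 63.5 tok/s | 256K |
| S | MiniMax M2.7 | llama.cpp | UD-IQ4_XS | Runs well | 170.6 GB / 256.0 GB | 101.4 tok/s | 205K |
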
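NVIDIA GB200 192GB

| Grade | Model | Runtime | Quantization | Status | Memory used | Decode speed | Context |
|---|---|---|---|---|---|---|---|
| S | Devstral 2 123B Instruct | llama.cpp | Q4_K_M | Runs well | 100.5 GB / 192.0 GB | 97.4 tok/s | 256K |
| S | Qwen 3.5 122B A10B | llama.cpp | Q4_K_M | Runs well | 97.0 GB / 192.0 GB | 270.2 tok/s | 131K |
| S | Mistral Small 4 119B | llama.cpp | Q4_K_M | Runs well | 98.1 GB / 192.0 GB | 292.9 tok/s | 256K |
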
Quick comparison
| Metric | AMD Instinct MI325X 256GB | NVIDIA GB200 192GB |
|---|---|---|
| Models that fit | 3 | 3 |
| Avg decode tok/s | 86.4 | 220.2 |
| Best grade score | 99 | 96 |
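
The "Avg decode tok/s" row appears to be the plain arithmetic mean of the three per-model decode speeds listed for each device. The page does not say so explicitly, so treat that as an assumption; it does reproduce both figures to one decimal place. A minimal Python sketch of that check, which also derives the peak share of device memory used; the helper names `avg_decode` and `peak_memory_share` are hypothetical, not part of the tool:

```python
from statistics import mean

# (model, decode tok/s, memory used in GB), copied from the per-device listings
MI325X = [
    ("DeepSeek V4 Flash", 94.4, 185.8),
    ("Devstral 2 123B Instruct", 63.5, 106.9),
    ("MiniMax M2.7", 101.4, 170.6),
]
GB200 = [
    ("Devstral 2 123B Instruct", 97.4, 100.5),
    ("Qwen 3.5 122B A10B", 270.2, 97.0),
    ("Mistral Small 4 119B", 292.9, 98.1),
]


def avg_decode(models):
    """Assumed definition: simple mean of decode tok/s, rounded to 0.1."""
    return round(mean(tok_s for _, tok_s, _ in models), 1)


def peak_memory_share(models, capacity_gb):
    """Largest listed memory footprint as a percentage of device capacity."""
    return round(max(used for _, _, used in models) / capacity_gb * 100, 1)


print(avg_decode(MI325X), avg_decode(GB200))  # 86.4 220.2
print(peak_memory_share(MI325X, 256.0), peak_memory_share(GB200, 192.0))  # 72.6 52.3
```

The GB200 leads on average decode speed while the MI325X still wins overall, which is consistent with the ranking also weighing model fit and quality rather than speed alone.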
