Compare
Compare local AI hardware with workload-aware output.
Operating mode: Balanced. Tuned for general local use; keeps the ranking neutral across personal and serving workflows.
NVIDIA B200 180GB wins for coding in balanced mode
Based on model fit, speed, and quality across top recommendations.
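NVIDIA B200 180GB (Winner)

| Model | Grade | Runtime | Quantization | Status | Memory (used / total) | Decode tok/s | Context |
|---|---|---|---|---|---|---|---|
| Devstral 2 123B Instruct | S | llama.cpp | Q4_K_M | Runs well | 99.3 GB / 180.0 GB | 97.4 | 256K |
| Qwen 3.5 122B A10B | S | llama.cpp | Q4_K_M | Runs well | 95.8 GB / 180.0 GB | 270.2 | 131K |
| Mistral Small 4 119B | S | llama.cpp | Q4_K_M | Runs well | 96.9 GB / 180.0 GB | 292.9 | 256K |
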
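H100 NVL 188GB

| Model | Grade | Runtime | Quantization | Status | Memory (used / total) | Decode tok/s | Context |
|---|---|---|---|---|---|---|---|
| Devstral 2 123B Instruct | S | llama.cpp | Q4_K_M | Runs well | 100.1 GB / 188.0 GB | 91.6 | 256K |
| Qwen 3.5 122B A10B | S | llama.cpp | Q4_K_M | Runs well | 96.6 GB / 188.0 GB | 254.0 | 131K |
| Mistral Small 4 119B | S | llama.cpp | Q4_K_M | Runs well | 97.7 GB / 188.0 GB | 275.4 | 256K |
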
Quick comparison
| Metric | NVIDIA B200 180GB | H100 NVL 188GB |
|---|---|---|
| Models that fit | 3 | 3 |
| Avg decode tok/s | 220.2 | 207.0 |
| Best grade score | 97 | 96 |
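
The "Avg decode tok/s" row appears to be the plain arithmetic mean of the three per-model decode rates listed for each GPU. The sketch below checks that reading; the aggregation method is an assumption (the page does not state how the average is computed), and the helper name `average_decode` is hypothetical.

```python
from statistics import mean

# Per-model decode rates (tok/s) copied from the results listed for each GPU.
decode_tok_s = {
    "NVIDIA B200 180GB": [97.4, 270.2, 292.9],
    "H100 NVL 188GB": [91.6, 254.0, 275.4],
}


def average_decode(rates):
    """Arithmetic mean rounded to one decimal place (assumed aggregation)."""
    return round(mean(rates), 1)


averages = {gpu: average_decode(rates) for gpu, rates in decode_tok_s.items()}

for gpu, avg in averages.items():
    print(f"{gpu}: {avg} tok/s")
```

Under this assumption the results match the table (220.2 and 207.0). Note that an unweighted mean is dominated by the two faster models, so the Devstral 2 123B Instruct rate, the slowest on both GPUs, has little visible effect on the summary figure.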
