Compare
Compare local AI hardware with workload-aware output.
H100 NVL 188GB wins for coding in balanced mode
Based on model fit, speed, and quality across top recommendations.
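H100 NVL 188GB (Winner)

| Model | Grade | Runtime | Quantization | Status | Memory used | Decode speed | Context |
|---|---|---|---|---|---|---|---|
| Devstral 2 123B Instruct | S | llama.cpp | q6-k | Runs well | 125.9 GB / 188.0 GB | 71.5 tok/s | 201K |
| Qwen 3.5 122B A10B | S | llama.cpp | q8-0 | Runs well | 152.7 GB / 188.0 GB | 159.3 tok/s | 131K |
| Mistral Small 4 119B | S | llama.cpp | q8-0 | Runs well | 152.4 GB / 188.0 GB | 172.7 tok/s | 122K |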
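The memory figures above can be read as a simple fit check: a model fits when its footprint is at or below the card's 188.0 GB. The sketch below is only an illustration of that arithmetic using the numbers listed here. The names `memory_fraction`, `headroom_gb`, and `fits` are hypothetical helpers, not part of the comparison tool, and the tool's actual "Runs well" criterion is not stated on this page.

```python
# Illustrative only: recompute memory use from the figures listed above.
# The helper names are hypothetical; the tool's real fit logic is not shown in the source.

CAPACITY_GB = 188.0  # H100 NVL capacity as listed

# (model, memory used in GB) for the three recommendations
RECOMMENDATIONS = [
    ("Devstral 2 123B Instruct", 125.9),
    ("Qwen 3.5 122B A10B", 152.7),
    ("Mistral Small 4 119B", 152.4),
]


def memory_fraction(used_gb: float, capacity_gb: float = CAPACITY_GB) -> float:
    """Fraction of the card's memory the model occupies."""
    return used_gb / capacity_gb


def headroom_gb(used_gb: float, capacity_gb: float = CAPACITY_GB) -> float:
    """Memory left over after loading the model."""
    return capacity_gb - used_gb


def fits(used_gb: float, capacity_gb: float = CAPACITY_GB) -> bool:
    """Assumed criterion: the footprint does not exceed capacity."""
    return used_gb <= capacity_gb


for model, used in RECOMMENDATIONS:
    print(
        f"{model}: {memory_fraction(used):.1%} used, "
        f"{headroom_gb(used):.1f} GB free, fits={fits(used)}"
    )
```

By this arithmetic, Devstral 2 at q6-k leaves the most memory free of the three, while the two q8-0 models each occupy a little over 80% of the card.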
Quick comparison
| Metric | H100 NVL 188GB | RTX 4090 24GB |
|---|---|---|
| Models that fit | 3 | 3 |
| Avg decode tok/s | 134.5 | 37.9 |
| Best grade score | 99 | 93 |
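
The H100 NVL average in the table can be checked against the three decode speeds listed in its recommendations. The sketch below recomputes that mean and the ratio to the RTX 4090 figure. Per-model speeds for the RTX 4090 are not given on this page, so only its table average is used, and the variable names are illustrative.

```python
from statistics import mean

# Decode speeds (tok/s) for the three H100 NVL recommendations listed above
h100_decode = [71.5, 159.3, 172.7]

# RTX 4090 average taken directly from the table; per-model values are not shown
rtx4090_avg = 37.9

h100_avg = mean(h100_decode)
speedup = h100_avg / rtx4090_avg

print(f"H100 NVL average decode: {h100_avg:.1f} tok/s")
print(f"Ratio to RTX 4090 average: {speedup:.1f}x")
```

The recomputed mean matches the 134.5 tok/s in the table. The two averages may be taken over different model sets, since each card's top recommendations depend on what fits in its memory.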
