Compare
Compare local AI hardware with workload-aware output.
NVIDIA GH200 96GB wins for coding in balanced mode
Based on model fit, speed, and quality across top recommendations.
NVIDIA GH200 96GB
WinnerSQwen3-Coder-Next
llama.cppq6-kRuns well
77.6 GB / 96.0 GB
170.9 tok/s217K ctx
SQwen 3.5 122B A10B
llama.cppq4-k-mTight fit
87.4 GB / 96.0 GB
130.3 tok/s73K ctx
SGemma 4 31B
llama.cppq8-0Runs well
58.0 GB / 96.0 GB
70.8 tok/s58K ctx
Quick comparison
| Metric | NVIDIA GH200 96GB | RTX 4090 24GB |
|---|---|---|
| Models that fit | 3 | 3 |
| Avg decode tok/s | 124.0 | 37.9 |
| Best grade score | 97 | 93 |
