Compare
Compare local AI hardware with workload-aware output.
Operating mode: Balanced. Tuned for general local use, keeping the ranking neutral across personal and serving workflows.
RTX 3090 24GB wins for coding in balanced mode
Based on model fit, speed, and quality across top recommendations.
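Tesla P40 24GB
| Grade | Model | Runtime | Quant | Fit | Memory | Decode speed | Context |
|---|---|---|---|---|---|---|---|
| S | Devstral Small 2 24B Instruct | llama.cpp | Q4_K_M | Tight fit | 20.4 GB / 24.0 GB | 15.0 tok/s | 40K ctx |
| S | Codestral 2 25.08 | llama.cpp | Q4_K_M | Runs well | 19.2 GB / 24.0 GB | 14.6 tok/s | 48K ctx |
| S | Qwen 3.6 27B | llama.cpp | Q4_K_M | Tight fit | 20.7 GB / 24.0 GB | 10.2 tok/s | 69K ctx |
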
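RTX 3090 24GB (Winner)
| Grade | Model | Runtime | Quant | Fit | Memory | Decode speed | Context |
|---|---|---|---|---|---|---|---|
| S | Devstral Small 2 24B Instruct | llama.cpp | Q4_K_M | Tight fit | 20.4 GB / 24.0 GB | 36.7 tok/s | 40K ctx |
| S | Codestral 2 25.08 | llama.cpp | Q4_K_M | Runs well | 19.2 GB / 24.0 GB | 38.2 tok/s | 48K ctx |
| S | Qwen 3.6 27B | llama.cpp | Q4_K_M | Tight fit | 20.7 GB / 24.0 GB | 19.6 tok/s | 69K ctx |
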
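
The memory figures in the cards can be turned into free headroom per model with simple arithmetic. The sketch below only restates the listed numbers; the dictionary and the `headroom` helper are illustrative names, and it does not attempt to reproduce whatever threshold the tool uses to label a model "Tight fit" or "Runs well".

```python
# Memory figures copied from the cards (GB used at Q4_K_M).
# Both GPUs report the same 24.0 GB total.
TOTAL_GB = 24.0
used_gb = {
    "Devstral Small 2 24B Instruct": 20.4,
    "Codestral 2 25.08": 19.2,
    "Qwen 3.6 27B": 20.7,
}

def headroom(used, total=TOTAL_GB):
    """Return (free GB rounded to 0.1, utilisation % rounded to 0.01)."""
    return round(total - used, 1), round(100 * used / total, 2)

report = {name: headroom(gb) for name, gb in used_gb.items()}
for name, (free, pct) in report.items():
    print(f"{name}: {free} GB free, {pct}% of memory used")
```

Codestral 2 25.08 leaves the most free memory of the three, which is consistent with it being the only entry marked "Runs well".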
Quick comparison
| Metric | Tesla P40 24GB | RTX 3090 24GB |
|---|---|---|
| Models that fit | 3 | 3 |
| Avg decode tok/s | 13.3 | 31.5 |
| Best grade score | 89 | 92 |
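
The "Avg decode tok/s" row can be checked against the per-model decode speeds listed for each GPU. The following is a minimal sketch of that arithmetic, assuming the average is a plain unweighted mean over the three listed models; the variable names are illustrative and not part of the tool.

```python
from statistics import mean

# Per-model decode speeds (tok/s) as listed for each GPU, in the order
# Devstral Small 2 24B Instruct, Codestral 2 25.08, Qwen 3.6 27B.
decode_tok_s = {
    "Tesla P40 24GB": [15.0, 14.6, 10.2],
    "RTX 3090 24GB": [36.7, 38.2, 19.6],
}

# Unweighted mean per GPU, rounded to one decimal like the table.
averages = {gpu: round(mean(speeds), 1) for gpu, speeds in decode_tok_s.items()}

# Ratio of the two unrounded means: how much faster the RTX 3090 decodes on average.
speedup = round(
    mean(decode_tok_s["RTX 3090 24GB"]) / mean(decode_tok_s["Tesla P40 24GB"]), 2
)

print(averages)
print(f"RTX 3090 average speedup: {speedup}x")
```

Under that assumption the means come out to 13.3 and 31.5 tok/s, matching the table, and the RTX 3090 decodes roughly 2.4 times as fast on average.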
