Novita Pricing
Per-token pricing for 68 open-source LLMs on Novita AI — a low-cost serverless inference provider with a broad model catalog and per-model quantization options.
Last updated: Jun 28, 2026
Novita offers free models
$0.50 — see the full Novita free tier guide
Provider | Model | Context | Quant | Input/1M | Output/1M |
|---|---|---|---|---|---|
262k | — | $0.010 | $0.030 | ||
16k | fp8 | $0.020 | $0.050 | ||
60k | fp8 | $0.040 | $0.170 | ||
131k | fp4 | $0.040 | $0.150 | ||
NV | 262k | fp4 | $0.050 | $0.200 | |
131k | fp4 | $0.050 | $0.250 | ||
S1 | 8k | bf16 | $0.050 | $0.050 | |
QW | 160k | fp8 | $0.070 | $0.270 | |
Z | 200k | bf16 | $0.070 | $0.400 | |
NV | inclusionAI: Ling-2.6-1T | 262k | — | $0.075 | $0.625 |
NV | inclusionAI: Ring-2.6-1T | 262k | — | $0.075 | $0.625 |
QW | 131k | fp8 | $0.080 | $0.500 | |
QW | 131k | fp8 | $0.090 | $0.580 | |
98k | bf16 | $0.119 | $0.200 | ||
262k | bf16 | $0.130 | $0.400 | ||
Z | 131k | bf16 | $0.130 | $0.850 | |
131k | bf16 | $0.135 | $0.400 | ||
DS | 1049k | fp8 | $0.140 | $0.280 | |
262k | bf16 | $0.140 | $0.400 | ||
QW | 131k | bf16 | $0.150 | $1.500 | |
QW | 131k | bf16 | $0.150 | $1.500 | |
131k | bf16 | $0.180 | $0.590 | ||
QW | 262k | fp8 | $0.200 | $1.500 | |
QW | 131k | bf16 | $0.200 | $0.700 | |
QW | 131k | fp16 | $0.200 | $1.000 | |
262k | — | $0.200 | $1.150 | ||
DS | 164k | fp8 | $0.269 | $0.400 | |
DS | 164k | fp8 | $0.270 | $1.120 | |
DS | 131k | fp8 | $0.270 | $1.000 | |
DS | 131k | fp8 | $0.270 | $1.000 | |
DS | 164k | fp8 | $0.270 | $0.410 | |
1049k | fp8 | $0.270 | $0.850 | ||
MM | 205k | fp8 | $0.270 | $1.080 | |
MM | 205k | fp8 | $0.300 | $1.200 | |
MM | 205k | fp8 | $0.300 | $1.200 | |
MM | 205k | fp8 | $0.300 | $1.200 | |
MM | 1000k | — | $0.300 | $1.200 | |
QW | 131k | fp8 | $0.300 | $3.000 | |
QW | 262k | bf16 | $0.300 | $2.400 | |
QW | 131k | bf16 | $0.300 | $1.500 | |
Z | 131k | bf16 | $0.300 | $0.900 | |
QW | 32k | bf16 | $0.380 | $0.400 | |
QW | 262k | fp8 | $0.380 | $1.550 | |
DS | 64k | fp8 | $0.400 | $1.300 | |
QW | 262k | bf16 | $0.400 | $3.200 | |
BD | 123k | fp16 | $0.420 | $1.250 | |
1049k | — | $0.522 | $1.044 | ||
Z | 205k | fp8 | $0.540 | $1.980 | |
MM | 1000k | bf16 | $0.550 | $2.200 | |
Z | 205k | bf16 | $0.550 | $2.200 | |
131k | fp8 | $0.570 | $2.300 | ||
262k | — | $0.570 | $2.850 | ||
262k | fp8 | $0.600 | $2.500 | ||
262k | bf16 | $0.600 | $2.500 | ||
QW | 262k | — | $0.600 | $3.600 | |
Z | 66k | fp8 | $0.600 | $1.800 | |
66k | bf16 | $0.620 | $0.620 | ||
DS | 64k | fp8 | $0.700 | $2.500 | |
DS | 164k | fp8 | $0.700 | $2.500 | |
DS | 8k | bf16 | $0.800 | $0.800 | |
262k | — | $0.800 | $3.400 | ||
262k | — | $0.950 | $4.000 | ||
QW | 131k | bf16 | $0.980 | $3.950 | |
Z | 203k | fp8 | $1.000 | $3.200 | |
Z | 1049k | fp8 | $1.260 | $3.960 | |
Z | 205k | fp8 | $1.380 | $4.400 | |
S1 | 8k | fp8 | $1.480 | $1.480 | |
DS | 1049k | fp8 | $1.600 | $3.200 |
Pricing for Novita endpoints aggregated via OpenRouter. Prices per 1M tokens.
About Novita AI
Novita AI is a serverless inference cloud offering cheap, pay-per-token access to a broad catalog of open-source models — Llama, Qwen, DeepSeek, GLM, gpt-oss and more. Many models are offered at multiple quantizations (fp8, fp16, fp4), letting you trade a little quality for lower cost. The API is OpenAI-compatible, so switching is usually a base-URL change.
Beyond LLMs, Novita also runs GPU instances and image/video models. For most teams the serverless per-token tier below is the cheapest way to run open-weight models, and one of the widest catalogs of any single provider.
Novita Pricing FAQ
How much does Novita cost?
Novita charges per token, with input prices starting around $0.01–$0.10 per million tokens for smaller open models. See the live table above for current per-model input and output prices.
What is quantization and why does it matter?
Quantization (fp8, fp4, int4) compresses a model so it runs cheaper and faster with a small quality trade-off. Novita serves many models at multiple quantizations — the table shows which quantization each price applies to.
Is Novita cheaper than other providers?
For open-weight models Novita is among the cheapest serverless options, especially at fp8/fp4 quantizations. Use the comparison links below to see Novita vs other providers on the same model.
Which models does Novita support?
A wide catalog of open-source LLMs plus image and video models. The table above lists the LLMs Novita currently serves with public per-token pricing.
