DeepSeek pricing in 2026 spans four tiers: a free web chat at chat.deepseek.com, the low-cost DeepSeek V4 Flash API at $0.14/$0.28 per million input/output tokens, the flagship DeepSeek V4 Pro API at $0.435/$0.87 per million tokens, and a free API grant of 5 million tokens for every new developer account. All V4 models ship with a native 1 million token context window at no extra charge. At these rates DeepSeek is roughly 35 to 100 times cheaper per token than GPT-5.5 or Claude Opus 4.8.
This guide covers every part of DeepSeek pricing in 2026: the new V4 Flash and V4 Pro API rates, the legacy V3 chat and R1 reasoner costs, the free tier and the off-peak discount window. We also cover OpenRouter and AWS Bedrock pricing, plus how DeepSeek stacks up against ChatGPT, Claude and Gemini. If you just want a single subscription that bundles DeepSeek with the other top models, we cover that too at the end.
- DeepSeek V4 Flash costs $0.14/M input (cache miss) and $0.28/M output with a 1M token context window.
- DeepSeek V4 Pro costs $0.435/M input and $0.87/M output, after a permanent 75% price cut.
- DeepSeek’s web chat is free for individual users with no Plus or Pro plan.
- The DeepSeek API gives 5 million free tokens to every new account.
- Cache hits cost a fraction of cache misses, so prompt caching can cut bills by 80% or more.
- Off-peak hours (16:30 to 00:30 UTC) historically gave 50 to 75% extra discounts on V3 and R1.
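The off-peak window in the last bullet wraps past midnight UTC, which is easy to get wrong in a scheduler. The sketch below checks whether a timestamp falls inside it. Treating 16:30 as inclusive and 00:30 as exclusive is an assumption, and `is_off_peak` is an illustrative helper, not part of any DeepSeek SDK.

```python
from datetime import datetime, time, timezone

# Window quoted above; it crosses midnight UTC.
OFF_PEAK_START = time(16, 30)
OFF_PEAK_END = time(0, 30)


def is_off_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware `moment` is within 16:30-00:30 UTC."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    t = moment.astimezone(timezone.utc).time()
    # Because the window wraps midnight, it is the union of
    # [16:30, 24:00) and [00:00, 00:30).
    return t >= OFF_PEAK_START or t < OFF_PEAK_END


print(is_off_peak(datetime(2026, 1, 15, 17, 0, tzinfo=timezone.utc)))
print(is_off_peak(datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)))
```

Since the discount is described as historical and tied to V3 and R1, confirm that it still applies before scheduling batch jobs around it.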
DeepSeek pricing is pay-per-token: you are charged per million tokens of text the model reads (input) and writes (output). There are no monthly subscriptions on the API and no per-seat fees. Current rates start at $0.14 per million input tokens for DeepSeek V4 Flash and top out at $0.87 per million output tokens for the high-end V4 Pro model.
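As a rough illustration of pay-per-token billing, the sketch below prices a single call from its token counts. The function name and the example token counts are assumptions for illustration. The rates are the V4 Flash cache-miss figures quoted in this guide.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_rate: float, output_rate: float) -> float:
    """Cost of one call in USD; rates are USD per 1,000,000 tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# V4 Flash: $0.14 per M input (cache miss), $0.28 per M output.
cost = request_cost_usd(2_000, 500, input_rate=0.14, output_rate=0.28)
print(f"${cost:.6f}")  # $0.000420
```

A 2,000-token prompt with a 500-token reply costs well under a tenth of a cent, so per-call cost only becomes visible at volume.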
DeepSeek deliberately undercuts every Western lab. Its closest rival on price is Alibaba’s Qwen, broken down in our Qwen pricing guide. As we covered in our DeepSeek V4 launch breakdown, the company’s stated goal is rock-bottom prices, backed by Huawei Ascend and Cambricon chips. That keeps DeepSeek pricing roughly 35 to 100 times cheaper per token than the equivalent OpenAI or Anthropic models.
DeepSeek pricing in 2026 covers four tiers: a free web chat, two API models (V4 Flash and V4 Pro), and a free 5-million-token grant for new developer accounts. All figures below come from the official DeepSeek pricing page and are in USD per 1,000,000 tokens.
| Plan | Price | Models | Context / limits | Best for |
|---|---|---|---|---|
| Free web chat | $0 | V4 Flash (non-thinking + thinking) | Fair-use throttling during peak hours | Individuals chatting at chat.deepseek.com or in the mobile app |
| V4 Flash API | $0.14 in / $0.28 out per M tokens | deepseek-v4-flash | 1M context, 384K max output | Cheap general-purpose API workhorse |
| V4 Pro API | $0.435 in / $0.87 out per M tokens | deepseek-v4-pro | 1M context, 384K max output | Frontier reasoning at a fraction of rivals’ cost |
| Free API grant | $0 for 5M tokens / 30 days | All DeepSeek models | Standard rate limits, no credit card | New developer accounts evaluating the API |
The DeepSeek web chat at chat.deepseek.com and the official mobile app are completely free for individual users. There is no Plus plan, no Pro tier, and no paywall on file uploads or long conversations. The only catch is fair-use throttling, so during peak hours you may see “Server Busy” warnings. Free chat sessions run on V4 Flash by default and let you toggle DeepThink to switch into the V4 Flash thinking mode.
DeepSeek V4 Flash is the default low-cost API model. It costs $0.14 per million input tokens on a cache miss and $0.28 per million output tokens, with cache hits reduced to just $0.0028 per million (a 98% discount). V4 Flash supports both a non-thinking and a thinking mode under the deepseek-v4-flash ID and ships with a 1 million token context window plus up to 384K tokens of output. This is the model to default to for everything that does not require deep reasoning.
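To see what the cache-hit rate means for a bill, here is a minimal sketch that blends the two input prices by cache hit ratio. The 90% hit ratio and the 10M-token workload are hypothetical. Only the two rates come from the text above.

```python
def input_cost_usd(tokens: int, hit_ratio: float,
                   miss_rate: float = 0.14, hit_rate: float = 0.0028) -> float:
    """Input-side cost in USD; rates are USD per 1,000,000 tokens."""
    hit_tokens = tokens * hit_ratio
    miss_tokens = tokens - hit_tokens
    return (hit_tokens * hit_rate + miss_tokens * miss_rate) / 1_000_000


uncached = input_cost_usd(10_000_000, hit_ratio=0.0)
cached = input_cost_usd(10_000_000, hit_ratio=0.9)
saving = 1 - cached / uncached
print(f"uncached ${uncached:.4f}, cached ${cached:.4f}, saving {saving:.1%}")
```

This covers input tokens only. Output tokens are billed at the same rate whether or not the prompt was cached, so the saving on a whole bill depends on the input/output mix.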
DeepSeek V4 Pro is the flagship reasoning model. After a 75% price cut that DeepSeek has made permanent, it costs $0.435 per million input tokens (cache miss), $0.003625 per million cache hits and $0.87 per million output tokens. V4 Pro shares the same 1M context and 384K max output as V4 Flash, and the model ID is deepseek-v4-pro. At these rates it is the cheapest frontier reasoning model on the market.
Every new DeepSeek developer account starts with a 5 million token grant valid for 30 days, with no credit card required. That is enough for roughly 2,500 to 5,000 test calls depending on prompt length, and is worth about $0.70 to $1.40 at current V4 Flash rates, depending on the mix of input and output tokens. After the grant runs out, billing switches to the standard pay-per-token rates above and you can top up your balance with a small deposit.
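The call estimate follows from simple division. The sketch below assumes each test call uses 1,000 to 2,000 tokens in total (prompt plus reply), which is the range implied by the 2,500 to 5,000 figure rather than a number DeepSeek publishes.

```python
GRANT_TOKENS = 5_000_000  # free grant quoted above


def calls_covered(tokens_per_call: int, grant: int = GRANT_TOKENS) -> int:
    """Whole calls the grant covers at a given total token count per call."""
    if tokens_per_call <= 0:
        raise ValueError("tokens_per_call must be positive")
    return grant // tokens_per_call


for size in (1_000, 2_000):
    print(f"{size} tokens per call -> {calls_covered(size)} calls")
```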
| Model | Input (cache miss) | Input (cache hit) | Output | Context | Notes |
|---|---|---|---|---|---|
| deepseek-v4-flash | $0.14 | $0.0028 | $0.28 | 1M | Default low-cost model |
| deepseek-v4-pro | $0.435 | $0.003625 | $0.87 | 1M | Flagship reasoning model (permanent 75% cut) |
| V3 era launch pricing (deepseek-chat) | $0.27 | $0.07 | $1.10 | 64K | Historic, ID now routes to V4 Flash non-thinking |
| R1 era launch pricing (deepseek-reasoner) | $0.55 | $0.14 | $2.19 | 64K | Historic, ID now routes to V4 Flash thinking |
The pre-V4 deepseek-chat and deepseek-reasoner model IDs still work, but they now route to V4 Flash non-thinking and thinking modes for backward compatibility, and DeepSeek has flagged them for deprecation. New projects should target the V4 IDs directly.
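When migrating a codebase, the routing described above can be written down as a small lookup table. This is a sketch only: `resolve_model` and the boolean thinking flag are hypothetical names for illustration, and how thinking mode is actually selected in an API request is not covered here.

```python
from typing import Optional, Tuple

# Legacy ID -> (V4 ID, thinking mode), per the routing described above.
LEGACY_ROUTES = {
    "deepseek-chat": ("deepseek-v4-flash", False),
    "deepseek-reasoner": ("deepseek-v4-flash", True),
}


def resolve_model(model_id: str) -> Tuple[str, Optional[bool]]:
    """Map a legacy ID to its V4 target; pass other IDs through unchanged."""
    if model_id in LEGACY_ROUTES:
        return LEGACY_ROUTES[model_id]
    # None means the ID alone does not imply a thinking mode.
    return model_id, None


print(resolve_model("deepseek-reasoner"))
print(resolve_model("deepseek-v4-pro"))
```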
DeepSeek V4 Flash is the cheap, fast workhorse. V4 Pro is the deep-thinking model with a much larger active parameter count. V4 Pro costs roughly 3 times more per token than Flash, so default to Flash and escalate to Pro only when a task genuinely needs reasoning depth. Even at 3x, V4 Pro is the cheapest frontier reasoning model on the market.
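The "default to Flash, escalate to Pro" advice can be priced with a small model of the traffic mix. The call volume, token counts and 20% escalation rate below are hypothetical. The rates are the cache-miss figures from the tables in this guide.

```python
# USD per 1M tokens as (input cache miss, output).
RATES = {
    "deepseek-v4-flash": (0.14, 0.28),
    "deepseek-v4-pro": (0.435, 0.87),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call in USD for the given model."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def blended_cost(calls: int, escalation_rate: float,
                 input_tokens: int, output_tokens: int) -> float:
    """Total cost when a fraction of calls is escalated from Flash to Pro."""
    flash = call_cost("deepseek-v4-flash", input_tokens, output_tokens)
    pro = call_cost("deepseek-v4-pro", input_tokens, output_tokens)
    return calls * ((1 - escalation_rate) * flash + escalation_rate * pro)


for rate in (0.0, 0.2, 1.0):
    total = blended_cost(100_000, rate, input_tokens=1_500, output_tokens=500)
    print(f"{rate:.0%} escalated: ${total:.2f}")
```

Under these assumptions, escalating one call in five raises the bill from about $35 to about $50, compared with roughly $109 if everything ran on Pro.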
DeepSeek pricing has three layers, input tokens on a cache miss, input tokens on a cache hit, and output tokens, and you can save money on each one.
You only pay for what you use. There is no minimum spend, no monthly fee, and no charge for context window length itself.
Yes, DeepSeek has free options. The consumer chat experience at chat.deepseek.com and in the official mobile app is completely free for individual users. There is no Plus plan, no Pro tier, and no paywall on file uploads or long conversations. The only catch is fair-use throttling, so during peak hours you may see “Server Busy” warnings.
For developers, the API includes a 5 million token grant for new sign-ups, valid for 30 days. That is enough for 2,500 to 5,000 test calls depending on prompt length, and is worth roughly $0.70 to $1.40 at current V4 Flash rates, depending on the mix of input and output tokens. No credit card is required, and after the grant runs out billing switches to the standard pay-per-token rates above.
If you do not want to use DeepSeek’s own API, several third-party providers host the same models. Pricing varies because each platform adds its own margin, infrastructure cost and routing.
| Provider | Model | Input/M | Output/M | Notes |
|---|---|---|---|---|
| OpenRouter | DeepSeek V4 Pro | $0.435 | $0.87 | Matches DeepSeek’s direct API pricing |
| OpenRouter | DeepSeek V4 Flash | $0.14 | $0.28 | Matches direct API |
| OpenRouter | DeepSeek V3.2 | $0.252 | $0.378 | Mid-tier legacy option |
| OpenRouter | DeepSeek R1 (reasoning) | $0.70 | $2.50 | Original R1 chain-of-thought model |
| AWS Bedrock | DeepSeek V3.2 | $0.62 | $1.85 | Higher than direct, but enterprise-friendly |
| Azure AI Foundry | DeepSeek (various) | varies | varies | Pricing depends on region and SKU |
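To compare providers on a concrete workload rather than on headline rates, the sketch below prices the same monthly volume against the two DeepSeek V3.2 rows in the table above. The workload of 100M input and 20M output tokens is hypothetical, and the `Offer` class is an illustrative helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    model: str
    input_rate: float   # USD per 1M input tokens
    output_rate: float  # USD per 1M output tokens

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total cost in USD for the given token volume."""
        return (input_tokens * self.input_rate
                + output_tokens * self.output_rate) / 1_000_000


OFFERS = [
    Offer("AWS Bedrock", "DeepSeek V3.2", 0.62, 1.85),
    Offer("OpenRouter", "DeepSeek V3.2", 0.252, 0.378),
]

# Hypothetical monthly workload.
IN_TOKENS, OUT_TOKENS = 100_000_000, 20_000_000

ranked = sorted(OFFERS, key=lambda o: o.cost(IN_TOKENS, OUT_TOKENS))
for offer in ranked:
    print(f"{offer.provider}: ${offer.cost(IN_TOKENS, OUT_TOKENS):.2f}")
```

For this mix, Bedrock comes out at about three times the OpenRouter price. The ratio shifts with the share of output tokens, because the output rates differ more than the input rates.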
OpenRouter also offers several free DeepSeek distilled models including the smaller R1 distills and DeepSeek V3 Base, with rate limits but no token charges. They are great for prototyping.
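How DeepSeek V3 compares with R1 is one of the most-asked questions about DeepSeek pricing. The short answer: R1 reasons better, but its launch rates were about twice V3's per token, and because it also writes far more output it costs roughly 5 times more per query.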
| Model | Input/M | Output/M | Best for |
|---|---|---|---|
| DeepSeek V3 (legacy deepseek-chat) | $0.27 | $1.10 | General chat, summaries, writing |
| DeepSeek R1 (legacy deepseek-reasoner) | $0.55 | $2.19 | Math, logic, multi-step reasoning |
| DeepSeek V4 Flash | $0.14 | $0.28 | Replaces V3, cheaper and bigger context |
| DeepSeek V4 Pro | $0.435 | $0.87 | Replaces R1, frontier reasoning |
R1 generates many more output tokens per request because it produces a chain-of-thought before its final answer. That widens the price gap in practice. With V4, both Flash and Pro support thinking and non-thinking modes, so you can pick reasoning depth on a per-call basis. For more on the original reasoner, see our full breakdown of DeepSeek-R1 and how it beat OpenAI.
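The effect of chain-of-thought tokens on per-query cost can be sketched with the launch rates from the table above. The token counts are hypothetical (a 1,000-token prompt, a 400-token answer, and 1,000 reasoning tokens for R1), and the sketch assumes reasoning tokens are billed as output, as the paragraph above describes.

```python
def query_cost(input_tokens: int, answer_tokens: int, reasoning_tokens: int,
               input_rate: float, output_rate: float) -> float:
    """Per-query cost in USD; reasoning tokens are counted as output."""
    output_tokens = answer_tokens + reasoning_tokens
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Launch rates in USD per 1M tokens: V3 $0.27/$1.10, R1 $0.55/$2.19.
v3 = query_cost(1_000, 400, 0, input_rate=0.27, output_rate=1.10)
r1 = query_cost(1_000, 400, 1_000, input_rate=0.55, output_rate=2.19)
print(f"V3 ${v3:.6f}, R1 ${r1:.6f}, ratio {r1 / v3:.1f}x")
```

With these assumed counts the per-token gap of about 2x grows to about 5x per query. A longer chain of thought widens it further.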
| Model | Input/M | Output/M | DeepSeek V4 Flash multiplier |
|---|---|---|---|
| DeepSeek V4 Flash | $0.14 | $0.28 | 1x |
| DeepSeek V4 Pro | $0.435 | $0.87 | ~3x |
| GPT-5.5 | $5.00 | $30.00 | 35-100x more expensive |
| Claude Opus 4.8 | $5.00 | $25.00 | 35-90x more expensive |
| Gemini 3.1 Pro | $2.00 (under 200K context) | $12.00 | 14-43x more expensive |
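The multiplier column can be reproduced by dividing each rival's rates by the V4 Flash rates. A minimal sketch, using only the figures in the table above:

```python
FLASH = (0.14, 0.28)  # USD per 1M tokens: (input, output)

RIVALS = {
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Gemini 3.1 Pro": (2.00, 12.00),
}


def multipliers(rates, baseline=FLASH):
    """Return (input multiplier, output multiplier) against the baseline."""
    return rates[0] / baseline[0], rates[1] / baseline[1]


for name, rates in RIVALS.items():
    in_x, out_x = multipliers(rates)
    print(f"{name}: {in_x:.1f}x input, {out_x:.1f}x output")
```

The lower end of each range in the table is the input multiplier and the upper end is the output multiplier, so workloads that generate a lot of output see the larger saving.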
DeepSeek is not always the smartest model on a benchmark, but on price per useful token nothing else gets close. For high-volume workloads like content pipelines, summarization, classification and RAG over big corpora, DeepSeek is often the only model that makes the unit economics work. For the full side-by-side including monthly subscription tiers, see our complete AI pricing comparison. And for a head-to-head on capabilities, accuracy and where each model wins, read our DeepSeek vs ChatGPT comparison.
A few quick wins can cut a DeepSeek bill by 50 to 90 percent.
- Set max_tokens to the smallest value that still works.

Most readers do not actually want to write code against an API. They want to chat with the best model for the task without juggling five subscriptions, five web tabs and five different memory contexts.
Fello AI is the simplest way to do exactly that. One subscription gets you DeepSeek, ChatGPT, Claude, Gemini, Grok, Perplexity and several other top models in a single Mac, iPhone and iPad app for $9.99 per month. There is no per-token billing to track, no usage cap to budget around, and no separate account to manage for each provider. Everything lives in one chat window and one history.
DeepSeek is the cheapest serious model on the API, but the official web chat is throttled and the desktop experience is bare-bones. Fello AI fixes both. You get DeepSeek inside a real Mac-native app with prompt history, file uploads, image analysis, voice input and the ability to switch models mid-conversation when DeepSeek is not the right tool for the next message. If your prompt would be better answered by Claude’s writing or GPT’s reasoning, swap models without losing context.
Add up the standard subscription cost of the major models: ChatGPT Plus at $20/month, Claude Pro at $20/month, Gemini Advanced at $20/month, Perplexity Pro at $20/month and Grok at $30/month. That is roughly $110 per month for individual access. Fello AI compresses the same lineup into $9.99 per month, with DeepSeek thrown in as part of the bundle. With 25,000+ five-star reviews across the App Store and Mac App Store, it is the most-loved AI bundle app in the Apple ecosystem.
If you would rather keep DeepSeek separate from the others, we also have a guide to a dedicated DeepSeek desktop client for Mac.
DeepSeek pricing in 2026 is the cheapest at the frontier. V4 Flash at $0.14/$0.28 per million tokens beats every Western lab on cost by an order of magnitude. V4 Pro, now permanently 75% off at $0.435/$0.87, lets you use a top reasoning model for under a dollar per million output tokens. The web chat stays free, the API hands out 5 million free tokens, and the cache and off-peak discounts can stack on top.
If you build software, go straight to the official DeepSeek pricing page and start with V4 Flash. If you just want to use DeepSeek alongside ChatGPT, Claude and Gemini in one place, Fello AI is the simplest route.