URL: https://felloai.com/qwen-pricing/

Qwen Pricing 2026: Free App & API Costs Explained


Qwen pricing splits into two very different paths, and most people only need the free one. The Qwen Chat app is completely free with no subscription, while developers pay per token through the API, where rates run from $0.05 per million input tokens on the cheapest model up to $3.75 per million output tokens on the flagship Qwen3.7-Max. That roughly 75x spread between the cheapest input rate and the flagship output rate is the single most important thing to understand before you choose a model.

This guide breaks down every part of Qwen pricing for 2026, including the free Qwen Chat app, the pay-as-you-go API rates for each model, the input-size tiers that quietly raise some prices, the free token trial for new developers, and how Qwen stacks up against DeepSeek and Claude on cost. We also cover the change that caught many developers off guard, the end of the free API tier on April 15, 2026, and where Qwen still gives you the best value.

The Key Takeaways

  • Qwen Chat is free, with no consumer subscription like ChatGPT Plus or Claude Pro.
  • API pricing runs from $0.05 / $0.40 per million tokens (Qwen-Flash) to $1.25 / $3.75 (Qwen3.7-Max on a 50% promo; list price $2.50 / $7.50).
  • The mid-tier Qwen-Plus starts at $0.40 / $1.20 per million tokens, the best balance of price and capability for most apps.
  • Several Qwen models are tiered by input size, so the headline figure is a “from” price that climbs on long prompts (Qwen3-Max runs $1.20 to $3.00 input across its 262K context).
  • The free developer API tier ended April 15, 2026, replaced by a one-time trial of 1 million tokens per model (about 70 million total) for new accounts.

Qwen Pricing at a Glance

Qwen is Alibaba’s family of AI models, and you can use it in two ways. The consumer Qwen Chat app at chat.qwen.ai is free, and the developer API is billed per token through Alibaba Cloud Model Studio (the platform formerly branded DashScope). There is no flat monthly consumer plan in between, which makes Qwen unusual next to Western rivals. Qwen’s open base models also power third-party fine-tunes, including Rio 3.5 Open 397B from Rio de Janeiro’s city government.

Here is what the main text models cost on the API, priced per million tokens. Input is what you pay for the text you send in, and output is what you pay for the text the model generates back.

| Model | Input ($/1M) | Output ($/1M) | Context window | Best for |
| --- | --- | --- | --- | --- |
| Qwen-Flash | from $0.05 | from $0.40 | 1M tokens | High-volume, simple tasks |
| Qwen-Plus | from $0.40 | from $1.20 | 1M tokens | The everyday workhorse |
| Qwen3.7-Plus | from $0.32 | from $1.28 | 1M tokens | Multimodal apps |
| Qwen3-Max | from $1.20 | from $6.00 | 262K tokens | Hard reasoning, coding |
| Qwen3.7-Max | $1.25 | $3.75 | 1M tokens | Flagship, top benchmarks |

The flagship Qwen3.7-Max rate of $1.25 / $3.75 is a 50% promotional discount; the standard list price is $2.50 / $7.50 per million tokens. That promo is the reason the flagship currently undercuts its own predecessor, Qwen3-Max, on output cost, but it is time-limited, so budget around the list price for anything long-term.

One thing to know when reading these numbers: Model Studio tiers several rates by the input size of each request, so the “from” figures above are the entry-tier rates and your cost rises on longer prompts. Always confirm the live rate for your model and region in the Alibaba Cloud Model Studio pricing docs before you build a budget.
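To make the per-million arithmetic concrete, here is a minimal Python sketch that turns the entry-tier rates from the table above into a per-request cost. The function name `request_cost` and the example token counts are illustrative assumptions, and the sketch ignores input-size tiers, regional differences, and promotions ending.

```python
# Entry-tier rates in USD per 1M tokens (input, output), copied from the table above.
# Real bills depend on the input-size tier, region, and current promotions.
RATES = {
    "Qwen-Flash": (0.05, 0.40),
    "Qwen-Plus": (0.40, 1.20),
    "Qwen3.7-Plus": (0.32, 1.28),
    "Qwen3-Max": (1.20, 6.00),
    "Qwen3.7-Max": (1.25, 3.75),  # promotional rate, not the list price
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the entry-tier rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 tokens in, 500 tokens out.
    for model in RATES:
        cost = request_cost(model, input_tokens=2_000, output_tokens=500)
        print(f"{model:<14} ${cost:.6f}")
```

Because output tokens cost several times more than input tokens on every model, long generated answers tend to dominate the bill even when prompts are short.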

Is Qwen Free?

Yes. The Qwen Chat app at chat.qwen.ai is free to use with no subscription, much like the free tier of ChatGPT or Gemini. You get access to Qwen’s strong models for everyday chat, writing, coding help, and image tasks without paying anything, and Alibaba has clear commercial reasons to keep the consumer app free.

Paid pricing only kicks in when you build on the Qwen API through Alibaba Cloud Model Studio, which charges per token. So if you just want to talk to Qwen, you pay nothing. If you want to plug Qwen into your own product, you pay by usage.

Qwen API Pricing by Model

Below we break down API pricing for each of Qwen’s models so you can match the right tier to your workload and budget.

Qwen-Flash and Qwen-Turbo: From $0.05 / $0.40

Qwen-Flash is the budget tier, starting at $0.05 input / $0.40 output per million tokens for prompts up to 256K and rising to $0.25 / $2.00 on longer inputs. It is built for speed and high volume, so it suits classification, simple Q&A, tagging, and any task where you send a lot of requests and do not need flagship-level reasoning. Qwen-Flash now replaces the older Qwen-Turbo, which Alibaba has flagged as no longer being updated, so new projects should default to Flash. At these rates you can process millions of tokens for a few dollars.
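As a rough illustration of what high volume looks like at these rates, the sketch below projects a monthly Qwen-Flash bill for a steady stream of identical requests. The workload numbers are hypothetical, the function name `flash_monthly_cost` is made up for this example, and treating the 256K boundary as exactly 256,000 tokens is an assumption.

```python
# Qwen-Flash rates in USD per 1M tokens, as quoted in this section.
FLASH_SHORT = (0.05, 0.40)   # prompts up to 256K tokens
FLASH_LONG = (0.25, 2.00)    # longer prompts
FLASH_TIER_LIMIT = 256_000   # assumption: "256K" read as 256,000 tokens


def flash_monthly_cost(requests_per_day: int, input_tokens: int,
                       output_tokens: int, days: int = 30) -> float:
    """Project the USD cost of a steady workload of identical requests."""
    if input_tokens <= FLASH_TIER_LIMIT:
        input_rate, output_rate = FLASH_SHORT
    else:
        input_rate, output_rate = FLASH_LONG
    per_request = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return per_request * requests_per_day * days


if __name__ == "__main__":
    # Hypothetical tagging job: 100,000 requests a day, 300 tokens in, 20 out.
    print(f"${flash_monthly_cost(100_000, 300, 20):.2f} per month")
```

In this hypothetical workload, roughly 900 million input tokens and 60 million output tokens a month come to about $69, which is why Flash suits classification-style jobs.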

Qwen-Plus: From $0.40 / $1.20

Qwen-Plus sits in the middle, starting at $0.40 input / $1.20 output per million tokens, and is the model most apps should start with. It is tiered by prompt length, so the rate climbs toward $1.20 input / $3.60 output once a single request pushes past 256K tokens, and thinking mode raises output further. It handles the bulk of real work (summarizing, drafting, coding assistance, and retrieval tasks) at a price that stays reasonable at scale.

Qwen3.7-Plus: From $0.32 / $1.28

Qwen3.7-Plus is Qwen’s multimodal mid-tier, accepting text, image, and video input. It runs from $0.32 input / $1.28 output per million tokens, a 20% discount on the $0.40 / $1.60 list price, and like Qwen-Plus it is tiered upward on long context. If you are building apps that mix images or video with text and want to stay cheap, this is the tier to test before you jump to a Max model.

Qwen3-Max: From $1.20 / $6.00

Qwen3-Max is the heavy-reasoning workhorse, priced from $1.20 input / $6.00 output per million tokens on prompts up to 32K. Cost climbs in brackets: $2.40 / $12 from 32K to 128K, and $3.00 / $15 from 128K up to its limit, so long-context jobs cost noticeably more than the headline rate suggests. Note that Qwen3-Max tops out at 262K tokens of context, not the full 1M that the smaller Qwen models reach.
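The bracket structure can be sketched as a simple lookup. This is an illustration under two stated assumptions, not Alibaba's billing logic: the whole request is billed at the rate of the bracket its input size falls into, and the 32K / 128K / 262K boundaries are read as round thousands. Check the Model Studio pricing docs for the exact rules.

```python
# (upper bound on input tokens, input $/1M, output $/1M) for Qwen3-Max,
# using the brackets quoted above. Boundary values are an assumption.
QWEN3_MAX_TIERS = [
    (32_000, 1.20, 6.00),
    (128_000, 2.40, 12.00),
    (262_000, 3.00, 15.00),
]


def qwen3_max_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request, billed at its input-size bracket."""
    for upper_bound, input_rate, output_rate in QWEN3_MAX_TIERS:
        if input_tokens <= upper_bound:
            return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    raise ValueError("input exceeds the 262K-token context limit")


if __name__ == "__main__":
    for size in (10_000, 100_000, 200_000):
        print(f"{size:>7} input tokens -> ${qwen3_max_cost(size, 1_000):.3f}")
```

Under these assumptions, a 200,000-token prompt is billed at 2.5 times the headline per-token rate, so trimming long prompts below a bracket boundary is worth more than the raw token savings suggest.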

Qwen3.7-Max: $1.25 / $3.75 (Flagship)

Qwen3.7-Max is the flagship and the one to reach for on the hardest reasoning, math, and software-engineering tasks. It runs $1.25 input / $3.75 output per million tokens on a 50% launch promotion, half the $2.50 / $7.50 list price, with a full 1M-token context window. The discount is time-limited, so plan around the list price if you are budgeting months ahead. Qwen3.7-Max scored 56.6 on the Artificial Analysis Intelligence Index and 80.4 on SWE-Verified, landing it in the global top 10 at launch. For the full benchmark breakdown, see our Qwen3.7-Max review.
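To see why budgeting around the list price matters, here is a small sketch that projects spend over several months when the promotion ends partway through. The promotion's end date is not stated here, so `promo_months` is a hypothetical parameter, and the monthly token volumes are illustrative.

```python
# Qwen3.7-Max rates in USD per 1M tokens (input, output), as quoted above.
PROMO_RATES = (1.25, 3.75)  # 50% launch promotion
LIST_RATES = (2.50, 7.50)   # standard list price


def monthly_cost(rates, input_millions: float, output_millions: float) -> float:
    """Return one month's USD cost for the given volumes (in millions of tokens)."""
    input_rate, output_rate = rates
    return input_millions * input_rate + output_millions * output_rate


def projected_budget(input_millions: float, output_millions: float,
                     promo_months: int, months: int = 12) -> float:
    """Total USD over `months`, with the promo applying to the first `promo_months`."""
    promo_months = min(promo_months, months)
    promo_part = monthly_cost(PROMO_RATES, input_millions, output_millions) * promo_months
    list_part = monthly_cost(LIST_RATES, input_millions, output_millions) * (months - promo_months)
    return promo_part + list_part


if __name__ == "__main__":
    # Hypothetical workload: 200M input and 40M output tokens per month.
    for promo_months in (0, 3, 12):
        total = projected_budget(200, 40, promo_months)
        print(f"promo lasts {promo_months:>2} months -> ${total:,.0f} per year")
```

For this workload the yearly total ranges from $4,800 if the promotion lasted all year to $9,600 at list price throughout, so a plan built on the promotional rate alone could be off by a factor of two.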

The Free Tier Change Developers Need to Know

Qwen used to offer a standing free API tier, and that ended on April 15, 2026. The free developer OAuth tier and the free Qwen Code CLI access were both removed on the same date, so older tutorials that promise “free Qwen API access” are now out of date.

What replaced it is a one-time onboarding trial. New Alibaba Cloud Model Studio accounts get 1 million free tokens per model across roughly 70 proprietary Qwen models, so about 70 million tokens in total, valid for 90 days and only on the Singapore endpoint. That is generous for prototyping, but it is a trial, not a permanent free plan, so plan your budget for when it runs out.
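If you want to track how much of the trial is left for a given model, a sketch like the following may help. It assumes the 90 days are counted from the account signup date and that the last day is inclusive; neither detail is confirmed above, and the function name `trial_status` is illustrative.

```python
from datetime import date, timedelta

TRIAL_TOKENS_PER_MODEL = 1_000_000  # one-time allowance per model
TRIAL_DAYS = 90                     # assumption: counted from signup


def trial_status(signup: date, today: date, used_tokens: int):
    """Return (expiry date, tokens remaining, whether the trial is still usable)."""
    expires = signup + timedelta(days=TRIAL_DAYS)
    remaining = max(TRIAL_TOKENS_PER_MODEL - used_tokens, 0)
    active = today <= expires and remaining > 0
    return expires, remaining, active


if __name__ == "__main__":
    expires, remaining, active = trial_status(
        signup=date(2026, 5, 1), today=date(2026, 6, 1), used_tokens=250_000
    )
    print(f"expires {expires}, {remaining:,} tokens left, active={active}")
```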

How to Cut Your Qwen Bill

Two built-in features lower real costs well below the headline rates. The asynchronous batch API processes non-urgent jobs at roughly 50% off standard rates, which is ideal for bulk summarization, data labeling, or overnight processing where you do not need an instant reply.

Prompt caching is the second lever. When you reuse the same long prompt or system context across many requests, cached input reads cost only a small fraction of the normal input rate, in some cases around 10% of it. For chatbots and agents that repeat a large system prompt on every call, this is the difference between a workable bill and an expensive one.
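The effect of both levers can be estimated with a few lines of arithmetic. The sketch below uses the approximate figures quoted above (about 50% off for batch, cached reads at about 10% of the input rate) and Qwen-Plus entry-tier rates. It treats the two discounts separately because whether they stack is not stated here, and it ignores any cost of writing to the cache on the first request.

```python
QWEN_PLUS = (0.40, 1.20)  # USD per 1M tokens (input, output), entry tier


def standard_cost(input_tokens, output_tokens, rates=QWEN_PLUS):
    """USD cost of one request with no discounts."""
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def batch_cost(input_tokens, output_tokens, rates=QWEN_PLUS, discount=0.50):
    """USD cost through the batch API, assuming a flat discount on the whole request."""
    return standard_cost(input_tokens, output_tokens, rates) * (1 - discount)


def cached_cost(fresh_input, cached_input, output_tokens,
                rates=QWEN_PLUS, cache_ratio=0.10):
    """USD cost when `cached_input` tokens are read from cache at a reduced rate."""
    input_rate, output_rate = rates
    billed_input = fresh_input * input_rate + cached_input * input_rate * cache_ratio
    return (billed_input + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical chatbot turn: 6,000-token system prompt (cacheable),
    # 2,000 tokens of new conversation, 300 tokens of output.
    print(f"standard: ${standard_cost(8_000, 300):.5f}")
    print(f"batch:    ${batch_cost(8_000, 300):.5f}")
    print(f"cached:   ${cached_cost(2_000, 6_000, 300):.5f}")
```

In this example caching cuts the per-turn cost by a little over 60%, because most of the input is the repeated system prompt.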

Qwen vs DeepSeek, ChatGPT, and Claude on Cost

Qwen’s appeal is price-to-performance, and it competes directly with other low-cost models. Against DeepSeek, another Chinese open-weight family known for aggressive pricing, Qwen-Plus and Qwen-Flash sit in the same budget bracket, so the choice comes down to which model performs better on your specific tasks. Another open-weight Chinese model worth watching is Nex-N2-Pro, which builds on Qwen’s own 397B architecture and targets agentic coding. All of them sit far below US flagship pricing.

| Model | Input ($/1M) | Output ($/1M) | Context | Best for |
| --- | --- | --- | --- | --- |
| Qwen-Plus | from $0.40 | from $1.20 | 1M | Cheap everyday workhorse |
| Qwen3.7-Max | $1.25 | $3.75 | 1M | Qwen’s flagship reasoning |
| DeepSeek V4 | $0.435 | $0.87 | 1M | Closest budget rival |
| Claude Opus 4.8 | $5.00 | $25.00 | 200K | US flagship benchmark |

Compared with ChatGPT and Claude, Qwen is dramatically cheaper at the API level. Claude Opus 4.8 lists at $5 / $25 per million tokens, so even Qwen’s flagship Qwen3.7-Max comes in at roughly a seventh of the output cost, and Qwen-Plus delivers strong results for cents. DeepSeek V4 has the lowest output rate in the table at $0.87, although its $0.435 input rate sits slightly above Qwen-Plus’s $0.40 entry tier, and Qwen-Flash is cheaper than both. If raw cost per token is your deciding factor, Qwen and DeepSeek win; if you need the absolute top of the benchmark charts or specific ecosystem features, the picture is closer. You can see where the US labs stand in our breakdown of how Claude and GPT compare, check current promos in our guide to AI free trials, or compare Qwen with our Mistral AI pricing guide. If you want Qwen alongside other models in one place, aggregators like OpenRouter also resell Qwen on pay-as-you-go terms.
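The ratios behind this comparison are easy to check yourself. The following sketch uses only the rates from the comparison table above (entry-tier or promotional where applicable); the helper names are illustrative.

```python
# USD per 1M tokens (input, output), from the comparison table above.
COMPARISON = {
    "Qwen-Plus": (0.40, 1.20),
    "Qwen3.7-Max": (1.25, 3.75),   # promotional rate
    "DeepSeek V4": (0.435, 0.87),
    "Claude Opus 4.8": (5.00, 25.00),
}


def output_ratio(model: str, baseline: str = "Claude Opus 4.8") -> float:
    """How many times more the baseline charges for output than `model` does."""
    return COMPARISON[baseline][1] / COMPARISON[model][1]


def cheapest(side: int) -> str:
    """Name of the cheapest model on one side: 0 for input, 1 for output."""
    return min(COMPARISON, key=lambda name: COMPARISON[name][side])


if __name__ == "__main__":
    for model in COMPARISON:
        print(f"{model:<16} output is {output_ratio(model):5.1f}x cheaper than Claude Opus 4.8")
    print("cheapest input: ", cheapest(0))
    print("cheapest output:", cheapest(1))
```

Looking at input and output separately matters because the ranking can differ between the two, and which side dominates depends on whether your workload is prompt-heavy or generation-heavy.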

A Simpler Alternative to Per-Token Billing

Per-token API pricing is powerful, but it is overkill if you just want to use top AI models day to day. Tracking input and output rates across models, watching input-size tiers push your bill up, seeing your free trial expire, and managing separate bills for each provider adds friction that most people do not want.

This is where Fello AI fits. For one flat $9.99 per month, you get access to many leading models in a single Mac app, including Claude, ChatGPT, Gemini, Grok, and DeepSeek, with no per-token math and no juggling multiple subscriptions. Instead of metering every request, you pick the best model for the job and switch freely. If you want flagship-grade AI without API billing, that one-price, many-models setup is far simpler; you can compare the lineup in our roundup of the best AI models, and see other budget options in our MiniMax pricing guide.

Conclusion

For most people, Qwen is effectively free: the Qwen Chat app costs nothing and covers everyday use. For developers, Qwen is one of the best value APIs available, from $0.05 per million tokens on Qwen-Flash up to a flagship that still undercuts US rivals by a wide margin. Just remember the free API tier is gone as of April 2026, several rates are tiered by prompt length, and the flagship discount is temporary, so build your budget around list prices and lean on batch processing and prompt caching to keep costs down. If you would rather skip per-token billing entirely, a flat-rate multi-model app like Fello AI gives you Qwen-class value alongside Claude, ChatGPT, and Gemini for one monthly price.
