May 16, 2026

DeepSeek launched the preview of its V4 frontier model on Friday, April 24, 2026, and the release detonated across the global AI market within hours. The Hangzhou-based lab dropped open weights for two Mixture-of-Experts variants – V4-Pro at 1.6 trillion parameters with 49B active per token and V4-Flash at 284B parameters with 13B active – under an MIT license, with a 1,000,000-token context window across both tiers. The same day, Huawei announced that its Ascend AI supernode offered “day zero” full support for DeepSeek V4, marking the first frontier model in history engineered to train and serve on Chinese silicon without Nvidia in the loop.

The combination broke through three barriers in a single product cycle. V4-Pro priced inference at $1.74 per million input tokens, a fraction of Claude Opus 4.6 and GPT-5.4. V4-Flash dropped further to $0.14 per million input tokens. And the hybrid attention architecture – Compressed Sparse Attention (CSA) plus Heavily Compressed Attention (HCA) – cut KV-cache memory at 1M context to just 10% of DeepSeek V3.2, while requiring only 27% of single-token inference FLOPs. For investors, this article unpacks what the launch means for Nvidia, Huawei, the open-weight ecosystem, and the geopolitical AI race.

DeepSeek V4 Preview: The April 24 Drop That Reset the Frontier

The release went live at 09:00 China Standard Time on April 24, 2026, via DeepSeek’s official API documentation portal and Hugging Face. The company shipped two open-weight models simultaneously: DeepSeek-V4-Pro, the flagship, and DeepSeek-V4-Flash, a cost-optimized sibling. Both are sparse Mixture-of-Experts designs, both ship with full open weights under MIT license, and both expose a 1,000,000-token context window – a roughly eightfold (7.8x) jump over the 128K ceiling that defined the V3.2 generation released in late 2025.

The product page describes V4 as “the era of cost-effective 1M context length,” and the language is deliberate. While OpenAI, Anthropic, and Google had all introduced 1M-token tiers earlier in 2026, none of them made the long-context regime economically practical for high-volume agentic workloads. DeepSeek’s claim is that V4-Pro is the first frontier-class model where 1M-token reasoning is cheaper than 128K reasoning was on the prior generation, on a per-task basis. Internal benchmarks shared in the model card show V4-Pro-Max scoring 80.6% on SWE-bench Verified and 93.5 on LiveCodeBench, placing it within striking distance of the closed-source leaders.

The release pace mirrors the company’s playbook from January 2025, when DeepSeek-R1 briefly wiped $589 billion off Nvidia’s market cap in a single trading session. This time the Huawei tie-in changed the script – instead of forcing a sell-off, V4 catalyzed a rotation. SMIC, China’s largest foundry and a co-beneficiary of any Huawei silicon ramp, jumped roughly 10% in Hong Kong trading the day of the announcement, while Nvidia closed at $199.96, with the broader Nasdaq absorbing the news without panic. The market read: this is not a 2025 demand-destruction story – it is a 2026 supply-chain bifurcation story.

Architecture Breakdown: 1.6T Parameters, 49B Active, 1M Context

The headline number – 1.6 trillion parameters – overstates the per-token cost. DeepSeek V4-Pro activates only 49 billion parameters per forward pass, meaning inference compute is closer to a dense 49B model than a dense 1.6T monster. V4-Flash compresses further: 284B total parameters, 13B active. The MoE expert count for each variant has not been published in the preview model card, though community analysis suggests V4-Pro uses on the order of 256-512 routed experts plus shared experts, consistent with the design DNA of V3.
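The gap between total and active parameters is easier to see with a little arithmetic. The sketch below uses only the parameter counts quoted above. The rule of thumb of roughly 2 FLOPs per active parameter per token is a common approximation for transformer forward passes, not a figure from DeepSeek’s model card, and it ignores attention cost entirely.

```python
# Parameter counts quoted in the article, in billions.
MODELS = {
    "V4-Pro": {"total_b": 1600, "active_b": 49},
    "V4-Flash": {"total_b": 284, "active_b": 13},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's weights that take part in one forward pass."""
    return active_b / total_b


def approx_forward_gflops_per_token(active_b: float) -> float:
    """Rough estimate, ASSUMING ~2 FLOPs per active parameter per token."""
    return 2 * active_b  # billions of parameters * 2 = GFLOPs


for name, params in MODELS.items():
    frac = active_fraction(params["total_b"], params["active_b"])
    gflops = approx_forward_gflops_per_token(params["active_b"])
    print(f"{name}: {frac:.1%} of weights active, ~{gflops:.0f} GFLOPs per token")
```

Under these assumptions, only about 3% of V4-Pro’s weights and under 5% of V4-Flash’s weights are exercised for any single token, which is why the per-token cost tracks the active count rather than the headline total.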

The real architectural news is attention. V3.2 relied on Multi-head Latent Attention (MLA), DeepSeek’s signature KV-cache compression trick. V4 introduces a two-layer hybrid: Compressed Sparse Attention (CSA) handles long-range token selection via learned sparsity patterns, while Heavily Compressed Attention (HCA) applies extreme rank reduction to the local-window cache. The combined effect, documented in the model card, is that V4-Pro running a 1M-token prompt consumes 27% of the FLOPs and 10% of the KV cache memory required by V3.2 at the same context length. For agentic workloads that stream tool outputs into context for hours, this is the difference between an unprofitable demo and a profitable production product.

Training also got a refresh. V4 was pre-trained on 33 trillion tokens, up from 14.8T for V3.2, using the Muon optimizer in place of AdamW. Muon, originally proposed by Keller Jordan in 2024, applies orthogonalized matrix updates that converge faster on transformer weights; DeepSeek’s adoption at frontier scale is the most prominent production validation of the optimizer to date. The combined effect – bigger data, sparser compute, more efficient optimizer – produced an MMLU base score of 90.1 for V4-Pro, versus 87.8 for V3.2 and 88.7 for V4-Flash.
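To make “orthogonalized matrix updates” concrete, here is a minimal pure-Python sketch of the idea, not DeepSeek’s training code. It swaps in the classic cubic Newton-Schulz iteration for simplicity, whereas the published Muon reference implementation uses a tuned quintic variant. The learning rate, momentum coefficient, and the 2x2 gradient are hypothetical values chosen for illustration.

```python
import math

Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def orthogonalize(m: Matrix, steps: int = 20) -> Matrix:
    """Approximate the nearest orthogonal matrix with cubic Newton-Schulz.

    Simplification: Muon itself uses a tuned higher-order polynomial.
    """
    # Scale by the Frobenius norm so every singular value is at most 1.
    norm = math.sqrt(sum(v * v for row in m for v in row)) or 1.0
    x = [[v / norm for v in row] for row in m]
    for _ in range(steps):
        # X <- 1.5 * X - 0.5 * (X X^T X) pushes each singular value toward 1.
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(r1, r2)] for r1, r2 in zip(x, xxt_x)]
    return x


def muon_like_step(weight: Matrix, grad: Matrix, momentum: Matrix,
                   lr: float = 0.02, beta: float = 0.95):
    """One update: accumulate momentum, orthogonalize it, then step."""
    momentum = [[beta * m + g for m, g in zip(mr, gr)] for mr, gr in zip(momentum, grad)]
    update = orthogonalize(momentum)
    weight = [[w - lr * u for w, u in zip(wr, ur)] for wr, ur in zip(weight, update)]
    return weight, momentum


# Hypothetical 2x2 example.
grad = [[3.0, 1.0], [0.0, 2.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
momentum = [[0.0, 0.0], [0.0, 0.0]]
weight, momentum = muon_like_step(weight, grad, momentum)

update = orthogonalize(momentum)
print(matmul(update, transpose(update)))  # close to the identity matrix
```

The point of the orthogonalization is that every direction in the update gets the same magnitude, so a few dominant gradient directions cannot crowd out the rest.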

V4 vs V3.2: What Changed Under the Hood

Dimension	DeepSeek V3.2	DeepSeek V4-Pro	Change
Total parameters	671B	1.6T	+138%
Active parameters per token	37B	49B	+32%
Context window	128K tokens	1,000,000 tokens	7.8x
Attention design	MLA	Hybrid CSA + HCA	New
Optimizer	AdamW	Muon	New
Training tokens	14.8T	33T	2.23x
1M-context inference FLOPs	Baseline (100%)	27% of V3.2	-73%
1M-context KV cache	Baseline (100%)	10% of V3.2	-90%
MMLU (base)	87.8	90.1	+2.3 pts
License	MIT	MIT	Unchanged
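The “Change” column can be recomputed from the two model columns. The short check below uses only figures from the table; the one assumption is that “128K” means 128,000 tokens (reading it as 131,072 would give about 7.6x instead).

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


def multiple(old: float, new: float) -> float:
    """How many times larger new is than old."""
    return new / old


print(f"Total parameters:    {pct_change(671, 1600):+.0f}%")
print(f"Active parameters:   {pct_change(37, 49):+.0f}%")
print(f"Context window:      {multiple(128_000, 1_000_000):.1f}x")  # assumes 128K = 128,000
print(f"Training tokens:     {multiple(14.8, 33):.2f}x")
print(f"1M-context FLOPs:    {pct_change(100, 27):+.0f}%")
print(f"1M-context KV cache: {pct_change(100, 10):+.0f}%")
```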

Pricing Shock: $0.14 Input Tokens and the Margin Wave

The pricing card published with the preview is the document closed-frontier labs feared most. V4-Flash sells API access at $0.14 per million input tokens, while V4-Pro charges $1.74 per million input tokens. Even before factoring in output prices, V4-Pro lands roughly an order of magnitude below Anthropic’s Claude Opus 4.6 list price and a similar gap below the high-tier OpenAI GPT-5.4 endpoint. V4-Flash undercuts the cheapest mainstream US-hosted frontier endpoint by between 5x and 12x depending on the comparison.

The implications for inference-margin economics are stark. Anthropic and OpenAI have spent the past 18 months teaching enterprise procurement teams that frontier-quality output is worth $10-$30 per million tokens. A model that scores within a few points on SWE-bench Verified at $1.74 forces a conversation in every CTO’s office about whether quality differentials justify a 10x price premium. For agent-heavy use cases that burn billions of tokens per month – coding copilots, research assistants, customer-support automation – V4-Pro becomes the default sandbox model, with closed-source models reserved for the highest-stakes calls.
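A back-of-the-envelope bill shows what the list prices mean at agent scale. This sketch covers input tokens only, since the article does not quote output prices. The 5-billion-token monthly volume is a hypothetical workload, and the $15 figure is the approximate Claude Opus 4.6 list price this article uses, not an official quote.

```python
# Input-token list prices in USD per million tokens, as quoted in the article.
INPUT_PRICE_PER_M = {
    "DeepSeek V4-Flash": 0.14,
    "DeepSeek V4-Pro": 1.74,
    "Claude Opus 4.6 (approx.)": 15.00,
}

MONTHLY_INPUT_TOKENS = 5_000_000_000  # hypothetical agent workload


def monthly_input_cost(tokens: int, price_per_million: float) -> float:
    """Input-token spend in USD for one month at a flat list price."""
    return tokens / 1_000_000 * price_per_million


for model, price in INPUT_PRICE_PER_M.items():
    cost = monthly_input_cost(MONTHLY_INPUT_TOKENS, price)
    print(f"{model}: ${cost:,.0f} per month")
```

At that hypothetical volume the same prompt traffic costs a few hundred dollars on V4-Flash, several thousand on V4-Pro, and tens of thousands at the closed-source list price.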

This is the second time in 15 months that DeepSeek has used a pricing card as a market weapon. In January 2025, the R1 launch slashed reasoning-model token economics by an order of magnitude and triggered the largest single-day market-cap loss in stock-market history. The 2026 version differs in two ways. First, V4 is not a research preview that requires reproduction work; the open weights are in production users’ hands the day of launch. Second, the Huawei compatibility means that customers affected by US export restrictions – Chinese enterprises, sovereign-AI buyers in the Gulf, parts of Southeast Asia – can now serve frontier-class inference on hardware that falls outside the BIS Advanced Computing rules.

Huawei Ascend Supernode: Day-Zero Frontier Support

Huawei’s announcement, published on Huawei Central and confirmed in South China Morning Post coverage, declared “full support” for DeepSeek V4 on the Ascend AI supernode platform from the moment of the model’s release. The phrase “day zero” matters: it means DeepSeek shared pre-release weights, kernel optimizations, and quantization recipes with Huawei’s compiler team weeks ahead of the public drop, rather than waiting for the open-source community to figure out Ascend deployment after the fact.

Reuters reporting describes the relationship as one in which DeepSeek granted early access to domestic Chinese hardware partners, including Huawei, while withholding the new model from US chip vendors for the optimization tuning window. The implication: V4 is now the reference workload that the Ascend software stack is fastest at running, inverting the usual dynamic where Chinese accelerators retrofit support for US-trained models after launch.

The Ascend supernode is Huawei’s answer to Nvidia’s GB200 NVL72 rack-scale design. Public disclosures around the CloudMatrix architecture describe clusters built from hundreds of Ascend accelerators connected over a proprietary unified-bus fabric, addressing the bandwidth and memory-pooling requirements of trillion-parameter MoE inference. The April 24 announcement effectively pairs that hardware platform with a workload it can run at full utilization, giving Huawei a credible end-to-end story to pitch to state-owned enterprises, banks, and ministries in China and to sovereign-AI buyers in markets where US chip imports face friction.

Benchmark Scores: How V4 Stacks Up Against GPT-5.4 and Claude Opus 4.6

The model card published with the preview includes a benchmark grid covering the standard frontier evaluation suite. V4-Pro-Max – the highest-effort inference configuration – posts 80.6% on SWE-bench Verified, the gold-standard test of autonomous software-engineering capability, and 93.5 on LiveCodeBench, the contamination-resistant coding benchmark introduced by academic researchers in 2024. On general-knowledge MMLU, the V4-Pro base model scores 90.1, with AGIEval at 83.1 and V4-Flash-Base trailing at 88.7 and 82.6 respectively.

These numbers place V4-Pro within roughly two points of the leading closed-source models on the most-cited public benchmarks, though independent reproductions are still rolling out as of mid-May 2026. The gap is small enough that, when paired with the 10x pricing differential and the open-weight license, the value comparison becomes unfavorable for closed-source vendors on any workload that does not strictly require the absolute frontier of capability. For deeper analysis of how V4 compares head-to-head against the leading US frontier models, see our companion piece on DeepSeek vs ChatGPT and our quarterly Anthropic vs OpenAI tracker.

Frontier Model Benchmark Table: V4 vs Closed-Source Rivals

Model	SWE-bench Verified	LiveCodeBench	MMLU	Context	Input price ($/M tokens)	License
DeepSeek V4-Pro-Max	80.6%	93.5	90.1 (base)	1,000,000	$1.74	MIT (open)
DeepSeek V4-Flash	n/a (preview)	n/a (preview)	88.7 (base)	1,000,000	$0.14	MIT (open)
Claude Opus 4.6	80.8%	~93	~89	200K	~$15	Proprietary
OpenAI GPT-5.4 (high)	~77%	~92	~88	400K	~$10-15	Proprietary
Google Gemini 3.0 Pro	~78%	~91	~88	2,000,000	~$7	Proprietary
DeepSeek V3.2	~71%	~85	87.8 (base)	128K	$0.27	MIT (open)

The price column is the line investors and procurement officers cannot ignore. At $1.74 per million input tokens, V4-Pro is selling output that is competitive with $15-per-million Claude Opus 4.6 – a list-price gap of roughly 8.6x for output that scores within 0.2 percentage points on SWE-bench Verified. Even when Anthropic and OpenAI grant enterprise discounts of 40-60%, the structural pricing gap to V4-Pro remains larger than the quality gap on most benchmarks the buyers actually care about.
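The discount claim is easy to check. The sketch below applies the 40-60% enterprise discount range mentioned above to the approximate $15 list price and compares the result with V4-Pro’s $1.74; both prices are the article’s figures, and actual negotiated rates will vary.

```python
V4_PRO_INPUT_PRICE = 1.74  # USD per million input tokens, from the article


def price_gap(rival_list_price: float, discount: float,
              v4_price: float = V4_PRO_INPUT_PRICE) -> float:
    """Ratio of a rival's discounted price to the V4-Pro list price."""
    return rival_list_price * (1 - discount) / v4_price


for discount in (0.0, 0.40, 0.60):
    gap = price_gap(15.0, discount)
    print(f"{discount:.0%} discount on $15 list: {gap:.1f}x the V4-Pro price")
```

Even at the deep end of that discount range, the closed-source input price stays at more than three times V4-Pro’s list price.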

Stock Market Reaction: SMIC Up 10%, Nvidia Holds at $199.96

Markets digested the April 24 launch with a precision that belied the dramatic framing in the press. Nvidia (NVDA) closed the session at $199.96, a small daily move that left the stock up roughly 13.4% on the month. The contrast with the January 2025 R1 episode – when NVDA shed nearly 17% in a single day on the original DeepSeek shock – is the central trading story. Investors interpreted V4 as evidence that the AI compute market is bifurcating, not collapsing. Demand for Nvidia’s Blackwell and upcoming Rubin platforms remains anchored by US hyperscalers and sovereign-AI buyers who still prefer the CUDA ecosystem; demand for Huawei Ascend now has a credible workload story for the Chinese and Chinese-adjacent markets that were already off-limits to Nvidia under the BIS Advanced Computing rules.

Hong Kong told the bifurcation story more loudly. SMIC jumped approximately 10% on the news as traders priced in higher wafer demand for the 7nm-class Ascend silicon. Cambricon and Hua Hong rallied in sympathy in Shanghai trading. Conversely, shares in second-tier Chinese AI model vendors – including MiniMax and Knowledge Atlas, both of which compete with DeepSeek in the domestic foundation-model market – fell more than 9%, according to Fortune’s market roundup. The clear message: DeepSeek has consolidated its position as the de facto Chinese frontier-model national champion, and the rest of the domestic AI cohort just got squeezed.

The cross-currents for US-listed semiconductor names extended beyond Nvidia. Broadcom and Marvell, which sell custom AI accelerator IP to hyperscalers, traded essentially flat – their order books are tied to US cloud capex, which has not been affected by the V4 launch. AMD held its ground as investors priced the launch as a Nvidia-specific competitive event rather than a broader compute thesis change. For a deeper look at the underlying capex dynamics, see our analysis of Big Tech’s $650 billion AI infrastructure bet and the Nvidia Vera Rubin platform roadmap.

Expert Quotes: What Analysts Are Saying

The named-expert reaction crystallized quickly across Asia-Pacific and US sell-side desks. Su Lingyao, analyst at BOC International, told the South China Morning Post that the launch lowers the cost barrier for enterprise adoption: “DeepSeek’s V4 has lowered the threshold for using high-performance AI models and will offer more affordable AI capabilities to small and medium-sized enterprises or even individuals. DeepSeek’s V4 is also highly compatible with domestically made chips, and that will accelerate the commercialisation of AI computing power in China.”

Jensen Huang, founder and CEO of Nvidia, framed the strategic stakes in a widely circulated remark: “The best AI researchers in the world, because they are limited in compute, also come up with extremely smart algorithms.” Huang separately observed that “the day that DeepSeek comes out on Huawei first, that is a horrible outcome for the U.S.” The quote, delivered before the April launch but circulated heavily in the wake of the V4 drop, captures the export-control dilemma now confronting Washington: restrictions intended to slow Chinese AI progress have instead created the demand pull that Huawei needed to harden its software stack.

The Counterpoint Research principal AI analyst characterized the model in coverage as offering “excellent agent capability at significantly lower cost,” the talking point that procurement teams across Asia have seized on. Omdia’s analyst commentary, paraphrased in subsequent TechRepublic coverage, observed that the V4 partnership demonstrates DeepSeek models can deliver similar performance on both Huawei and Nvidia hardware – eliminating the lock-in argument that had previously favored the CUDA stack in mixed-fleet environments. For US sell-side perspective, Bernstein semiconductor research has maintained that Nvidia’s near-term growth thesis remains intact because the addressable buyer pool in China was already restricted under the BIS rules; what V4 does is harden Huawei’s position in that already-segregated market, not steal customers from Nvidia in the West.

The Open-Weight Trojan Horse: MIT License at 1.6T Parameters

V4 ships under the MIT license, one of the most permissive widely used software licenses and the same regime DeepSeek used for V3 and R1. This matters in three ways. First, any organization can take the weights, modify them, and redeploy commercially without royalties; the only condition is that the copyright and permission notice be retained in copies. Second, sovereign-AI projects – from Riyadh to Paris to Jakarta – gain a frontier-class base model they can fine-tune on national data without negotiating bilateral terms with a US lab. Third, the open weights become the new floor against which closed-source labs must justify their pricing premium on a per-task basis.

The strategic logic resembles the Android playbook of the 2010s. By giving away the model, DeepSeek captures distribution, attracts external safety and capability research at zero cost, and forces competitors into a value proposition that must clear a much higher bar. Where Google used Android to commoditize Apple’s iOS revenue, DeepSeek is using MIT-licensed V4 to commoditize the proprietary frontier API margin. The recipient of the value is anyone running large-scale inference workloads.

For US labs, the strategic response options narrow. Closing further is increasingly difficult – open-weight model quality is now within striking distance of the closed frontier, and customers can self-host. Opening up means surrendering the gross-margin profile that justifies the capex; OpenAI’s most recent funding round, at an $852 billion valuation, was priced off the assumption that frontier model access remains a controlled scarcity. V4 is the most direct attack to date on that assumption.

US Export Controls and the Compute-Independence Narrative

The geopolitical framing is impossible to ignore. The Biden-era October 2022 and October 2023 export-control packages tightened the flow of advanced AI accelerators from the United States to China. The Trump administration rescinded the AI Diffusion Rule in May 2025 but maintained the country-specific controls on H100, H200, B100, and B200 sales to China, leaving Nvidia with only the cut-down H20 and a handful of other compliant SKUs to sell into the Chinese market.

The intent of those controls was to slow Chinese AI capability development by capping access to leading-edge compute. The unintended consequence has been to create a guaranteed buyer for Huawei’s Ascend roadmap. For three years, Chinese AI labs trained on a mix of stockpiled Nvidia silicon, cut-down H20s, and progressively maturing Ascend systems. V4 is the data point that says the substitute has reached production grade. Once Huawei demonstrates it can serve a 1.6T-parameter MoE model at frontier latency, the political argument for further restricting Nvidia sales to China weakens – there is less to restrict if the buyers no longer need Nvidia. Conversely, the argument for restricting Huawei’s access to foreign foundry capacity and EDA tools intensifies, because that is the remaining chokepoint.

The Bureau of Industry and Security’s Advanced Computing rules continue to govern the legal contours of the trade. Whether the V4 launch triggers a tightening of the entity-list scope, a re-tightening of the foundry-tool rules, or a renewed push for an AI Diffusion-style framework will be one of the dominant policy questions through the rest of 2026.

Liang Wenfeng and the High-Flyer Origin Story

DeepSeek was founded by Liang Wenfeng in May 2023 as the AI research arm of High-Flyer Quant, the Hangzhou-based hedge fund Liang co-founded in 2015. High-Flyer had spent years building a multi-thousand-GPU cluster for quantitative trading research, and that compute base – assembled when Nvidia A100 and H100 supply was easier to source for Chinese buyers – became the foundation for DeepSeek’s training runs. Liang’s public posture has been deliberately quiet: he gives few interviews, makes no marketing claims, and lets the open weights speak for the lab’s capabilities. The V4 launch fits that pattern. There was no keynote, no demo video, no influencer push – just a model card, a pricing page, and the Hugging Face checkpoint.

The company raised its first external funding round in early 2026 at a reported $10 billion valuation after two years of being self-funded by High-Flyer. The round, detailed in our coverage of the DeepSeek $300 million raise, included a mix of domestic strategic investors and validated the lab’s transition from a hedge-fund research arm to a standalone AI company. V4 is the first major release since that round closed, and the open-weight + low-price strategy suggests the new investors have aligned with Liang’s view that distribution beats margin in the current phase of the market.

Hyperscaler Reactions: What AWS, Azure, and Google Cloud Do Next

Within 48 hours of the V4 launch, all three US hyperscalers had moved to add DeepSeek V4 to their managed-model catalogs. AWS Bedrock, Azure AI Foundry, and Google Vertex AI now offer V4-Pro and V4-Flash under their pay-per-token meters, with regional availability in North America, Europe, and select Asia-Pacific markets. The economics for the hyperscalers are straightforward: an open-weight frontier model at this price point is a customer-retention tool – if a CTO can run V4-Pro on AWS Bedrock, that CTO is less likely to migrate to a self-hosted setup or a Chinese-cloud-hosted alternative.

For Anthropic and OpenAI, the hyperscaler-hosting move is double-edged. On the one hand, V4 sits beside Claude and GPT-5 on the same procurement page, making the price-quality comparison unavoidable. On the other hand, hosting V4 in US-jurisdiction clouds gives Western customers a way to use the model without data sovereignty concerns about China-hosted inference. Whether enterprise customers actually shift token volume from Claude and GPT to V4 in the second half of 2026 will be the single most important leading indicator of whether the open-weight commoditization thesis plays out.

Code Example: Calling V4-Pro via the DeepSeek API

The V4 API maintains compatibility with the OpenAI-style chat-completions schema that V3 used, lowering the migration cost for existing applications. A minimal Python call against V4-Pro looks as follows:

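from openai import OpenAI

# The endpoint follows the OpenAI chat-completions schema, so the standard client works.
client = OpenAI(
    api_key="YOUR_DEEPSEEK_KEY",  # placeholder; substitute a real key
    base_url="https://api.deepseek.com/v1",
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Refactor this 200K-token codebase for clarity."},
    ],
    max_tokens=8192,  # cap on generated tokens
    temperature=0.2,  # low temperature for more deterministic output
)

# Print the text of the first (and only) completion.
print(response.choices[0].message.content)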

The same code, with the model identifier swapped to deepseek-v4-flash, drops the per-token cost by an order of magnitude – useful for high-volume background workflows where latency and price matter more than absolute capability. For agentic deployments that need to stream millions of tokens through context, the Hugging Face checkpoint can be self-hosted on Nvidia H200 or H100 clusters in the West, or on Huawei Ascend supernodes in markets where that is the available silicon.
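One way to act on that split is a small routing helper that picks the tier per request. This is a sketch, not an official pattern: the two model identifiers come from this article, while the `high_stakes` flag, the `build_request` helper, and the token budgets are hypothetical names and values chosen for illustration.

```python
PRO_MODEL = "deepseek-v4-pro"      # identifier as given in the article
FLASH_MODEL = "deepseek-v4-flash"  # identifier as given in the article


def build_request(prompt: str, high_stakes: bool) -> dict:
    """Return keyword arguments for client.chat.completions.create().

    high_stakes is a hypothetical flag: True routes to V4-Pro,
    False routes cheaper background work to V4-Flash.
    """
    return {
        "model": PRO_MODEL if high_stakes else FLASH_MODEL,
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical budgets: a longer answer allowance for the flagship tier.
        "max_tokens": 8192 if high_stakes else 1024,
        "temperature": 0.2,
    }


request = build_request("Summarize today's build logs.", high_stakes=False)
print(request["model"])

# With a configured client, the request would be sent as:
# response = client.chat.completions.create(**request)
```

Keeping the routing decision in one function makes it simple to change the policy later, for example by escalating to V4-Pro only after a V4-Flash answer fails a validation check.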

Five Predictions: Where the V4 Shockwave Travels Next

Prediction 1: Closed-source frontier pricing falls 30-50% by Q4 2026. Anthropic and OpenAI cannot hold $15-per-million-token list prices against an open-weight competitor at $1.74 with comparable benchmark scores. Expect a series of “Pro” and “Lite” tier reshuffles that effectively cut the headline price while preserving margin on the highest-effort inference configurations.

Prediction 2: At least one US lab open-weights a near-frontier model by year-end 2026. Meta’s Llama family has been the closest US analog, but the gap between open-weight Llama and the closed-source frontier has widened in 2026. V4 makes the strategic argument for a US response unavoidable; the most likely candidate is Meta releasing a 1T-parameter MoE under the Llama Community License.

Prediction 3: Huawei Ascend revenue doubles year-over-year in 2026. The day-zero V4 support is the reference workload Huawei needed to convert pilot deployments at Chinese state-owned enterprises, banks, and ministries into full production rollouts. Expect Ascend revenue to be the standout disclosure in Huawei’s next annual report.

Prediction 4: Sovereign-AI projects standardize on open-weight V4 or successors. The UK, France, Saudi Arabia, UAE, India, and Japan all have public sovereign-AI initiatives. None of them wants to be dependent on a single US lab. V4-class open weights are the obvious base layer, with national fine-tuning on top. Expect at least three sovereign-AI vendors to announce V4-based deployments by September 2026.

Prediction 5: The next round of US export controls targets Huawei’s foundry access, not Nvidia’s chip sales. With Huawei now demonstrably able to serve frontier inference workloads, the policy debate shifts from “restrict Nvidia exports to China” to “restrict foundry capacity available to Huawei.” This is a politically harder lever to pull because it implicates both TSMC and SMIC’s own supply-chain ties to the US-aligned ecosystem, but it is the logical chokepoint.

What This Means for Builders and CTOs

For engineering leaders weighing model selection in the second half of 2026, the V4 launch reshapes the decision tree. The default question is no longer “Claude or GPT?” but “open-weight V4-class on managed cloud, or closed-source frontier API?” The answer for cost-sensitive, high-volume workloads – coding agents, document processing, customer-support automation, internal-tool copilots – increasingly tilts open-weight. The answer for absolute-frontier reasoning, regulated industries, and workloads with strict data-residency or model-behavior guarantees may still favor Claude or GPT, but the size of that premium is now visible to procurement teams in a way it was not before April 24.

The deployment options matter. V4-Pro can run on the DeepSeek-hosted API at $1.74 per million input tokens, on AWS Bedrock and Azure for US-jurisdiction compliance, on self-hosted Nvidia H100/H200/B200 clusters for full data control, or on Huawei Ascend supernodes in markets where that is the available silicon. The Hugging Face checkpoint is the single source of truth – every deployment target is downstream of the same weights, with vendor-specific kernel optimizations and quantization recipes layered on top.

Historical Context: How V4 Fits the DeepSeek Arc

The arc from DeepSeek’s first model, DeepSeek-LLM 67B in late 2023, to V4-Pro at 1.6T in April 2026 is roughly a 24x parameter-count expansion in 28 months, with the headline efficiency gains concentrated in the V3 and V4 generations. The pivotal moment for the broader market was the January 2025 R1 release, which produced the largest single-day market-cap loss in stock-market history when Nvidia shares fell roughly 17% in the trading session after the model’s reasoning capabilities became clear. V4 is the second-act pivot from “DeepSeek can match the frontier” to “DeepSeek can match the frontier on Chinese silicon at one-tenth the price.”

The Stanford AI Index 2026 documented that the US-China performance gap on standardized benchmarks narrowed to roughly 2.7 percentage points in the latest reporting period. V4 is the data point that brings that gap to near parity on agentic coding benchmarks specifically: by DeepSeek’s reported figures, V4-Pro-Max trails Claude Opus 4.6 by 0.2 points on SWE-bench Verified (80.6% versus 80.8%) and edges ahead on LiveCodeBench (93.5 versus roughly 93). The combination of capability parity and pricing dislocation is what makes the April 24 launch a genuine inflection point rather than another incremental release.

Frequently Asked Questions

When was DeepSeek V4 launched?

DeepSeek V4 launched as a preview release on Friday, April 24, 2026, with open weights for both V4-Pro and V4-Flash published simultaneously on the DeepSeek API documentation portal and on Hugging Face under the MIT license.

How many parameters does DeepSeek V4 have?

V4-Pro has 1.6 trillion total parameters with 49 billion active per token. V4-Flash has 284 billion total parameters with 13 billion active per token. Both are sparse Mixture-of-Experts designs with a 1,000,000-token context window.

What is the API pricing for DeepSeek V4?

V4-Pro lists at $1.74 per million input tokens, and V4-Flash lists at $0.14 per million input tokens on the official DeepSeek API. Output-token pricing varies by tier and is published on the DeepSeek API documentation page.

Does DeepSeek V4 run on Huawei Ascend chips?

Yes. Huawei announced “day zero” full support for DeepSeek V4 on its Ascend AI supernode platform on April 24, 2026. DeepSeek granted Huawei early access to pre-release weights and kernel optimizations to ensure production-grade performance at launch.

How does V4-Pro compare to Claude Opus 4.6 and GPT-5.4?

V4-Pro-Max scores 80.6% on SWE-bench Verified, within 0.2 points of Claude Opus 4.6 and ahead of most public reports for GPT-5.4. The headline trade-off is price: V4-Pro’s $1.74 input price is roughly 8.6x below the approximately $15 list price for Claude Opus 4.6, with a 1,000,000-token context that is 5x longer than Claude Opus 4.6’s 200K window.

Is DeepSeek V4 open source?

The weights for both V4-Pro and V4-Flash are released under the MIT license on Hugging Face, allowing free commercial use, modification, and redistribution. The training code, data, and full reproduction recipes are not public, so the release is more accurately described as “open weights” than “fully open source.”

How did the V4 launch affect Nvidia and SMIC stocks?

Nvidia closed at $199.96 on April 24, 2026, a small daily move that left the stock up roughly 13.4% on the month. SMIC jumped approximately 10% in Hong Kong trading on the same session, as investors priced in higher demand for the foundry’s 7nm-class output for Huawei Ascend silicon.

Who founded DeepSeek?

DeepSeek was founded by Liang Wenfeng in May 2023 as the AI research arm of High-Flyer Quant, the Hangzhou-based hedge fund Liang co-founded in 2015. DeepSeek raised its first external funding round in early 2026 at a reported $10 billion valuation after two years of being self-funded by High-Flyer.

Where can I find the DeepSeek V4 model card?

The official preview model card is published on the Hugging Face DeepSeek-V4-Pro page, with launch notes and pricing details on the DeepSeek API documentation portal. Background on DeepSeek the company is summarized on Wikipedia.

Sources: DeepSeek API documentation, Hugging Face DeepSeek-V4-Pro model card, South China Morning Post analyst coverage, Huawei Ascend supernode disclosure, Stanford AI Index 2026, US Bureau of Industry and Security Advanced Computing rules.

Elias Virtanen

Cybersecurity Analyst

Elias Virtanen is the Cybersecurity Analyst at Tech Insider, bringing hands-on expertise from his background in penetration testing and security consulting. He previously worked as a security researcher at F-Secure in Helsinki, where he focused on threat intelligence and vulnerability disclosure. Elias covers ransomware trends, zero-trust architecture, and the evolving regulatory landscape including NIS2 and the EU Cyber Resilience Act. He holds a CISSP certification and an MSc in Information Security from Aalto University.

URL: https://tech-insider.org/deepseek-v4-huawei-ascend-1-6-trillion-parameter-moe-2026/
