
URL: https://thenewstack.io/jetbrains-mellum2-open-source-coding-model/

JetBrains open-sources Mellum2 to go where Claude Code can't - The New Stack


Mellum2 is fast, open source, and runs entirely on your own infrastructure — a challenge to coding tools that depend on third-party APIs to function.
Jun 1st, 2026 4:47pm by Paul Sawers

JetBrains announced on Monday it has open-sourced Mellum2, a 12B-parameter coding model aimed at the infrastructure layer of agentic AI systems — routing, retrieval pipelines, and sub-agent tasks — as well as private on-premises deployment, somewhere Claude Code and its ilk can’t go.

It’s the follow-on to Mellum, a 4B-parameter model that JetBrains debuted in late 2024 as a proprietary code completion tool for its own IDEs before open-sourcing it in April 2025. But unlike its predecessor, Mellum2 is open from day one.

Mellum2’s scope has also changed considerably. Where Mellum did one thing, code completion, Mellum2 is built for the broader set of tasks that now define how engineering teams are deploying AI: coordinating between models, handling sub-agent workloads, compressing context in retrieval pipelines, and running inference on infrastructure that teams control themselves.

In a blog post co-authored by staff research engineer Nikita Pavlichenko and product manager Anton Semenkin, JetBrains describes Mellum2 as a “focal model” — fast and specialized, rather than competing with frontier models on breadth.

“Frontier models will continue to push the limits, but practical AI products also require focal models: fast, specialized components that handle high-frequency tasks efficiently,” they write. “This specialization ensures the model excels in software engineering environments while remaining lean and fast.”

Additionally, two post-trained variants ship alongside the base model: an “instruct” version that answers directly, and a “thinking” version that produces an explicit reasoning trace before responding, aimed at harder multi-step and agentic tasks.

Built for speed at scale

Mellum2 uses a Mixture-of-Experts (MoE) architecture, with 12B total parameters but only 2.5B active per token. The design routes each token through a subset of the model’s 64 experts rather than the full network, which keeps inference fast without sacrificing the model’s overall capacity.
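
The routing idea described above can be illustrated with a toy sketch. This is not JetBrains' implementation: the article gives only the expert count (64), so the number of experts selected per token (`TOP_K`), the hidden size, the linear router, and the trivial "experts" below are all assumptions chosen to keep the example small and runnable.

```python
import math
import random

NUM_EXPERTS = 64  # from the article
TOP_K = 8         # assumption: the article does not say how many experts fire per token
HIDDEN = 16       # toy hidden size, for illustration only


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(token, router_weights, k=TOP_K):
    """Return [(expert_index, gate_weight)] for the k highest-scoring experts."""
    # One router logit per expert: dot product of that expert's router row with the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Normalize the gate weights over the selected experts only.
    gates = softmax([logits[i] for i in top])
    return list(zip(top, gates))


def moe_forward(token, router_weights, experts, k=TOP_K):
    """Run only the selected experts and mix their outputs by gate weight."""
    out = [0.0] * len(token)
    for idx, gate in route(token, router_weights, k):
        expert_out = experts[idx](token)
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out


def make_expert(scale):
    # Hypothetical stand-in for an expert feed-forward block: scales its input.
    return lambda token: [scale * x for x in token]


rng = random.Random(0)
router_weights = [[rng.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(NUM_EXPERTS)]
experts = [make_expert(1.0 + i / NUM_EXPERTS) for i in range(NUM_EXPERTS)]
token = [rng.gauss(0, 1) for _ in range(HIDDEN)]

chosen = route(token, router_weights)
output = moe_forward(token, router_weights, experts)
```

Only the `TOP_K` selected experts are ever called for a given token, which is why per-token compute tracks the active parameter count rather than the total.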

In its technical report, JetBrains benchmarked Mellum2 against Alibaba’s Qwen2.5-7B and Qwen3-8B on a single H100 GPU, using input and output sizes representative of real production code completion workloads.

In single-request mode, Mellum2 matches Qwen2.5-7B almost exactly: 192 tokens per second versus 193. Under concurrent load, which is where production deployments actually operate, it pulls 21% ahead of Qwen2.5-7B and 79% ahead of Qwen3-8B.
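
As a quick check on what these figures imply, the arithmetic can be worked through directly. The sketch below uses only the numbers quoted above and assumes both concurrent-load percentages are measured against the same Mellum2 throughput. The article does not report absolute throughput under load, so only ratios are computed.

```python
# Single-request throughput figures quoted in the article (tokens per second).
single_a, single_b = 192, 193

# Size of the single-request gap, relative to the larger figure.
single_gap_pct = abs(single_a - single_b) / max(single_a, single_b) * 100

# Concurrent load, as ratios of Mellum2 throughput to each baseline:
#   Mellum2 = 1.21 * Qwen2.5-7B   (21% ahead)
#   Mellum2 = 1.79 * Qwen3-8B     (79% ahead)
mellum_over_qwen25 = 1.21
mellum_over_qwen3 = 1.79

# Assumption: both percentages refer to the same Mellum2 measurement, so the
# two baselines can be compared with each other by dividing the ratios.
qwen25_over_qwen3_pct = (mellum_over_qwen3 / mellum_over_qwen25 - 1) * 100

print(f"single-request gap: {single_gap_pct:.2f}%")
print(f"implied Qwen2.5-7B lead over Qwen3-8B under load: {qwen25_over_qwen3_pct:.1f}%")
```

Under that assumption, the single-request gap is about half a percent, and the quoted figures imply that Qwen2.5-7B itself runs roughly 48% ahead of Qwen3-8B under the same concurrent load.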

The cost profile follows the same logic. With only 2.5B parameters active per token, the architecture is designed to behave more like a 2.5B model than a conventional 12B dense model from an inference perspective — relevant for teams routing high volumes of requests through it daily as part of a larger agentic system.
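
A back-of-the-envelope calculation shows what "behaves like a 2.5B model" does and does not cover. The sketch below rests on two assumptions that are not from the article: the common rule of thumb of roughly 2 FLOPs per active parameter per generated token (which ignores attention and routing overhead), and 16-bit weights at 2 bytes per parameter.

```python
TOTAL_PARAMS = 12e9    # from the article
ACTIVE_PARAMS = 2.5e9  # from the article

FLOPS_PER_PARAM = 2    # assumption: rule-of-thumb forward-pass cost per parameter per token
BYTES_PER_PARAM = 2    # assumption: 16-bit (bf16/fp16) weights


def forward_flops_per_token(params):
    """Approximate forward-pass FLOPs for one token."""
    return FLOPS_PER_PARAM * params


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

moe_flops = forward_flops_per_token(ACTIVE_PARAMS)   # only the active experts run
dense_flops = forward_flops_per_token(TOTAL_PARAMS)  # a dense 12B model uses every weight
compute_ratio = dense_flops / moe_flops

# Every expert still has to be available to the router, so weight memory is
# sized by the total parameter count, not the active one.
weight_memory_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

print(f"active fraction: {active_fraction:.1%}")
print(f"dense 12B / MoE compute per token: {compute_ratio:.1f}x")
print(f"weight memory at 16-bit: {weight_memory_gb:.0f} GB")
```

Under these assumptions, roughly a fifth of the weights are active per token and per-token compute is about 4.8 times lower than a dense 12B model, while weight memory still reflects the full 12B parameters.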

On function-level code generation, measured by the EvalPlus benchmark combining HumanEval+ and MBPP+, Mellum2 scores 78.4% in its thinking variant, ahead of the other models in the comparison table, among them Qwen3.5-9B at 71.8% and the code-specialized Seed-Coder-8B at 73.8%.

The picture becomes more mixed once the evaluation moves beyond software-engineering tasks. JetBrains’ own results show that Qwen3.5-9B retains an advantage in broader reasoning and knowledge evaluations, including GPQA Diamond and MMLU-Redux.

JetBrains acknowledges this directly in its technical report, noting that the model’s narrower training focus comes at a cost.

“The gap reflects a deliberate tradeoff in our training mix toward code and developer documentation rather than broad encyclopedic coverage,” the authors write.

The dependency argument

The more pointed case for Mellum2, perhaps, is about what it doesn’t require. Anthropic’s Claude Code and OpenAI’s Codex run locally on the client but route inference through Anthropic’s and OpenAI’s APIs, respectively.

Cursor, for what it’s worth, is also dabbling with its own proprietary coding model strategy, recently introducing Composer 2.5. Those capabilities remain tied to Cursor’s platform, while the company’s recently announced partnership with SpaceX’s xAI places another critical layer of the stack — infrastructure and future model development — outside customers’ control.

Mellum2 arrives with open weights under Apache 2.0, giving enterprises the option to own and operate that layer themselves. Whether that argument gains traction will depend on enterprise appetite for self-hosted AI infrastructure.

JetBrains is betting that deployment flexibility, operational control, and ownership will remain important considerations as AI becomes more deeply embedded in software engineering workflows. A reasonable bet — but one that remains to be proven at scale.

Mellum2 is now available on Hugging Face, with base, instruct, and thinking checkpoints released under Apache 2.0, along with the full technical report detailing the architectural decisions and training pipeline behind it.

TNS owner Insight Partners is an investor in: Anthropic, OpenAI.