JetBrains ships Mellum 2, a 12B MoE model built for IDE-native coding

The second-generation coding model targets code completion, tool use, and agentic workflows inside JetBrains IDEs, cutting inference cost through sparse activation.

Alessandro Benigni

PUBLISHED JUN 3, 2026

JetBrains has released Mellum 2, a 12-billion-parameter mixture-of-experts language model built specifically for coding, code reasoning, tool use, and agentic workflows. The technical report, published June 1 on arXiv, confirms this is the successor to the original Mellum, a smaller dense model JetBrains shipped in 2025.

The MoE architecture is the key engineering decision here. In a 12B total-parameter model with sparse activation, a router sends each token to a small subset of experts, so only a fraction of the parameters fire on any given token. That keeps inference compute and latency low. For an IDE vendor serving completion requests at very high volume, that cost profile matters more than headline parameter count. JetBrains can deliver competitive completion speeds without paying the per-query cost of a dense model at comparable capability.
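The routing idea described above can be sketched in a few lines of Python. This is a minimal illustration of generic top-k expert routing, not JetBrains' implementation. The expert count, the value of k, and the split between shared and per-expert parameters are all hypothetical, chosen only so the total comes to 12B. The report's abstract does not give the real figures.

```python
import math
import random


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over only the selected experts so their weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def active_params(shared, per_expert, k):
    """Parameters touched per token: shared layers plus k routed experts."""
    return shared + k * per_expert


# Hypothetical configuration, NOT taken from the Mellum 2 report.
NUM_EXPERTS = 16
TOP_K = 2
SHARED_PARAMS = 2_000_000_000      # attention, embeddings, etc. (assumed)
PER_EXPERT_PARAMS = 625_000_000    # 16 experts * 0.625B = 10B (assumed)

TOTAL_PARAMS = SHARED_PARAMS + NUM_EXPERTS * PER_EXPERT_PARAMS
ACTIVE_PARAMS = active_params(SHARED_PARAMS, PER_EXPERT_PARAMS, TOP_K)

if __name__ == "__main__":
    rng = random.Random(0)
    logits = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]
    print("routed experts:", top_k_route(logits, TOP_K))
    print(f"total={TOTAL_PARAMS:,} active={ACTIVE_PARAMS:,} "
          f"fraction={ACTIVE_PARAMS / TOTAL_PARAMS:.2%}")
```

Under these assumed numbers, each token touches 3.25B of the 12B parameters. The real ratio depends on figures JetBrains has not highlighted, which is why the active parameter count matters more than the headline total.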

The IDE coding-model market has split between licensing models and building them in-house. GitHub Copilot licenses OpenAI and Anthropic models and routes between them. Cursor, covered in an earlier edition, trained a proprietary completion model tuned to its own context window and diff format. JetBrains now holds a similar vertical position: a model it can iterate on independently, without waiting for OpenAI or Anthropic to push a release. For the roughly 15 million active JetBrains IDE users across IntelliJ, PyCharm, WebStorm, and GoLand, Mellum 2 is the engine behind code completion and agent features whether they configure it explicitly or not.

The arXiv paper covers the architecture and training design. The abstract does not specify the active parameter count under sparse activation, which is the number that determines real-world inference cost and latency. JetBrains has not released independent benchmark comparisons against Copilot or Cursor in the published material. That absence is worth noting: the announcement positions Mellum 2 on architecture and use-case fit, not on raw benchmark scores, which suggests JetBrains knows its advantage is integration depth rather than capability headroom over OpenAI-backed alternatives.

The broader pattern is consolidation of model ownership inside the IDE layer. Three years ago, IDE AI was almost entirely an API-consumption story: plug in an OpenAI key, call completions. That approach is eroding. Cursor trained its own model. JetBrains trained its own. The implication for the next generation of developer tools is that the underlying model becomes a product differentiation axis, subject to the same roadmap control as the editor itself. A team relying on Cursor’s completion model is betting on Cursor’s training priorities, not OpenAI’s.

Skepticism is warranted on the capability ceiling. JetBrains’ releases are technically credible, and the arXiv report follows the form of serious model papers. But closing the gap with GitHub Copilot, backed by GPT-4o and Claude Sonnet, on open-ended code generation tasks is a different challenge from winning on latency and IDE integration. Mellum 2’s pitch is depth of integration and model control. That pitch is real. It is not a claim to frontier benchmark performance.

Teams already on JetBrains IDEs do not need to change anything; Mellum 2 will ship through normal IDE updates. Teams currently evaluating whether to standardize on IntelliJ-family IDEs versus VS Code with Cursor should add Mellum 2’s completion latency and agentic reliability to their benchmark checklist before committing in the second half of 2026.
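For teams adding completion latency to that checklist, a minimal timing harness might look like the sketch below. It assumes a hypothetical `complete(prompt)` callable standing in for whatever completion backend is under test. It does not reflect any actual Mellum 2 or JetBrains API, and it measures latency only, not agentic reliability.

```python
import math
import statistics
import time


def measure_latency(complete, prompts, clock=time.perf_counter):
    """Time each call to `complete` and return latencies in milliseconds.

    `complete` is a hypothetical stand-in for the completion backend.
    """
    latencies = []
    for prompt in prompts:
        start = clock()
        complete(prompt)
        latencies.append((clock() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Median and nearest-rank 95th-percentile latency."""
    ordered = sorted(latencies_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {"p50_ms": statistics.median(ordered),
            "p95_ms": ordered[p95_index]}


if __name__ == "__main__":
    def fake_complete(prompt):
        # Placeholder workload; swap in a real client call when benchmarking.
        time.sleep(0.01)

    samples = measure_latency(fake_complete, ["def add(a, b):"] * 20)
    print(summarize(samples))
```

Reporting the 95th percentile alongside the median matters for inline completion, where occasional slow responses are more disruptive than a slightly higher average.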

Based on the Mellum 2 technical report published by JetBrains on arXiv on June 1, 2026.
