Cohere ships North Mini Code, an open-weight coder for sovereign deployments

The 30B MoE model runs on a single H100 with only 3B active parameters, targeting regulated environments that cannot use closed-lab APIs.

Alessandro Benigni

PUBLISHED JUN 11, 2026

Cohere released North Mini Code under an Apache 2.0 license on June 9, adding a coding-focused model to its North sovereign-AI family as closed-lab frontier capability rises. The model carries 30B total parameters but activates only 3B at inference time, a mixture-of-experts (MoE) design that cuts hardware requirements to a single H100 at FP8 precision.
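The single-H100 claim can be sanity-checked with rough arithmetic. The sketch below is illustrative and not from Cohere: it assumes FP8 stores one byte per weight, assumes the 80 GB H100 variant, and ignores runtime overhead. It also assumes every expert stays resident in GPU memory, which is typical for MoE serving, so the 30B total (not the 3B active) sets the weight footprint.

```python
# Back-of-the-envelope weight-memory estimate (illustrative assumptions only).
TOTAL_PARAMS = 30e9      # total parameters, from the article
ACTIVE_PARAMS = 3e9      # parameters active at inference, from the article
BYTES_PER_PARAM_FP8 = 1  # assumption: FP8 stores one byte per weight
H100_MEMORY_GB = 80      # assumption: the 80 GB H100 variant


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# All experts are assumed to stay resident, so total parameters set the footprint.
weights_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM_FP8)
headroom_gb = H100_MEMORY_GB - weights_gb
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"FP8 weights: {weights_gb:.0f} GB")
print(f"Headroom for KV cache and activations: {headroom_gb:.0f} GB")
print(f"Active fraction of weights: {active_fraction:.0%}")
```

Under these assumptions the weights take roughly 30 GB, leaving about 50 GB for the KV cache and activations, while only a tenth of the weights take part in any one forward pass. Real deployments will see less headroom once framework overhead is counted.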

The sovereign framing is deliberate. Cohere positions North Mini Code for enterprises and governments where data residency rules, supply-chain controls, or regulatory audit requirements rule out sending code to a closed API. A 30B parameter count satisfies audit requirements; a 3B-active inference footprint makes on-premise deployment practical.

Cohere’s internal benchmarks show 2.8x higher output throughput versus Devstral Small 2 at identical concurrency, with a 30% inter-token latency advantage. Devstral maintained a slight edge on time-to-first-token. These figures have not been independently verified.

Teams evaluating open-weight coding agents for regulated environments now have a directly comparable alternative to Mistral’s Devstral line, with Apache 2.0 removing the license friction that ruled out some open-weight options.

Cohere published this announcement on the Cohere blog (cohere.com/blog) on 2026-06-09.
