Microsoft puts token cost on model release cards

MAI-Code-1-Flash ships with average token usage per task, making efficiency a first-class benchmark alongside raw capability.

Alessandro Benigni

PUBLISHED JUN 5, 2026

1 MIN READ

Follow on Google

-1119 MIN AGO

Microsoft published the release card for MAI-Code-1-Flash with a metric no major lab had attached before: average token usage per task. The model hits 71.6 on SWE-Bench Verified while consuming roughly a third of the tokens Claude Haiku 4.5 burns on the same benchmark.

Venture capitalist Tom Tunguz, writing on tomtunguz.com, called it the opening signal of a procurement shift. Buyers can now compare models on two axes at once: raw score and the cost to achieve it. A 95-percent result that burns three times the tokens is no longer self-evidently better than a 92-percent result at baseline cost.

The framing lands against a backdrop of real corporate sticker shock. Uber capped employee AI spending after exhausting its budget in four months. Salesforce committed $300M on Anthropic tokens and froze engineering hires. Microsoft itself pulled Claude Code licenses across its Windows and Microsoft 365 divisions when usage outran budget.

Procurement teams now have a vendor-supplied number to enforce what finance already wants. Other providers will face pressure to match the format within this quarter or let Microsoft define the efficiency narrative alone.

Tom Tunguz on tomtunguz.com, 2026-06-03.

Microsoft puts token cost on model release cards

The morning brief for people inside the AI industry.

More in Wire

Ideogram releases open-weight image model built on JSON prompts

Morgan Stanley opens wealth platforms to AI agents via MCP

Tilde Research open-sources an attention variant built for forgetting