The cheapest line in the AI builder’s budget may be the most dangerous one. Writing on ea.rna.nl, analyst Gerben Wierda has made the most rigorous public attempt to quantify the gap between what frontier labs charge and what serious AI use actually costs them. His estimated subsidy ranges from 2.5x at the low end to roughly 10-12x for a heavy subscription user running agentic coding workflows flat out.
The framing device is a personal one. Wierda spent four months building a 40,000-line C++ application using Claude Code, tracking both his subscription costs and the API-equivalent spend on the same workload. His conclusion: the $100 Claude Max subscription absorbed token consumption that would have run him approximately $450 at API rates, a 4.5x subsidy on his actual usage pattern. At maximum weekly utilization, with no human in the loop and full agentic mode enabled, he estimates that ratio reaches 12x.
The reason is not complicated. Per-token pricing has fallen dramatically, but the token count per task has risen faster. A simple question in early 2023 ran roughly 200 tokens in, 400 out. A complex multi-file coding task on a 35,000-plus line codebase today, run at high effort with extended thinking, consumes an estimated 5-7 million cumulative input tokens across tool calls, context resends, and invisible recursive computation. At Opus 4.6 API rates of $5 per million input tokens and $25 per million output tokens, and with thinking tokens billed at output rates, a single resolved coding task lands in the $55-75 range. A software engineer doing five of those in a working day generates $275-375 in API cost. A single day at that pace costs roughly three to four times the $100 monthly subscription fee.
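The arithmetic behind those figures can be sketched in a few lines. This is a rough illustration, not Wierda’s own calculation: the rates and token ranges come from the paragraph above, while the function names and the output-token counts are assumptions. The article gives no output-token figure, so the sketch backs one out from the stated $55-75 total, and it ignores any prompt-caching discounts.

```python
# Rates quoted in the article for Opus 4.6, in USD per million tokens.
INPUT_RATE_PER_M = 5.0
OUTPUT_RATE_PER_M = 25.0  # thinking tokens are billed at this rate too


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """API cost in USD for one task, with no caching discount applied."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000


def implied_output_tokens(total_cost: float, input_tokens: int) -> float:
    """Output tokens needed to reach total_cost given the input volume.

    This is an inference from the article's totals, not a reported number.
    """
    input_cost = task_cost(input_tokens, 0)
    return (total_cost - input_cost) * 1_000_000 / OUTPUT_RATE_PER_M


def daily_cost(cost_per_task: float, tasks_per_day: int) -> float:
    """Cost of a working day at a fixed per-task cost."""
    return cost_per_task * tasks_per_day


if __name__ == "__main__":
    for total, inputs in ((55.0, 5_000_000), (75.0, 7_000_000)):
        print(f"input share of ${total:.0f} task: ${task_cost(inputs, 0):.2f}")
        print(f"implied output tokens: {implied_output_tokens(total, inputs):,.0f}")
        print(f"five tasks per day: ${daily_cost(total, 5):.2f}")
```

On these assumptions, input tokens account for $25-35 of each task, and the remainder would correspond to roughly 1.2-1.6 million output and thinking tokens.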
Wierda’s methodology is transparent about its limits. He relied on Claude itself to supply historical token-per-task estimates across model generations, cross-checked against his own session logs, and he flags the places where those estimates required correction. The underlying economics are not contested; what is genuinely uncertain is exactly where the subsidy line sits for any given workload. His headline worst-case ratio of more than 10-to-1 applies to a user maximizing agentic throughput. Most Claude Max subscribers are nowhere near their weekly limits, which is precisely how flat-rate billing absorbs the cost asymmetry.
The reason this matters beyond one person’s coding experiment is structural. The subsidy is funded by venture capital. Anthropic closed a $65 billion Series H and is navigating toward an IPO. OpenAI is on a similar path. The argument Wierda makes, and that three other independent analytical threads are converging on, is that current pricing is a pre-IPO posture. Bain’s recent ROI survey found that enterprise buyers are capturing less value than they expected from AI investments; Bloomberg’s trillion-dollar IPO coverage raised the institutional pricing question explicitly. Wierda’s analysis closes the loop from the supply side: the financial reality of inference does not match the sticker price, and that divergence cannot persist once public-market investors require a path to positive unit economics.
There is a partial counterargument. Model efficiency is improving. Wierda acknowledges the mid-2025 Opus price cut (from $15/$75 per million tokens to $5/$25) as a real event, not an accounting move. But he also documents that Anthropic’s subsequent Opus 4.7 and 4.8 releases appear to constrain the recursive brute-force budget that made Opus 4.6 effective at complex coding, which he reads as the lab managing subsidy exposure rather than delivering capability improvements. If that reading is correct, the cost reduction comes at the expense of quality rather than from a pure efficiency gain.
For builders, the practical signal is simple and uncomfortable: agentic workflows that are economically viable at subscription pricing may not survive a 5-10x pricing normalization. That is not a speculative scenario; it is the range Wierda’s own subsidy estimates imply if labs move to cost-recovery pricing before or shortly after IPO. Production systems built today should degrade gracefully at materially higher per-task costs, which means caching aggressively, using budget models for fault-tolerant subtasks, and auditing which agent loops actually require frontier recursive models versus which ones are using brute force because it is currently cheap to do so.
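One way to make that audit concrete is a small pricing stress test. The sketch below is illustrative only: the `Subtask` fields, the dollar figures, and the value per run are hypothetical, and it assumes a price normalization would scale frontier and budget models by the same multiplier, which the article does not claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    name: str
    frontier_cost: float  # hypothetical cost today on a frontier model, USD
    budget_cost: float    # hypothetical cost today on a budget model, USD
    fault_tolerant: bool  # True if a weaker model's errors are cheap to catch


def plan_cost(subtasks, multiplier, route_to_budget):
    """Cost of one agent run at `multiplier` times today's prices.

    When route_to_budget is set, fault-tolerant subtasks use the budget model.
    """
    total = 0.0
    for task in subtasks:
        use_budget = route_to_budget and task.fault_tolerant
        base = task.budget_cost if use_budget else task.frontier_cost
        total += base * multiplier
    return total


def max_viable_multiplier(subtasks, value_per_run, route_to_budget):
    """Largest price multiplier at which one run still costs no more than
    the value it produces."""
    return value_per_run / plan_cost(subtasks, 1.0, route_to_budget)


if __name__ == "__main__":
    # Hypothetical workload: one step that needs a frontier model, two that do not.
    workload = [
        Subtask("multi-file refactor", 2.0, 0.5, fault_tolerant=False),
        Subtask("summarise test logs", 1.0, 0.25, fault_tolerant=True),
        Subtask("format and lint fixes", 1.0, 0.25, fault_tolerant=True),
    ]
    for multiplier in (1.0, 5.0, 10.0):
        frontier_only = plan_cost(workload, multiplier, route_to_budget=False)
        routed = plan_cost(workload, multiplier, route_to_budget=True)
        print(f"{multiplier:>4.0f}x  frontier-only ${frontier_only:.2f}  routed ${routed:.2f}")
    print(max_viable_multiplier(workload, 30.0, route_to_budget=False))
    print(max_viable_multiplier(workload, 30.0, route_to_budget=True))
```

With these made-up numbers, the frontier-only plan stops paying for itself partway through the 5-10x range, while the routed plan survives it. The point of the exercise is the comparison, not the specific figures.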
Gerben Wierda on ea.rna.nl, 2026-06-07.