Across a broad sample of software organizations that have adopted AI coding tools, the median pull request throughput gain is 8 percent. That number comes from DX, the developer-experience research firm, which presented its longitudinal study at DX Annual on June 8. It is the most direct empirical challenge yet to the headline figures circulating from AI vendors and AI-native organizations.

The 8 percent is not a failure. DX co-founder and CEO Abi Noda, speaking alongside Brian Houck of Microsoft, was explicit: even single-digit throughput gains compound meaningfully across teams of hundreds or thousands of engineers. The point is not that AI tools are useless. The point is that the upper-range claims measure something different.

Anthropic cited an 8x improvement in engineer output last week. That figure came from inside an AI-native organization where the surrounding workflow had been redesigned alongside the tools. DX is measuring typical adoption: AI coding assistants dropped into existing review, planning, testing, and coordination workflows. Those two data points are not in conflict. They describe two different deployment contexts. Most organizations live in the second one.

The structural reason for the gap is time allocation. Developers spend roughly 14 percent of their working hours writing code, according to DX’s data. AI tools that primarily accelerate code generation are therefore improving a 14-percent slice of the total work surface. Planning, reviews, testing, documentation, and team coordination consume the remaining 86 percent. Making the minority task faster produces a minority result.
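The ceiling implied by that 14-percent slice can be sketched with Amdahl's-law arithmetic. This is an illustration, not DX's methodology: it assumes throughput is inversely proportional to total working time and that only the coding slice gets faster. The helper names and the speedup factors are hypothetical.

```python
def throughput_gain(coding_share: float, coding_speedup: float) -> float:
    """Overall throughput gain when only the coding slice gets faster.

    Assumption (not from DX): throughput is inversely proportional to total
    working time, and all non-coding work takes exactly as long as before.
    """
    if not 0.0 <= coding_share < 1.0:
        raise ValueError("coding_share must be in [0, 1)")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # New total time as a fraction of the old total time.
    new_time = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / new_time - 1.0


def throughput_ceiling(coding_share: float) -> float:
    """Upper bound on the gain as coding time shrinks toward zero."""
    if not 0.0 <= coding_share < 1.0:
        raise ValueError("coding_share must be in [0, 1)")
    return 1.0 / (1.0 - coding_share) - 1.0


if __name__ == "__main__":
    share = 0.14  # share of working hours spent writing code, per the DX figure
    for speedup in (1.5, 2.0, 5.0):  # illustrative factors, not measured values
        gain = throughput_gain(share, speedup)
        print(f"{speedup}x faster coding -> {gain:.1%} overall gain")
    print(f"ceiling with instant coding -> {throughput_ceiling(share):.1%}")
```

Under these assumptions, doubling coding speed yields an overall gain of about 7.5 percent, and even instantaneous code generation caps out near 16 percent unless the other 86 percent of the work also changes.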

Organizations that reach the 10 to 15 percent range share a common trait: they redesigned their review and coordination workflows alongside their AI adoption. Companies running the same tools inside unchanged legacy processes cluster at 5 to 8 percent. Adoption depth also matters more than tool selection. Engineers who use AI for most tasks outperform engineers who use it occasionally, regardless of which assistant they chose.

DX also names a failure mode worth tracking: false velocity. Pull request counts can rise without proportional roadmap delivery. Teams can ship more code faster while adding technical debt, increasing review pressure, and building weaker mental models of the systems they maintain. Noda described this as “cognitive debt,” a condition where output accelerates and comprehension lags. The full cost shows up months later in debugging cycles and maintenance burden, not in the throughput dashboards teams are currently optimizing.

The subsidy angle matters here. If the typical productivity gain from AI coding tools is 8 percent and those tools are priced at a fraction of their true cost because vendors are burning cash to build market share, the ROI math changes sharply once pricing normalizes. An 8 percent throughput gain that cost five dollars per engineer per month looks different when it costs fifty. Engineering leaders building business cases around vendor productivity claims should pressure-test those claims against DX’s population-level numbers, not the AI-native outliers.
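A rough way to frame that comparison is cost per percentage point of throughput. The sketch below uses the article's five-dollar and fifty-dollar price points and the 8 percent median. The function names and the 1,000-engineer team size are hypothetical, and the calculation deliberately leaves out what a point of throughput is worth to a given business.

```python
def cost_per_point(monthly_price: float, gain_percent: float) -> float:
    """Monthly tool cost per engineer for each percentage point of gain."""
    if gain_percent <= 0:
        raise ValueError("gain_percent must be positive")
    return monthly_price / gain_percent


def annual_spend(monthly_price: float, engineers: int) -> float:
    """Total yearly tool spend for a team of the given size."""
    return monthly_price * 12 * engineers


if __name__ == "__main__":
    gain = 8.0        # median throughput gain in percent, per DX
    team_size = 1000  # hypothetical team size, for illustration only
    for price in (5.0, 50.0):  # subsidized vs. normalized price points from the text
        per_point = cost_per_point(price, gain)
        yearly = annual_spend(price, team_size)
        print(f"${price:.0f}/month: ${per_point:.3f} per point, ${yearly:,.0f} per year")
```

The gain stays fixed while the price moves, so a tenfold price increase is a tenfold increase in cost per point. Whether that still clears the bar depends on inputs this sketch does not model, such as loaded engineer cost and how much of the throughput gain reaches the roadmap.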

The honest read: adopting AI coding tools is worth doing. Getting to 10 to 15 percent requires redesigning the workflows that surround the code, not just speeding up the writing of it. Getting to the range Anthropic measured requires a different organizational structure entirely. Know which category your team is actually in before committing budget to headcount offsets.

Teams currently benchmarking AI tool ROI should run DX’s framing against their own cycle-time data: if your review and coordination latency is unchanged, your throughput ceiling is already visible in the 8 percent median.

DX newsletter (getdx.com), published June 8, 2026.