OpenAI and Anthropic have quietly made opposite choices about how their agents manage context at scale, and those choices now have direct consequences for builders deciding which platform to build on.

The distinction was laid out in detail late last week in an essay on calv.info. Its framing is intuitive: OpenAI operates like an Oracle, Anthropic like a Firm.

Compaction: the Oracle model. Since Codex shipped native server-side compaction into the Responses API earlier this year, OpenAI’s agents run a single long thread. When the token count approaches the context ceiling, a background process compresses the conversation, retaining only what it judges relevant, and inference continues. The context window stays coherent, and small details that matter to the overall trajectory are more likely to survive because they live in a single chain of reasoning. The tradeoff is control: because OpenAI handles compaction server-side, the implementation can change without notice to clients, and one model’s judgment about what to keep is the entire information filter.
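The compaction loop described above can be sketched in a few lines of client-side Python. This is an illustration of the pattern only, not OpenAI's implementation, which runs server-side and is not public. The names `Thread`, `count_tokens`, and `naive_summary` are hypothetical, the tokenizer is a crude word count, and the summarizer is a placeholder for what would be a model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class Thread:
    max_tokens: int                          # hypothetical context ceiling
    summarize: Callable[[List[str]], str]    # hypothetical compression step
    keep_recent: int = 2                     # turns kept verbatim after compaction
    turns: List[str] = field(default_factory=list)

    def total_tokens(self) -> int:
        return sum(count_tokens(t) for t in self.turns)

    def append(self, turn: str) -> None:
        self.turns.append(turn)
        # Compact only when over the ceiling and there is something old to fold away.
        if self.total_tokens() > self.max_tokens and len(self.turns) > self.keep_recent:
            self.compact()

    def compact(self) -> None:
        cut = len(self.turns) - self.keep_recent
        old, recent = self.turns[:cut], self.turns[cut:]
        # One summarizer decides what survives: this is the single information filter.
        self.turns = [self.summarize(old)] + recent


def naive_summary(turns: List[str]) -> str:
    # Placeholder summarizer: keeps only the first three words of each turn.
    return "SUMMARY: " + " | ".join(" ".join(t.split()[:3]) for t in turns)


thread = Thread(max_tokens=20, summarize=naive_summary)
for i in range(5):
    thread.append(f"step{i} alpha beta gamma delta epsilon")
```

After the fifth turn the thread holds one summary plus the two most recent turns verbatim. Everything the summarizer dropped is gone for good, which is the control tradeoff in miniature.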

Sub-agent fan-out: the Firm model. Anthropic’s approach, visible in Claude Code at least since Opus 4.1, is to spawn sub-agents for discrete sub-problems. Each sub-agent works within its own context window, completes a task, and passes back a summary to the parent. No single agent sees the full picture. The architecture resembles how human organizations actually operate: individuals with limited information, communicating through language, each solving a piece of the whole.
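A minimal sketch of the fan-out shape, assuming each sub-agent can be modeled as a function that takes a task and returns a summary. It is not how Claude Code is implemented. `SubResult`, `fan_out`, and `fake_subagent` are hypothetical names, and the fake sub-agent stands in for a model call with its own private context.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SubResult:
    task: str
    summary: str       # the only thing the parent ever sees
    tokens_used: int   # spent inside the sub-agent's private context


def fan_out(
    tasks: List[str],
    run_subagent: Callable[[str], SubResult],
    max_workers: int = 4,
) -> List[SubResult]:
    # Each task runs in isolation; Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_subagent, tasks))


def fake_subagent(task: str) -> SubResult:
    # Stand-in for a model call: builds a private context, returns one line of it.
    private_context = [f"note {i} about {task}" for i in range(50)]
    summary = f"{task}: {len(private_context)} notes reviewed"
    tokens = sum(len(note.split()) for note in private_context)
    return SubResult(task, summary, tokens)


results = fan_out(["auth", "billing"], fake_subagent)
parent_view = [r.summary for r in results]
```

The parent's view is two short strings, while each sub-agent worked through fifty notes the parent never sees. That asymmetry is the point of the pattern and also the source of its risk.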

That parallelism carries three practical costs, and the calv.info essay names all of them clearly. First, token spend is higher because sub-agents frequently duplicate work, searching the same files without knowing what their siblings already found. Second, the speedup is partly cosmetic: throughput looks faster because tokens are produced in parallel rather than serially, but total token consumption is no lower. Third, and most relevant to builders, the Firm architecture adds a forgetting risk at every hand-off. If a sub-agent decides a fact is not worth including in its summary, that fact disappears. The parent never had it. The user may later get a confidently wrong answer from a system that genuinely did the research but summarized it away.

Compaction introduces the same risk at the compression step, but with a smaller blast radius. One model, one compression decision, one thread. A sub-agent architecture multiplies that decision point by however many agents are running.
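A back-of-the-envelope model makes the multiplication concrete. It assumes, purely for illustration, that every summarization step independently drops a given critical fact with the same probability. Neither the independence nor the 5% figure comes from the essay, and `p_all_facts_survive` is a hypothetical helper.

```python
def p_all_facts_survive(p_drop: float, n_summaries: int) -> float:
    # Probability that none of n independent summarization steps drops its critical fact.
    # Assumes identical, independent drop probabilities, which real systems will not match exactly.
    if not 0.0 <= p_drop <= 1.0:
        raise ValueError("p_drop must be between 0 and 1")
    if n_summaries < 0:
        raise ValueError("n_summaries must be non-negative")
    return (1.0 - p_drop) ** n_summaries


one_compaction = p_all_facts_survive(0.05, 1)    # single thread, one compression step
eight_subagents = p_all_facts_survive(0.05, 8)   # eight sub-agent summaries
```

Under these assumptions a single compaction step keeps everything 95% of the time, while eight sub-agent summaries all come back intact only about 66% of the time.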

So when is each the right call?

Choose the Oracle pattern when coherence across a long task is the primary requirement. Complex, stateful workflows where early decisions constrain later ones, or tasks where context built in step three is still needed in step twenty, favor a single-thread compaction approach. The Oracle pays for coherence with server-side opacity and a narrower parallelism ceiling.

Choose the Firm pattern when the task decomposes into independent sub-problems with clean interfaces between them. Code review across a large repository, parallel data extraction from many sources, and other workflows whose sub-tasks are genuinely isolated map naturally to a fan-out architecture. The cost is coordination overhead and the risk that important details fall out of summaries. If your workload punishes that forgetting risk, the Firm architecture requires compensating controls: stricter summary schemas, post-hoc verification steps, or explicit context-passing between agents.
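One way to picture a stricter summary schema paired with a verification step is sketched below. The field names and rules are assumptions chosen for illustration, not a format either vendor defines. The idea is that a sub-agent must return structured findings with evidence, plus a short list of what it chose to leave out, and the parent rejects summaries that fail the checks.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    claim: str
    evidence: List[str]   # hypothetical: file paths or source identifiers backing the claim


@dataclass
class SubAgentSummary:
    task: str
    findings: List[Finding]
    open_questions: List[str] = field(default_factory=list)
    discarded: List[str] = field(default_factory=list)   # one-liners for facts judged irrelevant


def validate(summary: SubAgentSummary) -> List[str]:
    # Returns a list of problems; an empty list means the summary passes.
    problems: List[str] = []
    if not summary.task.strip():
        problems.append("missing task")
    if not summary.findings:
        problems.append("no findings reported")
    for i, finding in enumerate(summary.findings):
        if not finding.claim.strip():
            problems.append(f"finding {i} has an empty claim")
        if not finding.evidence:
            problems.append(f"finding {i} has no evidence")
    return problems


good = SubAgentSummary(
    task="audit auth module",
    findings=[Finding("tokens expire after 1h", ["auth/session.py"])],
    discarded=["legacy login path is unused"],
)
bad = SubAgentSummary(task="audit billing", findings=[Finding("rounding is correct", [])])
```

The `discarded` field is the explicit context-passing control: the parent cannot recover the detail, but it can see that something was dropped and ask for it again.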

The calv.info essay ends with an expectation that the two approaches will converge: Anthropic will improve compaction, OpenAI will build out multi-agent infrastructure. That is probably correct over a two-year window. Over the next quarter, teams shipping agentic products should audit whether their task structure is Oracle-shaped or Firm-shaped before committing to an SDK or platform.

Source: calv.info, “The Oracle and the Firm,” published June 13, 2026.