On June 4, Anthropic published defending-code-reference-harness to GitHub. The open-source reference implementation wires Claude into a seven-stage autonomous pipeline whose stages include recon, vulnerability discovery, crash verification, deduplication, exploitability reporting, and patch generation.

The release sits at the quiet end of a two-pronged security strategy. On the offensive side, Anthropic has kept Claude Mythos gated inside the Glasswing consortium, limiting access to vetted security researchers. On the defensive side, the harness ships as a public fork-and-customize starting point, available to anyone with Claude API access, whether direct or through Bedrock, Vertex, or Azure. The architecture is the same; the audience is different.

The pipeline ships pre-configured for C and C++ memory vulnerabilities using ASAN (AddressSanitizer, the memory error detector). Each find agent runs inside a gVisor sandbox with egress restricted to the Claude API, and a separate grader agent independently reproduces every crash before it reaches the report stage. Anthropic is explicit that the repo is a reference, not a maintained product, and is not accepting contributions. Teams that want to port it to another language or vulnerability class use the included /customize skill to answer three questions for their stack: what counts as a finding, what a proof of concept looks like, and how the target gets built and run.
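To make the verification step concrete, here is a minimal Python sketch of what "independently reproducing a crash" can look like for an ASan-instrumented target. It is an illustration under stated assumptions, not code from the repo: the function names crash_signature and reproduces, the three-run default, and the top-three-frames signature are all hypothetical choices. The only thing taken as given is the standard shape of an AddressSanitizer report on stderr.

```python
import re
import subprocess
import sys

# Header line ASan writes to stderr, for example:
#   ==4242==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x...
ASAN_HEADER = re.compile(r"ERROR: AddressSanitizer: ([\w-]+)")
# Stack frame line, for example:
#   "    #0 0x4f3b2a in parse_header /src/parse.c:41:9"
ASAN_FRAME = re.compile(r"^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)", re.MULTILINE)


def crash_signature(stderr_text, depth=3):
    """Return (bug_type, top frames) for an ASan report, or None if absent.

    Hypothetical helper: the signature scheme is an assumption made for
    this sketch, not the harness's deduplication logic.
    """
    header = ASAN_HEADER.search(stderr_text)
    if header is None:
        return None
    # The first frames in the report belong to the faulting access.
    frames = tuple(ASAN_FRAME.findall(stderr_text)[:depth])
    return (header.group(1), frames)


def reproduces(command, runs=3, timeout=30):
    """Re-run a proof-of-concept command several times.

    Returns the crash signature only if every run produced the same one;
    returns None if any run timed out, disagreed, or did not crash.
    """
    signatures = set()
    for _ in range(runs):
        try:
            result = subprocess.run(
                command, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            return None
        signatures.add(crash_signature(result.stderr))
    if len(signatures) == 1:
        return signatures.pop()  # still None when no run crashed
    return None


if __name__ == "__main__":
    # Hypothetical usage: python grade.py ./target_asan poc_input.bin
    if len(sys.argv) > 1:
        print(reproduces(sys.argv[1:]))
```

Requiring an identical signature across several runs filters out flaky crashes, and the same signature could double as a deduplication key. A real grader would also need to run inside the sandbox, since the command executes target code.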

The business structure is visible in the repo itself. A callout block near the top points teams toward Claude Security, Anthropic’s hosted product that runs the same architecture across multiple projects with managed triage, fix validation, and lifecycle tracking. The open-source harness is the feeder. Claude Security is the upgrade.

This pattern responds directly to the kind of testing described in yesterday’s edition covering Kasra’s LLM guardrail probing. Rather than waiting for researchers to find workarounds, Anthropic is handing security teams a fully sanctioned defensive workflow with first-class Claude integration, sandboxing documentation, and an explicit ramp-up schedule designed to get a team from first threat model to autonomous scanning in two weeks.

Anthropic acknowledges the limits. The repo notes that autonomous triage and patching “are still open issues” and that the harness does not fully solve them. Severity prioritization requires human judgment about the specific environment, and verified patches are not always upstreamable without additional engineering work. Anthropic recommends budgeting real engineering time for those two steps, based on bottlenecks reported by early partner teams.

The release announcement does not include independent benchmark data on detection rates or false-positive ratios from the Glasswing partner deployments.

Security engineering teams already using Claude Code will find the integration path short: clone the repo, run /quickstart, and the first threat model and static scan can complete on day one without a sandbox. The autonomous pipeline, which executes target code, requires a one-time gVisor setup before it will start.

Teams evaluating AI-assisted security tooling in the next quarter should run the harness against an internal C/C++ library before pricing Claude Security; the open version surfaces the real integration cost before any contract conversation.

Anthropic on GitHub (github.com/anthropics), 2026-06-04.