Deep research agents leak private data through their web queries

ServiceNow's MosaicLeaks benchmark shows that simply telling an agent to stay quiet does almost nothing, but training it with privacy-aware rewards cuts leakage from 34% to 9.9%.

Alessandro Benigni

PUBLISHED JUN 20, 2026

Your research agent is not just searching the web. It is broadcasting what it knows about your organization’s internal documents, one query at a time.

ServiceNow researchers Alexander Gurung and Rafael Pardinas published a study on June 19 via Hugging Face that quantifies this risk with a new benchmark called MosaicLeaks. The core finding: when a deep research agent mixes private enterprise documents with open-web retrieval, its outbound query log alone gives an outside observer enough material to reconstruct confidential facts. No access to the internal documents required.

The mechanism is structural, not incidental. A multi-hop research task forces the agent to ground each web query in what it found locally. A search referencing “MediConn,” “70%,” and “January” looks innocuous in isolation. Assembled across a dozen queries, it pins down the exact content of a private cloud-migration report. The researchers call this the mosaic effect.
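The accumulation can be illustrated with a toy query log. This is a minimal sketch, not the benchmark's method: the query strings and the `SENSITIVE_TERMS` watchlist are hypothetical, only the three quoted terms come from the example above, and spreading them across separate queries is a simplification made for illustration.

```python
# Hypothetical outbound query log (invented for illustration).
query_log = [
    "MediConn cloud provider announcements",
    "healthcare cloud migration 70% workloads benchmark",
    "cloud migration deadlines January",
]

# Assumed watchlist of details that only the private report contains.
SENSITIVE_TERMS = {"MediConn", "70%", "January"}


def revealed_terms(queries):
    """Return the sensitive terms an observer can read off the given queries."""
    seen = set()
    for query in queries:
        seen |= SENSITIVE_TERMS & set(query.split())
    return seen


# Each query on its own exposes a single detail...
per_query = [revealed_terms([q]) for q in query_log]
# ...but the log as a whole exposes all of them together.
cumulative = revealed_terms(query_log)

print(per_query)
print(sorted(cumulative))
```

A per-query filter would pass every line in this log, which is why the exposure has to be measured over the cumulative history.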

ServiceNow built MosaicLeaks around 1,001 question chains that deliberately interleave private and public sub-questions. Each chain is constructed so the answer to a local hop becomes the seed phrase for the next web hop. The benchmark grades leakage at three severity levels: intent leakage (an observer can infer what the agent is researching), answer leakage (an observer can answer private questions from the query log alone), and full-information leakage (an observer can state verifiable private facts without being told what to look for).

For the base Qwen3-4B model, the baseline rate of answer or full-information leakage was 34.0%.
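One way to read the headline metric is as the share of chains graded at answer level or above. The sketch below assumes that reading: the `Leak` enum, its `NONE` level, the ordinal ordering of the levels, and the sample grades are all hypothetical and not taken from the benchmark's code.

```python
from enum import IntEnum


class Leak(IntEnum):
    """Assumed ordinal scale for the three severity levels, plus a no-leak level."""
    NONE = 0
    INTENT = 1            # observer can infer what the agent is researching
    ANSWER = 2            # observer can answer private questions from the log
    FULL_INFORMATION = 3  # observer can state private facts unprompted


def serious_leak_rate(grades):
    """Fraction of chains graded at answer level or above."""
    if not grades:
        return 0.0
    return sum(g >= Leak.ANSWER for g in grades) / len(grades)


# Made-up grades for five chains; two of them count toward the rate.
grades = [Leak.NONE, Leak.INTENT, Leak.ANSWER, Leak.FULL_INFORMATION, Leak.NONE]
rate = serious_leak_rate(grades)
print(f"{rate:.1%}")
```

Under this reading, intent leakage alone does not count toward the reported rate.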

The intuitive fix does not work. Adding a privacy instruction to the agent’s planning prompt, telling it not to embed local details in web queries, lowered leakage slightly for some models but produced inconsistent results and frequently hurt task performance. For Qwen3-4B, the prompt brought leakage from 34.0% down to 25.5%. It also dropped research accuracy from 48.7% to 44.5%. The reason: the model responded mainly by issuing fewer queries overall, not by constructing safer ones.

Standard performance training made things worse. Training the agent purely to complete more research chains correctly raised accuracy to 59.3% but pushed leakage to 51.7%. A model that is better at retrieving documents learns to pack more context into its queries. Richer queries are better for retrieval and worse for privacy. Under a task-only reward, the two objectives pull in opposite directions.

ServiceNow’s solution is PA-DR (Privacy-Aware Deep Research), a training method with two components. The first assigns reward for query construction at the individual planning step rather than at the trajectory level. The second uses a Qwen3-4B classifier to estimate two risks on every planned web query: whether the query itself leaks private data, and whether it creates a new mosaic leak when added to the cumulative query log. The training signal penalizes whichever risk is larger, so the privacy cost lands on the decision that caused the exposure.
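The "penalize the larger risk" rule can be sketched as a per-step reward. This is an illustration under stated assumptions, not ServiceNow's implementation: the function name, the additive form, the `penalty_weight` parameter, and the [0, 1] risk range are all hypothetical. Only the use of the larger of the two risks, applied at the planning step, comes from the description above.

```python
def step_reward(task_reward, query_risk, mosaic_risk, penalty_weight=1.0):
    """Reward for one planning step (hypothetical form).

    query_risk:  classifier estimate that this query alone leaks private data
    mosaic_risk: classifier estimate that it creates a new leak when added
                 to the cumulative query log
    """
    for risk in (query_risk, mosaic_risk):
        if not 0.0 <= risk <= 1.0:
            raise ValueError("risk estimates are assumed to lie in [0, 1]")
    # Penalize whichever risk is larger, at the step that caused it.
    return task_reward - penalty_weight * max(query_risk, mosaic_risk)


# A query that looks safe alone (0.25) but completes a mosaic (0.75)
# is penalized for the mosaic risk.
print(step_reward(1.0, query_risk=0.25, mosaic_risk=0.75))
```

Taking the maximum rather than the sum means a query is not penalized twice when both estimates flag the same exposure.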

The results from the published Hugging Face post:

Base Qwen3-4B: 48.7% chain accuracy, 34.0% leakage
Task-only RL: 59.3% accuracy, 51.7% leakage
Task plus PA-DR: 58.7% accuracy, 9.9% leakage
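The differences between the three rows can be worked out directly. The short script below uses only the figures listed above; the dictionary keys and the `delta` helper are names chosen for illustration.

```python
# (chain accuracy %, leakage %) as reported in the list above.
results = {
    "base": (48.7, 34.0),
    "task_only_rl": (59.3, 51.7),
    "task_plus_pa_dr": (58.7, 9.9),
}


def delta(from_key, to_key):
    """Change in (accuracy, leakage), in percentage points, rounded to 0.1."""
    acc_from, leak_from = results[from_key]
    acc_to, leak_to = results[to_key]
    return round(acc_to - acc_from, 1), round(leak_to - leak_from, 1)


# PA-DR versus task-only RL: a small accuracy cost, a large leakage drop.
print(delta("task_only_rl", "task_plus_pa_dr"))
# PA-DR versus the base model: better on both axes.
print(delta("base", "task_plus_pa_dr"))
```

Relative to task-only RL, PA-DR gives up 0.6 points of accuracy and removes 41.8 points of leakage; relative to the base model it gains 10.0 points of accuracy and removes 24.1 points of leakage.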

The 9.9% figure is not just an improvement over task-only training. It is lower than the untrained baseline. The agent under PA-DR actually issues more queries than the base model, but it strips the revealing details: specific metrics, named entities, and answer-shape clues that let an observer reconstruct what the agent found privately.

The benchmark uses synthetic enterprise documents and a controlled web corpus, so these numbers describe a laboratory setting, not a production deployment. The authors are explicit on this point. Real deployments with open-ended tasks and heterogeneous document types would need separate evaluation.

But the diagnostic finding holds regardless of setting. Privacy cannot be prompted in. Telling an agent to be careful about what it searches is behaviorally incoherent: the agent has no native way to reason about the cumulative information content of its query history, so it defaults to searching less. Training with a reward that penalizes cumulative exposure teaches the agent to reason about what it reveals across the session, not just in the current step.

Any team running research agents over a combination of proprietary data and public web retrieval is operating with this exposure today. The MosaicLeaks benchmark and PA-DR training approach give that team a concrete measurement framework and a mitigation path that, in the reported results, costs 0.6 points of chain accuracy relative to task-only training (58.7% versus 59.3%).

ServiceNow researchers published MosaicLeaks on Hugging Face on June 19, 2026, with a preprint available at arXiv .30727.
