CR · CV · Apr 7

BodhiPromptShield: Pre-Inference Prompt Mediation for Suppressing Privacy Propagation in LLM/VLM Agents

arXiv:2604.05793 · 48.8 · Has Code

Predicted impact: top 41% in CR · last 90 days · Originality: Incremental advance

AI Analysis

This work addresses privacy leakage in multi-stage AI systems used in enterprise or other sensitive applications, but it is incremental because it builds on existing de-identification methods.

The paper tackles the propagation of privacy risk across multiple stages in LLM/VLM agents by proposing BodhiPromptShield, a framework that detects sensitive spans and routes them through protective substitutions, reducing stage-wise propagation from 10.7% to 7.1% on a controlled benchmark.

In LLM/VLM agents, prompt privacy risk propagates beyond a single model call because raw user content can flow into retrieval queries, memory writes, tool calls, and logs. Existing de-identification pipelines address document boundaries but not this cross-stage propagation. We propose BodhiPromptShield, a policy-aware framework that detects sensitive spans, routes them via typed placeholders, semantic abstraction, or secure symbolic mapping, and delays restoration to authorized boundaries. Relative to enterprise redaction, this adds explicit propagation-aware mediation and restoration timing as a security variable. Under controlled evaluation on the Controlled Prompt-Privacy Benchmark (CPPB), stage-wise propagation is suppressed from 10.7% to 7.1% across retrieval, memory, and tool stages; PER reaches 9.3% with 0.94 AC and 0.92 TSR, outperforming generic de-identification. These are controlled systems results on CPPB rather than formal privacy guarantees or public-benchmark transfer claims. The project repository is available at https://github.com/mabo1215/BodhiPromptShield.git.
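The detect, route, and delayed-restore flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the regex detectors, the `Mediator` class, the `[KIND_n]` placeholder format, and the boolean `authorized` flag are all hypothetical stand-ins chosen for illustration, and only the typed-placeholder route is shown (semantic abstraction and secure symbolic mapping are omitted).

```python
import re
from dataclasses import dataclass, field

# Hypothetical detectors: simple regexes stand in for whatever span
# detector the framework actually uses (an assumption, not from the paper).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


@dataclass
class Mediator:
    """Replaces sensitive spans with typed placeholders before inference."""

    vault: dict = field(default_factory=dict)     # placeholder -> original value
    counters: dict = field(default_factory=dict)  # span type -> last index used

    def mediate(self, text: str) -> str:
        """Return text that is safe to pass to retrieval, memory, or tools."""
        for kind, pattern in PATTERNS.items():

            def swap(match, kind=kind):
                value = match.group(0)
                # Reuse the placeholder if this value was already seen.
                for placeholder, stored in self.vault.items():
                    if stored == value:
                        return placeholder
                index = self.counters.get(kind, 0) + 1
                self.counters[kind] = index
                placeholder = f"[{kind}_{index}]"
                self.vault[placeholder] = value
                return placeholder

            text = pattern.sub(swap, text)
        return text

    def restore(self, text: str, authorized: bool) -> str:
        """Put original values back only at an authorized boundary."""
        if not authorized:
            return text
        for placeholder, value in self.vault.items():
            text = text.replace(placeholder, value)
        return text


mediator = Mediator()
prompt = "Email alice@example.com or call +1 555-010-9999 about the invoice."
safe_prompt = mediator.mediate(prompt)   # what downstream stages would see
final_text = mediator.restore(safe_prompt, authorized=True)
```

In this sketch, every intermediate stage (retrieval query, memory write, tool call, log line) would receive `safe_prompt`, and `restore` would be called only where policy allows the raw values to reappear, which is the sense in which restoration timing acts as a security variable.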
