CR · LG · May 27

MRMMIA: Membership Inference Attacks on Memory in Chat Agents

arXiv:2605.27825 · 18.8 · h-index: 5
Predicted impact: top 18% in CR · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the overlooked privacy risk of membership inference against chat agent memory, which can contain sensitive user interactions and preferences.

The paper introduces MRMMIA, a membership inference attack targeting chat agent memory, and demonstrates that it consistently outperforms baselines across black-box, gray-box, and white-box settings, exposing privacy risks in agent memory systems.

Membership inference attacks (MIAs) test whether a target data record belongs to a system's private data, and have become a standard tool to measure privacy leakage in machine learning systems. Prior work has primarily focused on training corpora or retrieval databases. However, MIAs against agent memory have received less attention, even though such memory can contain sensitive user-agent interactions, retrieved facts, and user preferences. Therefore, in this work, we focus on chat agent memory MIAs, where an adversary infers whether a candidate memory unit belongs to the chat agent's memory store. We propose Multi-Recall Memory MIA (MRMMIA), a unified attack that utilizes multiple recall probes to the agent to extract the membership signal across black-box, gray-box, and white-box settings. Our experiments demonstrate that MRMMIA consistently outperforms baselines. Our results expose the privacy risk in agents and provide an initial evaluation framework for membership leakage in chat-agent memory systems.
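
To make the idea of multiple recall probes concrete, here is a minimal black-box sketch in Python. It is not the paper's MRMMIA algorithm, whose details the abstract does not give. It is a toy illustration under stated assumptions: the agent is a plain callable from prompt to reply, the adversary reveals only the first few words of the candidate memory as a hint, and the membership signal is how much of the withheld remainder the agent reproduces, averaged over several probe phrasings. The names `PROBE_TEMPLATES`, `membership_score`, `infer_membership`, and `make_toy_agent`, along with the threshold value, are all hypothetical.

```python
from typing import Callable, List, Set

# Hypothetical probe phrasings; a real attack would design these carefully.
PROBE_TEMPLATES = [
    "What do you remember about: {hint}?",
    "Earlier we discussed '{hint}'. Can you fill in the details?",
    "Complete this note from our past chats: {hint}",
]


def _tokens(text: str) -> Set[str]:
    """Lowercased word set with surrounding punctuation stripped."""
    return {t.strip(".,?!:'\"").lower() for t in text.split()} - {""}


def membership_score(agent: Callable[[str], str], candidate: str,
                     hint_len: int = 3) -> float:
    """Average fraction of withheld candidate tokens recalled across probes."""
    words = candidate.split()
    hint = " ".join(words[:hint_len])
    # Only tokens that were NOT revealed in the hint count as evidence.
    secret = _tokens(" ".join(words[hint_len:])) - _tokens(hint)
    if not secret:
        return 0.0
    per_probe = []
    for template in PROBE_TEMPLATES:
        response = agent(template.format(hint=hint))
        per_probe.append(len(secret & _tokens(response)) / len(secret))
    return sum(per_probe) / len(per_probe)


def infer_membership(agent: Callable[[str], str], candidate: str,
                     threshold: float = 0.5) -> bool:
    """Predict 'member' when the averaged recall score reaches the threshold."""
    return membership_score(agent, candidate) >= threshold


def make_toy_agent(memory: List[str]) -> Callable[[str], str]:
    """Stand-in agent (assumption): echoes the best-matching stored memory."""
    def agent(prompt: str) -> str:
        prompt_tokens = _tokens(prompt)
        best = max(memory, key=lambda m: len(_tokens(m) & prompt_tokens),
                   default="")
        if len(_tokens(best) & prompt_tokens) >= 3:
            return "I recall: " + best
        return "I do not have anything stored about that."
    return agent


if __name__ == "__main__":
    agent = make_toy_agent([
        "User is allergic to peanuts and shellfish",
        "User prefers window seats on long flights",
    ])
    for cand in ["User is allergic to peanuts and shellfish",
                 "User is diabetic and takes insulin daily"]:
        print(cand, "->", infer_membership(agent, cand))
```

Scoring only the withheld tokens keeps the agent from earning credit by echoing the probe itself, and averaging over several phrasings reduces dependence on any single prompt. Gray-box and white-box variants would presumably use richer signals than text overlap, which this sketch does not attempt to model.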
