SD AI Mar 30

Membership Inference Attacks against Large Audio Language Models

arXiv:2603.28378 70.31 citations h-index: 6

AI Analysis

This work addresses reliable security auditing of large audio language models, a concern for developers and users worried about privacy risks. The contribution is incremental: it builds on existing MIA methods and adds a new evaluation framework.

The authors evaluate membership inference attacks (MIAs) on large audio language models (LALMs) and identify that non-semantic information in audio causes severe train/test distribution shifts, which can lead to spurious MIA performance. They show that common speech datasets are almost perfectly separable into train and test sets (AUC ≈ 1.0) without any model inference, and that standard MIA scores correlate strongly with these blind acoustic artifacts (correlation > 0.7). On this basis they propose a principled standard for auditing LALMs.

We present the first systematic Membership Inference Attack (MIA) evaluation of Large Audio Language Models (LALMs). As audio encodes non-semantic information, it induces severe train and test distribution shifts and can lead to spurious MIA performance. Using a multi-modal blind baseline based on textual, spectral, and prosodic features, we demonstrate that common speech datasets exhibit near-perfect train/test separability (AUC approximately 1.0) even without model inference, and the standard MIA scores strongly correlate with these blind acoustic artifacts (correlation greater than 0.7). Using this blind baseline, we identify that distribution-matched datasets enable reliable MIA evaluation without distribution shift confounds. We benchmark multiple MIA methods and conduct modality disentanglement experiments on these datasets. The results reveal that LALM memorization is cross-modal, arising only from binding a speaker's vocal identity with its text. These findings establish a principled standard for auditing LALMs beyond spurious correlations.
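The blind-baseline check described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the three-dimensional feature vectors merely stand in for textual, spectral, and prosodic features, and the mean-difference scorer is a deliberately simple substitute for whatever classifier the paper uses. All names (`auc`, `pearson`, `fit_blind_direction`, `blind_score`, `SHIFT`) are hypothetical. The sketch computes how well a model-free score separates members from non-members, then how strongly a stand-in MIA score tracks that blind score.

```python
import random
from statistics import fmean


def auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member (ties count 0.5)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def fit_blind_direction(members, nonmembers):
    """Mean-difference direction: an assumed, very simple blind classifier."""
    dim = len(members[0])
    mu_m = [fmean(v[i] for v in members) for i in range(dim)]
    mu_n = [fmean(v[i] for v in nonmembers) for i in range(dim)]
    return [a - b for a, b in zip(mu_m, mu_n)]


def blind_score(direction, features):
    """Project a feature vector onto the fitted direction; no target model is queried."""
    return sum(d * f for d, f in zip(direction, features))


rng = random.Random(0)
SHIFT = 4.0  # assumed acoustic shift between train and test recordings


def sample(n, shift):
    # Feature 0 carries the distribution shift; features 1 and 2 are pure noise.
    return [[rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
            for _ in range(n)]


members, nonmembers = sample(200, SHIFT), sample(200, 0.0)

# Fit on one half, evaluate on the held-out half.
direction = fit_blind_direction(members[:100], nonmembers[:100])
held_m, held_n = members[100:], nonmembers[100:]
blind_m = [blind_score(direction, f) for f in held_m]
blind_n = [blind_score(direction, f) for f in held_n]
blind_auc = auc(blind_m, blind_n)

# Stand-in MIA score that leaks the acoustic artifact (an assumption, not a real attack).
mia = [0.8 * f[0] + rng.gauss(0.0, 1.0) for f in held_m + held_n]
corr = pearson(blind_m + blind_n, mia)

print(f"blind AUC: {blind_auc:.3f}")
print(f"correlation between MIA score and blind score: {corr:.3f}")
```

In this toy setting a high blind AUC together with a high correlation would indicate that the apparent MIA signal is explained by dataset artifacts. On a distribution-matched dataset, the same check would be expected to return a blind AUC close to 0.5.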
