IR AI Sep 16, 2025

Membership Inference Attack against Large Language Model-based Recommendation Systems: A New Distillation-based Paradigm

Li Cuihong, Huang Xiaowen, Yin Chuanhuan, Sang Jitao

arXiv:2511.14763v1 h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses security and privacy concerns for users of LLM-based recommendation systems by improving the ability to infer whether specific data was used in training, though it is an incremental advance in attack methods.

The paper tackles the challenge of performing membership inference attacks on large language model-based recommendation systems by proposing a new knowledge distillation-based paradigm, which improves attack performance compared to traditional shadow model-based methods.

Membership Inference Attack (MIA) aims to determine whether a data sample was used in the training dataset of a target model. Traditional MIA obtains features of the target model via shadow models and uses these features to train an attack model, but the scale and complexity of the training or fine-tuning data for large language model (LLM)-based recommendation systems make shadow models difficult to construct. Knowledge distillation, as a method for extracting knowledge, helps construct a stronger reference model. It also allows member and non-member data to be distilled separately, enhancing the model's ability to discriminate between the two in MIA. This paper proposes a knowledge distillation-based MIA paradigm to improve the performance of membership inference attacks on LLM-based recommendation systems. Our paradigm introduces knowledge distillation to obtain a reference model, which enhances the reference model's ability to distinguish between member and non-member data. We obtain individual features from the reference model and train our attack model with the fused features. Our paradigm improves attack performance compared to shadow model-based attacks.
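The abstract does not specify which features are extracted or how they are fused, so the following is only a minimal sketch of the final stage (fused features feeding an attack classifier) under stated assumptions. It assumes each sample yields a scalar loss from the target model and from two hypothetical reference models, one distilled on member data and one on non-member data, and that fusion is plain concatenation of the target loss with its gaps to each reference. The distillation step itself is not implemented; the losses are synthetic numbers drawn from made-up distributions, and every function name here is illustrative rather than taken from the paper.

```python
import math
import random


def fuse_features(target_loss, ref_member_loss, ref_nonmember_loss):
    """Assumed fusion: target loss plus its gap to each reference model's loss."""
    return [
        target_loss,
        target_loss - ref_member_loss,
        target_loss - ref_nonmember_loss,
    ]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_attack_model(features, labels, lr=0.5, epochs=300):
    """Logistic-regression attack model trained with full-batch gradient descent."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_member(weights, bias, x):
    """True if the attack model scores the sample as a training member."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5


def synthetic_samples(rng, n):
    """Synthetic stand-in for real model losses (assumption, not paper data)."""
    features, labels = [], []
    for i in range(n):
        is_member = i % 2
        base = 0.5 if is_member else 1.5       # members get lower target loss
        target = rng.gauss(base, 0.2)
        ref_m = rng.gauss(base + 0.1, 0.2)     # hypothetical member-distilled reference
        ref_n = rng.gauss(1.0, 0.2)            # hypothetical non-member-distilled reference
        features.append(fuse_features(target, ref_m, ref_n))
        labels.append(is_member)
    return features, labels


def run_demo(seed=0):
    """Train on one synthetic split and return accuracy on a held-out split."""
    rng = random.Random(seed)
    train_x, train_y = synthetic_samples(rng, 200)
    test_x, test_y = synthetic_samples(rng, 200)
    weights, bias = train_attack_model(train_x, train_y)
    correct = sum(
        predict_member(weights, bias, x) == bool(y)
        for x, y in zip(test_x, test_y)
    )
    return correct / len(test_y)
```

Calling `run_demo()` returns the held-out accuracy on the synthetic data. Because the member and non-member loss distributions are well separated by construction, that number says nothing about the attack performance reported in the paper.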
