CL, LG | Apr 30, 2024

Safe Training with Sensitive In-domain Data: Leveraging Data Fragmentation To Mitigate Linkage Attacks

arXiv:2404.19486v1 | 1.0 | h-index: 2

Originality: Incremental advance

AI Analysis

The work addresses privacy risks in domains such as healthcare by preventing model outputs from re-identifying individuals, though the advance is incremental because it builds on existing fine-tuning methods.

The paper tackles sensitive data exposure in text generation models by proposing that fragmented data, rather than full texts, be used for training in order to mitigate linkage attacks. It demonstrates that fine-tuning LLMs with fragmented data yields classification results comparable to fine-tuning with full data, specifically when predicting cardiovascular diagnoses.

Current text generation models are trained on real data that can contain sensitive information, such as confidential patient information. Under certain conditions, a model can be triggered to output training data it has memorised, exposing that sensitive data. To mitigate this risk, we propose a safer alternative in which fragmented data, in the form of domain-specific short phrases randomly grouped together, is shared instead of full texts. As a result, text fragments that could re-identify an individual cannot be reproduced by the model in one sequence, which gives significant protection against linkage attacks. We fine-tune several state-of-the-art LLMs using meaningful syntactic chunks to explore their utility. In particular, we fine-tune BERT-based models to predict two cardiovascular diagnoses. Our results demonstrate that LLMs can benefit from their pre-trained knowledge and, when fine-tuned with fragmented data, deliver classification results comparable to those obtained by fine-tuning with full training data.
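The fragmentation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `split_into_chunks` and `fragment_corpus`, the punctuation-based splitting (a crude stand-in for the syntactic chunking the abstract mentions), the `group_size` parameter, and the example notes are all assumptions made for illustration. How diagnosis labels are attached to the fragments is not described in the text above and is left out here.

```python
import random
import re


def split_into_chunks(text):
    """Split a text into short phrases.

    Assumption: splitting on punctuation is a naive placeholder for
    real syntactic chunking, which would need a parser or chunker.
    """
    parts = re.split(r"[.,;:!?\n]+", text)
    return [p.strip() for p in parts if p.strip()]


def fragment_corpus(texts, group_size=3, seed=0):
    """Pool the chunks of all texts, shuffle them, and regroup at random.

    Each returned string mixes phrases from arbitrary source texts, so
    no output sequence reproduces one original document in full.
    """
    rng = random.Random(seed)  # seeded for reproducible shuffling
    pool = [chunk for text in texts for chunk in split_into_chunks(text)]
    rng.shuffle(pool)
    return [
        " ; ".join(pool[i:i + group_size])
        for i in range(0, len(pool), group_size)
    ]


if __name__ == "__main__":
    # Made-up example notes, not real patient data.
    notes = [
        "62-year-old male, chest pain on exertion, history of hypertension.",
        "Patient reports palpitations; echocardiogram shows reduced ejection fraction.",
    ]
    for sample in fragment_corpus(notes, group_size=2, seed=42):
        print(sample)
```

Shuffling the pooled phrases before grouping is what separates details that originally appeared together in one record. A real pipeline would also need to decide which phrases are safe to share at all, which this sketch does not attempt.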
