CL | CR | Jul 1, 2024

Protecting Privacy in Classifiers by Token Manipulation

arXiv:2407.01334v2 | 27 citations | h-index: 17
AI Analysis

This paper addresses privacy concerns for users of remote language model services, though it appears incremental because it builds on existing text manipulation approaches.

The paper addresses how to protect private information sent to remote language model classifiers: it manipulates text tokens to prevent data exposure while maintaining classification accuracy. The authors find that contextualized manipulation performs better than simple token mapping functions, which degrade task performance and can be reconstructed by attackers.

Using language models as a remote service entails sending private information to an untrusted provider. In addition, potential eavesdroppers can intercept the messages, thereby exposing the information. In this work, we explore the prospects of avoiding such data exposure at the level of text manipulation. We focus on text classification models, examining various token mapping and contextualized manipulation functions in order to see whether classifier accuracy may be maintained while keeping the original text unrecoverable. We find that although some token mapping functions are easy and straightforward to implement, they heavily influence performance on the downstream task and can be reconstructed by a sophisticated attacker. In comparison, the contextualized manipulation provides an improvement in performance.
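As a rough illustration of what a simple token mapping function might look like, the sketch below applies a fixed secret permutation to token ids before they would be sent to a remote classifier. This is an assumption made for illustration only, not the mapping functions evaluated in the paper; the helper names (make_token_mapping, apply_mapping, invert_mapping), the vocabulary size, and the seed are all hypothetical.

```python
import random


def make_token_mapping(vocab_size: int, seed: int) -> list[int]:
    """Build a secret permutation of token ids (hypothetical helper).

    The seed plays the role of a client-side secret in this sketch.
    """
    rng = random.Random(seed)
    perm = list(range(vocab_size))
    rng.shuffle(perm)
    return perm


def apply_mapping(token_ids: list[int], perm: list[int]) -> list[int]:
    """Replace each token id with its image under the permutation."""
    return [perm[t] for t in token_ids]


def invert_mapping(perm: list[int]) -> list[int]:
    """Compute the inverse permutation, available only to the key holder."""
    inverse = [0] * len(perm)
    for src, dst in enumerate(perm):
        inverse[dst] = src
    return inverse


# Toy example: the ids stand in for tokenizer output (assumed values).
perm = make_token_mapping(vocab_size=1000, seed=42)
original = [5, 17, 5, 999, 17]
mapped = apply_mapping(original, perm)
restored = apply_mapping(mapped, invert_mapping(perm))
```

Because the same token always maps to the same id, repeated tokens remain visible in the mapped sequence. That is one reason a fixed mapping of this kind can be open to frequency-based reconstruction.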

