LG · Nov 1, 2024

Black-Box Forgetting

Yusuke Kuwana, Yuta Goto, Takashi Shibata, Go Irie

arXiv:2411.00409v16.43 citations · h-index: 3 · Has Code · NIPS

Originality: Incremental advance

AI Analysis

This work addresses the need for privacy and efficiency in practical applications by enabling selective forgetting in commercial or otherwise restricted black-box models. The advance is incremental: it adapts existing forgetting techniques to a new setting.

The paper tackles selective forgetting in large-scale pre-trained models, where existing methods require white-box access. It proposes a black-box approach that optimizes the input prompt to reduce accuracy on the specified classes while maintaining performance on the rest, and reports results superior to its baselines on four benchmark datasets.

Large-scale pre-trained models (PTMs) provide remarkable zero-shot classification capability covering a wide variety of object classes. However, practical applications do not always require the classification of all kinds of objects, and leaving the model capable of recognizing unnecessary classes not only degrades overall accuracy but also leads to operational disadvantages. To mitigate this issue, we explore the selective forgetting problem for PTMs, where the task is to make the model unable to recognize only the specified classes while maintaining accuracy for the rest. All the existing methods assume "white-box" settings, where model information such as architectures, parameters, and gradients is available for training. However, PTMs are often "black-box," where information on such models is unavailable for commercial reasons or social responsibilities. In this paper, we address a novel problem of selective forgetting for black-box models, named Black-Box Forgetting, and propose an approach to the problem. Given that information on the model is unavailable, we optimize the input prompt to decrease the accuracy of specified classes through derivative-free optimization. To avoid difficult high-dimensional optimization while ensuring high forgetting performance, we propose Latent Context Sharing, which introduces common low-dimensional latent components among multiple tokens for the prompt. Experiments on four standard benchmark datasets demonstrate the superiority of our method over reasonable baselines. The code is available at https://github.com/yusukekwn/Black-Box-Forgetting.
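The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "black-box" model below is a hypothetical stand-in built from random matrices, the loss is a simplified choice (cross-entropy on kept classes plus the mean true-class probability on forgotten classes), the latent parameterization is one plausible reading of "common low-dimensional latent components among multiple tokens", and a basic elitist evolution strategy is swapped in for whatever derivative-free optimizer the paper uses. All names and sizes are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (all assumptions, chosen only to keep the example small).
N_CLASSES, N_TOKENS, EMBED_DIM, FEAT_DIM = 4, 4, 32, 8
SHARED_DIM, UNIQUE_DIM = 3, 2
LATENT_DIM = SHARED_DIM + N_TOKENS * UNIQUE_DIM   # 11, versus 4 * 32 = 128
FORGET = np.array([0, 1])                         # classes to forget

# Hypothetical stand-in for a black-box model: callable, but no gradients.
_CLASS_PROTO = rng.normal(size=(N_CLASSES, FEAT_DIM))
_W_CTX = rng.normal(size=(N_TOKENS * EMBED_DIM, FEAT_DIM)) / np.sqrt(N_TOKENS * EMBED_DIM)

def black_box_probs(context, image_feats):
    """Class probabilities per sample, given prompt context tokens."""
    gate = 1.0 + np.tanh(context.reshape(-1) @ _W_CTX)   # prompt modulates text features
    text_feats = _CLASS_PROTO * gate                      # (N_CLASSES, FEAT_DIM)
    logits = image_feats @ text_feats.T
    logits = logits - logits.max(axis=1, keepdims=True)   # numerically stable softmax
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# Fixed random projections from low-dimensional latents to token embeddings.
_A_SHARED = rng.normal(size=(SHARED_DIM, EMBED_DIM))
_A_UNIQUE = rng.normal(size=(N_TOKENS, UNIQUE_DIM, EMBED_DIM))

def build_context(z):
    """Map a flat latent vector to (N_TOKENS, EMBED_DIM) context tokens.

    Every token receives the same shared component plus its own unique one.
    """
    shared = z[:SHARED_DIM]
    unique = z[SHARED_DIM:].reshape(N_TOKENS, UNIQUE_DIM)
    return shared @ _A_SHARED + np.einsum("tu,tue->te", unique, _A_UNIQUE)

def forgetting_loss(z, feats, labels):
    """Simplified objective: keep-class cross-entropy + forget-class confidence."""
    probs = black_box_probs(build_context(z), feats)
    p_true = probs[np.arange(len(labels)), labels]
    forget_mask = np.isin(labels, FORGET)
    keep_term = -np.log(p_true[~forget_mask] + 1e-12).mean()
    forget_term = p_true[forget_mask].mean()
    return keep_term + forget_term

def optimize(feats, labels, steps=200, pop=16, sigma=0.3):
    """Elitist (1+lambda) evolution strategy: uses only loss values, no gradients."""
    z = np.zeros(LATENT_DIM)
    initial = best = forgetting_loss(z, feats, labels)
    for _ in range(steps):
        candidates = z + sigma * rng.normal(size=(pop, LATENT_DIM))
        losses = np.array([forgetting_loss(c, feats, labels) for c in candidates])
        i = losses.argmin()
        if losses[i] < best:                  # accept only improvements
            z, best = candidates[i], losses[i]
    return z, initial, best

# Synthetic "image features": noisy copies of the class prototypes.
labels = np.repeat(np.arange(N_CLASSES), 20)
feats = _CLASS_PROTO[labels] + 0.3 * rng.normal(size=(len(labels), FEAT_DIM))

z_opt, initial_loss, final_loss = optimize(feats, labels)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

The search runs over 11 latent values rather than all 128 context entries, which is the motivation the abstract gives for sharing latent components across tokens. Because the update is elitist, the reported loss can only stay the same or decrease; how well forgetting actually works on a real model is not something this toy can show.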
