MemBench: Memorized Image Trigger Prompt Dataset for Diffusion Models
This work addresses the lack of benchmarks for assessing memorization mitigation in diffusion models, a gap that matters to developers and users concerned with copyright and privacy issues. The contribution is incremental in that it builds on prior studies.
Diffusion models can generate replicas of training images because of memorization, which raises copyright and privacy concerns. The authors address this problem by creating MemBench, the first benchmark for evaluating memorization mitigation methods, and find that existing methods are insufficient for practical application.
Diffusion models have achieved remarkable success in Text-to-Image generation tasks, leading to the development of many commercial models. However, recent studies have reported that diffusion models often replicate images from their training data when triggered by specific prompts, potentially raising social issues ranging from copyright to privacy concerns. To address this memorization, recent studies have developed memorization mitigation methods for diffusion models. Nevertheless, the lack of benchmarks impedes the assessment of the true effectiveness of these methods. In this work, we present MemBench, the first benchmark for evaluating image memorization mitigation methods. Our benchmark includes a large number of memorized image trigger prompts for various Text-to-Image diffusion models. Furthermore, in contrast to prior work that evaluates mitigation performance only on trigger prompts, we present metrics evaluated on both trigger prompts and general prompts, so that we can see whether mitigation methods address the memorization issue while maintaining performance on general prompts. This is an important development for practical applications, which previous works have overlooked. Through evaluation on MemBench, we verify that the performance of existing image memorization mitigation methods is still insufficient for application to diffusion models. The code and datasets are available at https://github.com/chunsanHong/MemBench_code.
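The two-sided evaluation described above (trigger prompts and general prompts) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the MemBench code: the names `evaluate_mitigation`, `MitigationReport`, `is_memorized`, and `quality` are hypothetical, and the toy generator at the end stands in for a real diffusion model. The abstract does not specify which metrics MemBench uses, so the two scoring callables are left as placeholders to be supplied by the user.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class MitigationReport:
    """Hypothetical summary of one mitigation method (not the MemBench format)."""
    memorized_rate: float  # fraction of trigger prompts still flagged as memorized
    general_score: float   # mean quality score on general prompts


def evaluate_mitigation(
    generate: Callable[[str], object],
    trigger_prompts: Sequence[str],
    general_prompts: Sequence[str],
    is_memorized: Callable[[str, object], bool],
    quality: Callable[[str, object], float],
) -> MitigationReport:
    """Score a generator on both prompt sets.

    `is_memorized` and `quality` are placeholder callables; the actual
    metrics used by the benchmark are not stated in the abstract.
    """
    if not trigger_prompts or not general_prompts:
        raise ValueError("both prompt sets must be non-empty")

    # Trigger prompts: how often does the output still replicate training data?
    flagged = sum(1 for p in trigger_prompts if is_memorized(p, generate(p)))

    # General prompts: does the mitigated model still perform well?
    scores = [quality(p, generate(p)) for p in general_prompts]

    return MitigationReport(
        memorized_rate=flagged / len(trigger_prompts),
        general_score=sum(scores) / len(scores),
    )


# Toy stand-ins (assumptions): strings play the role of generated images.
def toy_generate(prompt: str) -> str:
    # Pretend the model replicates training data for prompts starting with "trigger".
    prefix = "copy:" if prompt.startswith("trigger") else "novel:"
    return prefix + prompt


report = evaluate_mitigation(
    generate=toy_generate,
    trigger_prompts=["trigger a", "trigger b"],
    general_prompts=["a cat", "a dog"],
    is_memorized=lambda prompt, image: image.startswith("copy:"),
    quality=lambda prompt, image: 1.0 if image.startswith("novel:") else 0.0,
)
print(report)
```

Under this sketch, a mitigation method would be judged useful only if it lowers `memorized_rate` without a large drop in `general_score`, which is the trade-off the abstract says earlier evaluations did not measure.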