ReFACT: Updating Text-to-Image Models by Editing the Text Encoder
ReFACT addresses the challenge of keeping text-to-image models current for end-users, though it is an incremental improvement over existing editing techniques.
The paper tackles outdated factual associations in text-to-image models by introducing ReFACT, a method that edits the text encoder to update these associations without costly re-training. It reports better generalization and preservation than other editing methods.
Our world is marked by unprecedented technological, global, and socio-political transformations, posing a significant challenge to text-to-image generative models. These models encode factual associations within their parameters that can quickly become outdated, diminishing their utility for end-users. To address this, we introduce ReFACT, a novel approach for editing factual associations in text-to-image models without relying on explicit input from end-users or costly re-training. ReFACT updates the weights of a specific layer in the text encoder, modifying only a tiny portion of the model's parameters and leaving the rest of the model unaffected. We empirically evaluate ReFACT on an existing benchmark, alongside a newly curated dataset. Compared to other methods, ReFACT achieves superior performance in both generalization to related concepts and preservation of unrelated concepts. Furthermore, ReFACT maintains image generation quality, making it a practical tool for updating and correcting factual information in text-to-image models.
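To make the idea of editing a single layer concrete, the following is a minimal sketch of a rank-one weight update on one linear map. The abstract does not specify ReFACT's actual update rule, so this is an illustrative assumption rather than the authors' method: the function `rank_one_edit`, the key vector `k`, and the target vector `v_target` are hypothetical names, and the matrices are random toy data rather than real text-encoder weights.

```python
import numpy as np

def rank_one_edit(W, k, v_target):
    """Return an edited matrix W' with W' @ k == v_target.

    Hypothetical illustration: the change is a rank-one term along k,
    so any input orthogonal to k is mapped exactly as before.
    """
    residual = v_target - W @ k               # what the layer currently gets wrong for k
    return W + np.outer(residual, k) / (k @ k)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))        # toy stand-in for one layer's weight matrix
k = rng.normal(size=6)             # toy "key": representation of the edited concept
v_target = rng.normal(size=4)      # toy "value": desired output for that key

W_edited = rank_one_edit(W, k, v_target)

# An unrelated input, made orthogonal to k, should be unaffected by the edit.
u = rng.normal(size=6)
u_orth = u - (u @ k) / (k @ k) * k
```

In this toy setting the edited layer maps `k` to `v_target` while leaving `u_orth` unchanged, which mirrors the abstract's two goals of updating one association and preserving unrelated ones. Only the entries of `W` change, not any other part of the model.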