KnowledgeSmith: Uncovering Knowledge Updating in LLMs with Model Editing and Unlearning
This work addresses the need for more reliable and scalable knowledge updating strategies in LLMs, though its contribution is incremental: it analyzes existing updating mechanisms rather than introducing new methods.
The paper studies how large language models (LLMs) update knowledge through editing and unlearning. It finds that LLMs do not update knowledge at different levels the way humans do, and that there is a consistency-capacity trade-off.
Knowledge editing and machine unlearning are two popular approaches for keeping large language models (LLMs) up to date. However, the knowledge updating mechanism of LLMs remains largely unexplored because existing evaluations are insufficient, isolated, and small-scale. For instance, do LLMs modify certain knowledge the way humans do? How do editing and unlearning differ as training data increases? This paper proposes KnowledgeSmith, a unified framework for systematically understanding the updating mechanism of LLMs. We first cast editing and unlearning as instances of one constrained optimization problem. Then, we propose an automatic dataset generator that provides structured interventions across multiple graph levels and data scales, enabling controlled studies of how different modification strategies propagate through model knowledge. Extensive experiments yield nuanced insights into knowledge propagation, plasticity scaling, consistency, and robustness. For instance, our results show that LLMs do not update different levels of knowledge the way humans do, and that there exists a consistency-capacity trade-off. We hope our findings can inform the design of more reliable and scalable strategies. Code: https://github.com/AIFrontierLab/KnowledgeSmith.git
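
To make "structured interventions across multiple graph levels and data scales" concrete, here is a minimal Python sketch. It is not the code from the KnowledgeSmith repository. The toy graph, the function names `node_depths` and `sample_interventions`, and the `level`, `scale`, and `op` parameters are all assumptions for illustration. It only shows one plausible way to select intervention targets at a chosen depth of a knowledge hierarchy and at a chosen batch size.

```python
import random
from collections import deque

# Hypothetical toy knowledge graph (parent -> children), not from the paper.
TOY_GRAPH = {
    "animal": ["mammal", "bird"],
    "mammal": ["dog", "cat"],
    "bird": ["sparrow", "eagle"],
}


def node_depths(graph, root):
    """Breadth-first search that returns each reachable node's depth below root."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in depths:
                depths[child] = depths[node] + 1
                queue.append(child)
    return depths


def sample_interventions(graph, root, level, scale, op, seed=0):
    """Pick up to `scale` nodes at depth `level` as targets for `op`.

    `op` is an assumed label ("edit" or "unlearn") so the same targets
    can be reused when comparing the two kinds of update.
    """
    if op not in ("edit", "unlearn"):
        raise ValueError("op must be 'edit' or 'unlearn'")
    depths = node_depths(graph, root)
    candidates = sorted(n for n, d in depths.items() if d == level)
    rng = random.Random(seed)  # fixed seed keeps the batch reproducible
    chosen = rng.sample(candidates, min(scale, len(candidates)))
    return [{"target": n, "level": level, "op": op} for n in chosen]


if __name__ == "__main__":
    # Same level, two scales: a small batch and a batch covering every leaf.
    print(sample_interventions(TOY_GRAPH, "animal", level=2, scale=2, op="edit"))
    print(sample_interventions(TOY_GRAPH, "animal", level=2, scale=4, op="unlearn"))
```

Varying `level` while holding `scale` fixed, or the reverse, is the kind of controlled comparison the abstract describes. A real generator would also need to produce the prompts and the target answers for each intervention.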