CL | Oct 12, 2024

Keys to Robust Edits: from Theoretical Insights to Practical Advances

Jianhao Yan, Futing Wang, Yun Luo, Yafu Li, Yue Zhang

Tencent, Tsinghua

arXiv:2410.09338v23.43 citations | h-index: 13 | Has Code | ACL

Originality: Incremental advance

AI Analysis

This work addresses robustness failures in knowledge editing for LLMs. It offers a practical advance over existing locate-and-edit methods, though it builds incrementally on prior work.

The paper tackles the problem that large language models hold inaccurate knowledge because of conflicting or outdated parametric memories. It introduces the Robust Edit Pathway (REP) module, which improves the success rate on robustness tests by up to 66.4% while leaving the base success rate unaffected.

Large language models (LLMs) struggle to maintain accurate knowledge because of conflicting or outdated parametric memories. Locate-and-edit methods address this, but their reliance on the model's internal representations leads to robustness failures in long-context reasoning and on paraphrased queries. We identify a fundamental limitation of locate-and-edit methods: existing semantic keys (used for memory localization) cannot simultaneously satisfy robustness (context-invariant activation) and specificity (precise knowledge discrimination). Through theoretical error-bound analysis, we establish formal criteria for effective editing. Our solution introduces Robust Edit Pathway (REP), a plug-and-play module that (1) disentangles editing keys from native model representations and (2) dynamically adjusts keys via contrastive learning to balance robustness and specificity. Extensive experiments across various editing methods (ROME, MEMIT, R-ROME, EMMET), existing LLMs (LLaMA2, Qwen, Mistral), and datasets (CounterFact, ZsRE) show that REP improves the success rate on robustness tests by up to 66.4% while leaving the success rate unaffected. Our code can be found at https://github.com/ElliottYan/RobustKeyEdit.
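To make the role of the editing key concrete, here is a minimal NumPy sketch of a rank-one edit to a linear key-value memory. It is an illustration only, not the paper's REP module or the exact ROME update (it omits the covariance weighting that ROME-style methods apply). The names `rank_one_edit` and `edit_strength`, the dimensions, and the random data are all assumptions made for the example. It shows that the edit fires in proportion to how strongly a query key projects onto the edited key, which is why robustness to key drift and specificity against nearby keys pull against each other.

```python
import numpy as np


def rank_one_edit(W, key, value):
    """Return W' with W' @ key == value, changing W only along `key`.

    Simplified illustration: no covariance weighting is applied.
    """
    residual = value - W @ key
    return W + np.outer(residual, key) / (key @ key)


def edit_strength(query, key):
    """Fraction of the edit applied to `query`.

    For the update above, W' @ q = W @ q + edit_strength(q, key) * residual.
    """
    return float(query @ key) / float(key @ key)


rng = np.random.default_rng(0)
d_key, d_val = 64, 32                    # hypothetical layer sizes
W = rng.normal(size=(d_val, d_key))      # stand-in for an MLP projection

key = rng.normal(size=d_key)             # key of the fact being edited
new_value = rng.normal(size=d_val)       # value encoding the new fact
W_edited = rank_one_edit(W, key, new_value)

# A direction orthogonal to the edited key (stand-in for an unrelated fact).
r = rng.normal(size=d_key)
orthogonal = r - edit_strength(r, key) * key

# A key that has drifted, e.g. under a long context or a paraphrase.
drifted = 0.5 * key + orthogonal

for name, q in [("exact key", key),
                ("drifted key", drifted),
                ("orthogonal key", orthogonal)]:
    print(f"{name:>14}: edit strength = {edit_strength(q, key):.3f}")
```

In this toy setting the exact key receives the full edit, the drifted key receives only half of it, and the orthogonal key is untouched. A key that responds more broadly would recover the drifted case but would also start firing for neighbouring facts, which is the tension the abstract describes.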
