CL | AI | Apr 1, 2025

$μ$KE: Matryoshka Unstructured Knowledge Editing of Large Language Models

arXiv:2504.01196v2 | 2 citations | h-index: 12
Originality: Incremental advance
AI Analysis

This work addresses efficient knowledge updating in LLMs for users who need accurate and safe model outputs. It is an incremental improvement over existing editing methods.

The paper tackles the problem of editing knowledge in large language models to address issues like hallucinations and safety risks, introducing a novel memory update mechanism that improves edit efficacy by up to 12.33% over state-of-the-art methods.

Large language models (LLMs) have emerged as powerful knowledge bases yet are limited by static training data, leading to issues such as hallucinations and safety risks. Editing a model's internal knowledge through the locate-and-edit paradigm has proven a cost-effective alternative to retraining, though current unstructured approaches, especially window-based autoregressive methods, often disrupt the causal dependency between early memory updates and later output tokens. In this work, we first theoretically analyze these limitations and then introduce Matryoshka Unstructured Knowledge Editing ($μ$KE), a novel memory update mechanism that preserves such dependencies via a Matryoshka-style objective and adaptive loss coefficients. Empirical evaluations on two models across four benchmarks demonstrate that $μ$KE improves edit efficacy by up to 12.33% over state-of-the-art methods, and remains robust when applied to diverse formatted edits, underscoring its potential for effective unstructured knowledge editing in LLMs.
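To make the dependency issue in the abstract concrete, here is a minimal toy sketch, not the paper's implementation. It assumes the target text is split into windows, each with a scalar loss, and contrasts a window-local objective (each memory update scored only on its own window) with a nested, Matryoshka-style objective (each update scored on its own window and every later one). The function names, the example losses, and the coefficient values are all hypothetical. How $μ$KE actually chooses its adaptive coefficients is not stated in the abstract, so they are simply passed in.

```python
# Toy illustration only: function names, losses, and coefficients are
# assumptions made for this sketch, not formulas or values from the paper.

def window_local_objectives(window_losses):
    """Score memory update i only on the loss of its own window i."""
    return list(window_losses)


def nested_objectives(window_losses, coeffs=None):
    """Score memory update i on windows i, i+1, ..., N-1 (nested, Matryoshka-style).

    coeffs[j] weights window j's loss; uniform weights are used if omitted.
    """
    n = len(window_losses)
    if coeffs is None:
        coeffs = [1.0] * n
    if len(coeffs) != n:
        raise ValueError("coeffs must have one entry per window")
    return [
        sum(coeffs[j] * window_losses[j] for j in range(i, n))
        for i in range(n)
    ]


# Hypothetical per-window losses for a three-window target.
losses = [2.0, 1.0, 0.5]

local = window_local_objectives(losses)             # early update ignores later windows
nested = nested_objectives(losses)                  # early update also sees later windows
weighted = nested_objectives(losses, [0.5, 1.0, 2.0])  # hand-picked example coefficients
```

In the nested variant, the objective for the first update includes every later window's loss, which is one way to keep early memory updates tied to later output tokens.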

