CL, Apr 3, 2024

Scalable Model Editing via Customized Expert Networks

arXiv:2404.02699v24.86 citations, h-index: 2, Has Code

Originality: Incremental advance

AI Analysis

This work addresses reliability issues in LLMs for applications that require up-to-date information, but it is an incremental advance that builds on existing model editing approaches.

The paper tackles the problem of hallucinations and outdated knowledge in large language models by introducing SCEN, a two-stage method using lightweight expert networks and indexing neurons, achieving state-of-the-art results on benchmarks like ZsRE and Hallucination with Llama2.

Addressing the issues of hallucinations and outdated knowledge in large language models is critical for their reliable application. Model Editing presents a promising avenue for mitigating these challenges in a cost-effective manner. However, existing methods often suffer from unsatisfactory generalization and unintended effects on non-edited samples. To overcome these limitations, we introduce a novel approach: Scalable Model Editing via Customized Expert Networks (SCEN), which is a two-stage continuous training paradigm. Specifically, in the first stage, we train lightweight expert networks individually for each piece of knowledge that needs to be updated. Subsequently, we train a corresponding indexing neuron for each expert to control the activation state of that expert. We conducted a series of experiments on the ZsRE and Hallucination benchmarks by tuning the advanced open-source LLM, Llama2, achieving state-of-the-art results compared to current mainstream methods. Our code is available at https://github.com/TAL-auroraX/SCEN.
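The abstract does not spell out how an indexing neuron decides whether its expert fires, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each indexing neuron is a sigmoid unit over a hidden vector, that the highest-scoring expert is used when its score exceeds a threshold, and that the unedited layer is used otherwise. The names `IndexedExpert`, `route`, `base_layer`, and the threshold value 0.5 are all hypothetical, and both training stages are replaced by hand-picked weights.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class IndexedExpert:
    """Hypothetical pairing of one expert with its indexing neuron."""
    index_weights: List[float]               # stands in for stage-2 trained weights
    index_bias: float
    expert: Callable[[Vector], List[float]]  # stands in for a stage-1 trained expert

    def activation(self, hidden: Vector) -> float:
        # Indexing neuron: a single sigmoid unit over the hidden vector (assumption).
        z = sum(w * h for w, h in zip(self.index_weights, hidden)) + self.index_bias
        return sigmoid(z)


def route(
    hidden: Vector,
    experts: Sequence[IndexedExpert],
    base_layer: Callable[[Vector], List[float]],
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Tuple[List[float], Optional[int]]:
    """Return (output, index of the expert used, or None for the base layer)."""
    best_i, best_score = None, threshold
    for i, e in enumerate(experts):
        score = e.activation(hidden)
        if score > best_score:  # strictly above the threshold, and the best so far
            best_i, best_score = i, score
    if best_i is None:
        # No indexing neuron fired: non-edited inputs keep the original behaviour.
        return list(base_layer(hidden)), None
    return experts[best_i].expert(hidden), best_i


# Toy stand-ins: the "base layer" doubles its input, each expert shifts it.
base = lambda h: [2.0 * x for x in h]
experts = [
    IndexedExpert([5.0, -5.0], 0.0, lambda h: [x + 1.0 for x in h]),
    IndexedExpert([-5.0, 5.0], 0.0, lambda h: [x - 1.0 for x in h]),
]

edited_out, edited_idx = route([1.0, 0.0], experts, base)      # expert 0 fires
untouched_out, untouched_idx = route([0.0, 0.0], experts, base)  # falls back to base
```

Under these assumptions, an input that no indexing neuron recognises passes through the base layer unchanged, which is the property the abstract describes as avoiding unintended effects on non-edited samples.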
