CL · Apr 17, 2024

MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze

arXiv:2404.11672v314.630 citations · h-index: 13 · Has Code · Trans. Mach. Learn. Res.

Originality: Highly original

AI Analysis

This paper addresses hallucination and memory limitations in LLMs. It represents a novel approach rather than an incremental improvement.

The paper targets LLMs' reliance on implicit parametric memory, which makes it hard to memorize rare events and to update facts as they change over time. It introduces MemLLM, a method that integrates an explicit read-write memory module to improve performance and interpretability in language modeling and knowledge-intensive tasks.

While current large language models (LLMs) perform well on many knowledge-related tasks, they are limited by relying on their parameters as an implicit storage mechanism. As a result, they struggle with memorizing rare events and with updating their memory as facts change over time. In addition, the uninterpretable nature of parametric memory makes it challenging to prevent hallucination. Model editing and augmenting LLMs with parameters specialized for memory are only partial solutions. In this paper, we introduce MemLLM, a novel method of enhancing LLMs by integrating a structured and explicit read-and-write memory module. MemLLM tackles the aforementioned challenges by enabling dynamic interaction with the memory and improving the LLM's capabilities in using stored knowledge. Our experiments indicate that MemLLM enhances the LLM's performance and interpretability, in language modeling in general and knowledge-intensive tasks in particular. We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation. The project repository is publicly available at https://github.com/amodaresi/MemLLM
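To make the idea of a structured, explicit read-and-write memory concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the storage format, so the (subject, relation, object) layout, the `ExplicitMemory` class, and the `write`, `overwrite`, and `read` method names are all assumptions chosen for illustration.

```python
from collections import defaultdict


class ExplicitMemory:
    """Toy structured memory (hypothetical, not the MemLLM API).

    Assumes facts are stored as (subject, relation, object) entries.
    """

    def __init__(self):
        # Map (subject, relation) -> list of stored objects.
        self._facts = defaultdict(list)

    def write(self, subject, relation, obj):
        """Store a fact; exact duplicates are ignored."""
        values = self._facts[(subject, relation)]
        if obj not in values:
            values.append(obj)

    def overwrite(self, subject, relation, obj):
        """Replace the stored value, modeling a fact that changed over time."""
        self._facts[(subject, relation)] = [obj]

    def read(self, subject, relation):
        """Return the stored objects, or an empty list if nothing is known."""
        return list(self._facts.get((subject, relation), []))


memory = ExplicitMemory()
memory.write("MemLLM", "introduced_in", "arXiv:2404.11672")
memory.overwrite("Example City", "mayor", "Person A")
memory.overwrite("Example City", "mayor", "Person B")  # the fact is updated in place
```

Because every entry can be inspected and replaced directly, a store like this is interpretable and updatable in a way that parametric memory is not, which is the property the abstract emphasizes. How the model itself decides when to read or write is outside the scope of this sketch.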
