CL · AI · Feb 18, 2025

MeMo: Towards Language Models with Associative Memory Mechanisms

arXiv:2502.12851v1 · 3 citations · h-index: 14
Originality: Highly original
AI Analysis

This work addresses the need for more transparent and editable language models, offering a new paradigm for memorization in AI.

The paper tackles the problem of memorization in Transformer-based Large Language Models by proposing MeMo, a novel architecture that explicitly memorizes token sequences in associative memories, enabling transparency and model editing such as forgetting texts.

Memorization is a fundamental ability of Transformer-based Large Language Models, achieved through learning. In this paper, we propose a paradigm shift by designing an architecture to memorize text directly, bearing in mind the principle that memorization precedes learning. We introduce MeMo, a novel architecture for language modeling that explicitly memorizes sequences of tokens in layered associative memories. By design, MeMo offers transparency and the possibility of model editing, including forgetting texts. We experimented with the MeMo architecture, showing the memorization power of the one-layer and the multi-layer configurations.
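To make the idea of storing and then forgetting a token sequence concrete, here is a minimal sketch of a single associative memory. It is not the MeMo implementation: it assumes a plain correlation-matrix memory (a sum of outer products of random ±1 vectors), and every name in it (DIM, vec, AssociativeMemory, store, forget, score) is hypothetical. The abstract only states that MeMo memorizes token sequences in layered associative memories and supports forgetting; the encoding used below is an assumption made for illustration.

```python
import random

DIM = 256  # vector dimensionality; an arbitrary choice for this sketch


def vec(name):
    """Deterministic pseudo-random +/-1 vector for a token or a context (assumed encoding)."""
    rng = random.Random(name)
    return [rng.choice((-1.0, 1.0)) for _ in range(DIM)]


class AssociativeMemory:
    """Hypothetical correlation-matrix memory: M is a sum of outer products value * key^T."""

    def __init__(self):
        self.m = [[0.0] * DIM for _ in range(DIM)]

    def _update(self, context, token, sign):
        key = vec("ctx:" + " ".join(context))
        value = vec("tok:" + token)
        for i in range(DIM):
            row, scaled = self.m[i], sign * value[i]
            for j in range(DIM):
                row[j] += scaled * key[j]

    def store(self, context, token):
        # Memorize "context is followed by token" by adding one outer product.
        self._update(context, token, +1.0)

    def forget(self, context, token):
        # Model editing: subtract the same outer product that store() added.
        self._update(context, token, -1.0)

    def score(self, context, token):
        # value^T M key / DIM^2: close to 1 if the pair is stored, close to 0 otherwise.
        key = vec("ctx:" + " ".join(context))
        value = vec("tok:" + token)
        retrieved = [sum(row[j] * key[j] for j in range(DIM)) for row in self.m]
        return sum(r * v for r, v in zip(retrieved, value)) / (DIM * DIM)


memory = AssociativeMemory()
memory.store(("the", "cat"), "sat")
memory.store(("the", "dog"), "barked")

before_forget = memory.score(("the", "cat"), "sat")
never_stored = memory.score(("the", "cat"), "barked")

memory.forget(("the", "cat"), "sat")
after_forget = memory.score(("the", "cat"), "sat")
kept = memory.score(("the", "dog"), "barked")

print(before_forget, never_stored, after_forget, kept)
```

Because every stored association is an explicit additive term in the matrix, it can be inspected and removed without retraining, which is the kind of transparency and editability the abstract describes. The sketch covers only a single memory, not the layered configuration mentioned above.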

