Bridging Associative Memory and Probabilistic Modeling
This work links two fundamental AI topics and may benefit researchers in both fields by letting results from one transfer to the other, though the contribution appears incremental because it connects existing concepts rather than introducing new ones.
The paper connects associative memory and probabilistic modeling by showing that energy functions in associative memory correspond to negative log likelihoods in probabilistic modeling, enabling bidirectional idea exchange. It demonstrates this through four examples, including new energy-based models for in-context learning and novel associative memory models using Bayesian nonparametrics.
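The correspondence can be written in one line. The following is a sketch that assumes a Boltzmann (Gibbs) form at unit temperature; the symbols E, p, and Z are notation chosen here for illustration and are not taken from the paper.

```latex
p(x) = \frac{\exp(-E(x))}{Z}, \qquad Z = \int \exp(-E(x'))\,dx'
\quad\Longrightarrow\quad
E(x) = -\log p(x) - \log Z .
```

Because \log Z does not depend on x, lowering the energy of a state (what an associative memory does during retrieval) is the same as raising its log likelihood under p.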
Associative memory and probabilistic modeling are two fundamental topics in artificial intelligence. The first studies recurrent neural networks designed to denoise, complete and retrieve data, whereas the second studies learning and sampling from probability distributions. Based on the observation that associative memory's energy functions can be seen as probabilistic modeling's negative log likelihoods, we build a bridge between the two that enables useful flow of ideas in both directions. We showcase four examples: First, we propose new energy-based models that flexibly adapt their energy functions to new in-context datasets, an approach we term \textit{in-context learning of energy functions}. Second, we propose two new associative memory models: one that dynamically creates new memories as necessitated by the training data using Bayesian nonparametrics, and another that explicitly computes proportional memory assignments using the evidence lower bound. Third, using tools from associative memory, we analytically and numerically characterize the memory capacity of Gaussian kernel density estimators, a widespread tool in probabilistic modeling. Fourth, we study a widespread implementation choice in transformers -- normalization followed by self-attention -- to show it performs clustering on the hypersphere. Altogether, this work urges further exchange of useful ideas between these two continents of artificial intelligence.
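To make the third example concrete, here is a minimal NumPy sketch of a Gaussian kernel density estimator read as an associative memory. It is an illustration under stated assumptions, not the paper's implementation: the function names `energy` and `retrieve`, the bandwidth `sigma`, and the toy patterns are all hypothetical, and the energy drops the additive normalization constant. The update shown is gradient descent on the negative log likelihood with step size `sigma**2`, which reduces to a softmax-weighted average of the stored patterns.

```python
import numpy as np

def energy(x, memories, sigma):
    """Negative log likelihood of x under a Gaussian KDE centered on the
    stored patterns, up to an additive constant (assumed form)."""
    sq_dist = np.sum((memories - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    m = logits.max()  # subtract the max for a numerically stable log-sum-exp
    return -(m + np.log(np.sum(np.exp(logits - m))))

def retrieve(x, memories, sigma, steps=50):
    """Descend the energy from a noisy query.

    With step size sigma**2, each gradient step equals replacing x with the
    softmax-weighted mean of the stored patterns.
    """
    x = np.asarray(x, dtype=float).copy()
    for _ in range(steps):
        sq_dist = np.sum((memories - x) ** 2, axis=1)
        logits = -sq_dist / (2.0 * sigma ** 2)
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        x = weights @ memories
    return x

# Toy data: three stored patterns and a corrupted copy of the first one.
memories = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
query = np.array([0.9, 0.2])
recalled = retrieve(query, memories, sigma=0.2)
```

With a small bandwidth relative to the spacing between patterns, the query settles near the closest stored pattern, and its energy is lower at the end than at the start. As the bandwidth grows, nearby patterns begin to merge into a single minimum, which is the kind of capacity question the abstract refers to.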