LG AI ME ML · Feb 3, 2023

ResMem: Learn what you can and memorize the rest

Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Sanjiv Kumar

arXiv:2302.01576v216.012 citations · h-index: 46

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of improving generalization in machine learning models. The advance is incremental, since it augments existing prediction models rather than replacing them.

The paper proposes ResMem, a method that explicitly memorizes training residuals with a k-nearest neighbor regressor, and reports that it consistently improves test performance across vision and NLP benchmarks.

The impressive generalization performance of modern neural networks is attributed in part to their ability to implicitly memorize complex training patterns. Inspired by this, we explore a novel mechanism to improve model generalization via explicit memorization. Specifically, we propose the residual-memorization (ResMem) algorithm, a new method that augments an existing prediction model (e.g. a neural network) by fitting the model's residuals with a $k$-nearest neighbor based regressor. The final prediction is then the sum of the original model and the fitted residual regressor. By construction, ResMem can explicitly memorize the training labels. Empirically, we show that ResMem consistently improves the test set generalization of the original prediction model across various standard vision and natural language processing benchmarks. Theoretically, we formulate a stylized linear regression problem and rigorously show that ResMem results in a more favorable test risk over the base predictor.
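The two-stage recipe described in the abstract (fit a base model, then fit its training residuals with a $k$-nearest neighbor regressor and sum the two predictions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the base model is a least-squares linear fit, neighbors are found by brute-force Euclidean distance on raw inputs, and the residuals of the $k$ neighbors are averaged uniformly. The function names (`fit_linear`, `knn_residual`, `resmem_predict`) and the toy sine data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    # Assumed base predictor: ordinary least squares with a bias column.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_linear(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def knn_residual(X_train, residuals, X_query, k):
    # Brute-force squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k closest training points for each query.
    idx = np.argsort(d2, axis=1)[:, :k]
    # Uniform average of the neighbors' stored residuals (an assumption).
    return residuals[idx].mean(axis=1)

def resmem_predict(w, X_train, residuals, X_query, k):
    # Final prediction = base model output + k-NN estimate of the residual.
    return predict_linear(w, X_query) + knn_residual(X_train, residuals, X_query, k)

# Toy data (hypothetical): a nonlinear target the linear base model underfits.
rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(200, 1))
y_train = np.sin(2 * X_train[:, 0])

w = fit_linear(X_train, y_train)
residuals = y_train - predict_linear(w, X_train)  # what the base model missed

# With k=1 each training point is its own nearest neighbor, so the combined
# predictor reproduces the training labels.
train_pred = resmem_predict(w, X_train, residuals, X_train, k=1)
```

With `k=1` and distinct training inputs, `train_pred` matches `y_train` up to floating-point rounding, which is the sense in which the combined predictor can memorize the training labels by construction. Larger `k` trades that exact memorization for smoother residual estimates on unseen points.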
