LG | Oct 17, 2023

Heterogeneous Memory Augmented Neural Networks

arXiv:2310.10909v1 | h-index: 8 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a scalability bottleneck in semi-parametric methods, which matters to researchers and practitioners working with limited data or out-of-distribution (OOD) generalization. The advance is incremental, since it builds on existing memory-augmented neural networks.

The paper targets the scalability and computational cost of semi-parametric methods in data-scarce and out-of-distribution settings. It introduces a heterogeneous memory augmentation approach built on learnable memory tokens and reports competitive performance against state-of-the-art methods on image and graph-based tasks.

It has been shown that semi-parametric methods, which combine standard neural networks with non-parametric components such as external memory modules and data retrieval, are particularly helpful in data scarcity and out-of-distribution (OOD) scenarios. However, existing semi-parametric methods mostly depend on independent raw data points; this strategy is difficult to scale up because of both high computational costs and the inability of current attention mechanisms to handle a large number of tokens. In this paper, we introduce a novel heterogeneous memory augmentation approach for neural networks which, by introducing learnable memory tokens with an attention mechanism, can effectively boost performance without huge computational overhead. Our general-purpose method can be seamlessly combined with various backbones (MLP, CNN, GNN, and Transformer) in a plug-and-play manner. We extensively evaluate our approach on various image and graph-based tasks under both in-distribution (ID) and OOD conditions and show its competitive performance against task-specific state-of-the-art methods. Code is available at https://github.com/qiuzh20/HMA.
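To make the idea of "learnable memory tokens with an attention mechanism" concrete, here is a minimal sketch of a single memory read in plain Python. It is not the authors' implementation: the names `memory_read` and `augment`, the sizes, the residual combination, and the absence of query/key/value projections are all assumptions made for illustration. It also covers only the learnable-token read, not whatever makes the memory heterogeneous, which the abstract does not spell out.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def memory_read(feature, memory_tokens):
    """Attend from one backbone feature (the query) over the memory tokens.

    Returns the attention weights and the retrieved vector, which is the
    attention-weighted average of the memory tokens.
    """
    scale = math.sqrt(len(feature))
    weights = softmax([dot(feature, m) / scale for m in memory_tokens])
    retrieved = [
        sum(w * m[i] for w, m in zip(weights, memory_tokens))
        for i in range(len(feature))
    ]
    return weights, retrieved


def augment(feature, memory_tokens):
    """Assumed residual combination: feature plus what was read from memory."""
    _, retrieved = memory_read(feature, memory_tokens)
    return [f + r for f, r in zip(feature, retrieved)]


# Hypothetical sizes: 4 memory tokens of dimension 8.
rng = random.Random(0)
DIM, NUM_TOKENS = 8, 4

# In a trained model these would be parameters updated by backpropagation;
# here they are fixed random values, so nothing is actually learned.
memory_tokens = [
    [rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_TOKENS)
]
feature = [rng.gauss(0.0, 1.0) for _ in range(DIM)]  # stand-in backbone output

weights, retrieved = memory_read(feature, memory_tokens)
augmented = augment(feature, memory_tokens)
print(len(weights), len(augmented))
```

In this sketch the cost of one read grows with `NUM_TOKENS * DIM` rather than with the size of the training set, which is the kind of saving the abstract contrasts with attending over raw data points. Because the read only consumes and returns a feature vector, the same function could in principle sit after an MLP, CNN, GNN, or Transformer backbone.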

Code Implementations: 1 repo
