LG ML · Mar 18, 2024

PARMESAN: Parameter-Free Memory Search and Transduction for Dense Prediction Tasks

Philip Matthias Winter, Maria Wimmer, David Major, Dimitrios Lenis, Astrid Berg, Theresa Neubauer, Gaia Romana De Paolis, Johannes Novotny, Sophia Ulonska, Katja Bühler

arXiv:2403.11743v34.61 citations · h-index: 8 · GCPR

Originality: Incremental advance

AI Analysis

This work addresses inefficient adaptation in continual learning and offers AI practitioners a scalable, hardware-efficient solution, though it appears incremental because it builds on existing memory-based and transductive approaches.

The paper tackles the lack of flexibility in deep learning when adapting to new data and tasks, as in continual learning. It proposes PARMESAN, a parameter-free method that uses memory search and transduction for dense prediction tasks, and reports learning that is 3-4 orders of magnitude faster than baselines while maintaining competitive performance.

This work addresses flexibility in deep learning by means of transductive reasoning. For adaptation to new data and tasks, e.g., in continual learning, existing methods typically involve tuning learnable parameters or complete re-training from scratch, rendering such approaches inflexible in practice. We argue that separating computation from memory by means of transduction can act as a stepping stone for solving these issues. We therefore propose PARMESAN (parameter-free memory search and transduction), a scalable method which leverages a memory module for solving dense prediction tasks. At inference, hidden representations in memory are searched to find corresponding patterns. In contrast to other methods that rely on continuous training of learnable parameters, PARMESAN learns via memory consolidation, simply by modifying stored contents. Our method is compatible with commonly used architectures and canonically transfers to 1D, 2D, and 3D grid-based data. The capabilities of our approach are demonstrated on the complex task of continual learning. PARMESAN learns 3-4 orders of magnitude faster than established baselines while being on par in terms of predictive performance, hardware-efficiency, and knowledge retention.
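The general idea of predicting by memory search rather than by trained parameters can be illustrated with a toy sketch. This is not the authors' implementation: the `FeatureMemory` class, its `consolidate` and `predict` methods, the cosine-similarity search, and the k-nearest-neighbour vote are all assumptions made for illustration. Feature vectors are assumed to come from some fixed feature extractor that is not shown here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class FeatureMemory:
    """Hypothetical toy memory of (feature, label) pairs, one per grid location."""

    def __init__(self):
        self.features = []
        self.labels = []

    def consolidate(self, feature_grid, label_grid):
        """'Learn' by storing new contents; no parameters are updated."""
        for feat_row, label_row in zip(feature_grid, label_grid):
            for feat, label in zip(feat_row, label_row):
                self.features.append(feat)
                self.labels.append(label)

    def predict(self, feature_grid, k=1):
        """Label each query location by a vote among its k most similar entries."""
        if not self.features:
            raise ValueError("memory is empty")
        prediction = []
        for feat_row in feature_grid:
            row = []
            for feat in feat_row:
                # Rank all memory entries by similarity to the query feature.
                ranked = sorted(
                    range(len(self.features)),
                    key=lambda i: cosine(feat, self.features[i]),
                    reverse=True,
                )
                votes = Counter(self.labels[i] for i in ranked[:k])
                row.append(votes.most_common(1)[0][0])
            prediction.append(row)
        return prediction


# A 1x2 "image" with 2-dimensional features and made-up labels.
memory = FeatureMemory()
memory.consolidate([[[1.0, 0.0], [0.0, 1.0]]], [["background", "organ"]])
first = memory.predict([[[0.9, 0.1], [0.2, 0.8]]])

# Adding a new class later only appends to memory; nothing is retrained.
memory.consolidate([[[-1.0, 0.0]]], [["lesion"]])
second = memory.predict([[[-0.8, 0.1], [0.9, 0.1]]])
```

In this sketch, adapting to a new class is a single append to memory, which loosely mirrors the abstract's contrast between memory consolidation and continuous parameter training. A practical system would need an efficient search over a much larger memory, which this exhaustive loop does not attempt.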
