LG | Jun 21, 2023

Constant Memory Attention Block

arXiv:2306.12599v1 | h-index: 57
Originality: Incremental advance
AI Analysis

The work addresses the memory limitations of attention in low-compute domains, with applications to Neural Processes and Temporal Point Processes. It represents an incremental improvement in efficiency.

The paper tackles the high memory requirements of attention mechanisms in foundation models by proposing the Constant Memory Attention Block (CMAB), which computes its output in constant memory and performs updates in constant computation. The methods built on it achieve results competitive with the state of the art while being significantly more memory efficient.

Modern foundation model architectures rely on attention mechanisms to effectively capture context. However, these methods require linear or quadratic memory in terms of the number of inputs/datapoints, limiting their applicability in low-compute domains. In this work, we propose the Constant Memory Attention Block (CMAB), a novel general-purpose attention block that computes its output in constant memory and performs updates in constant computation. Highlighting CMAB's efficacy, we introduce methods for Neural Processes and Temporal Point Processes. Empirically, we show that our proposed methods achieve results competitive with the state of the art while being significantly more memory efficient.
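The abstract does not spell out how constant memory is achieved, so the following is only a minimal sketch of one well-known idea that makes such a property possible, not the paper's CMAB implementation. It assumes a fixed set of query vectors attending over a stream of key/value pairs and keeps running softmax statistics, so the stored state depends on the number of queries and the value dimension but not on the number of inputs seen. The names `StreamingAttention` and `full_attention` are hypothetical and chosen for illustration.

```python
import math


def full_attention(queries, keys, values):
    """Reference softmax cross-attention that stores every key and value."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))
        ])
    return outputs


class StreamingAttention:
    """Illustrative only (assumption, not the paper's code).

    Cross-attention from a fixed set of queries to a stream of inputs.
    Per query it keeps a running max score, a softmax denominator, and a
    weighted sum of values, so memory does not grow with the stream.
    """

    def __init__(self, queries):
        self.queries = queries
        self.scale = 1.0 / math.sqrt(len(queries[0]))
        self.max_score = [-math.inf] * len(queries)
        self.denom = [0.0] * len(queries)
        self.numer = None  # allocated on the first update

    def update(self, key, value):
        """Fold one (key, value) pair into the state; cost is independent
        of how many pairs were seen before."""
        if self.numer is None:
            self.numer = [[0.0] * len(value) for _ in self.queries]
        for i, q in enumerate(self.queries):
            score = self.scale * sum(a * b for a, b in zip(q, key))
            new_max = max(self.max_score[i], score)
            # Rescale old statistics to the new max (exp(-inf) is 0.0).
            rescale = math.exp(self.max_score[i] - new_max)
            weight = math.exp(score - new_max)
            self.denom[i] = self.denom[i] * rescale + weight
            self.numer[i] = [
                n * rescale + weight * v for n, v in zip(self.numer[i], value)
            ]
            self.max_score[i] = new_max

    def output(self):
        """Return the attention output for each query."""
        return [[n / d for n in row] for row, d in zip(self.numer, self.denom)]
```

Feeding the pairs one at a time through `update` and then calling `output` should agree with `full_attention` on the same data up to floating-point error, while never holding more than one input in memory at once.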

