RMAAT: Astrocyte-Inspired Memory Compression and Replay for Efficient Long-Context Transformers
This work addresses the efficiency problem of long-context Transformer models with a novel but incremental bio-inspired approach that combines existing ideas with astrocyte dynamics.
The paper tackled the quadratic complexity of self-attention in Transformers for long sequences by introducing RMAAT, an architecture inspired by astrocyte functionalities, which achieved competitive accuracy and substantial improvements in computational and memory efficiency on the Long Range Arena benchmark.
The quadratic complexity of the self-attention mechanism presents a significant impediment to applying Transformer models to long sequences. This work explores computational principles derived from astrocytes (glial cells critical for biological memory and synaptic modulation) as a complementary approach to conventional architectural modifications for efficient self-attention. We introduce the Recurrent Memory Augmented Astromorphic Transformer (RMAAT), an architecture integrating abstracted astrocyte functionalities. RMAAT employs a recurrent, segment-based processing strategy in which persistent memory tokens propagate contextual information across segments. An adaptive compression mechanism, governed by a novel retention factor derived from simulated astrocyte long-term plasticity (LTP), modulates these tokens. Attention within segments uses an efficient, linear-complexity mechanism inspired by astrocyte short-term plasticity (STP). Training is performed using Astrocytic Memory Replay Backpropagation (AMRB), a novel algorithm designed for memory efficiency in recurrent networks. Evaluations on the Long Range Arena (LRA) benchmark demonstrate RMAAT's competitive accuracy and substantial improvements in computational and memory efficiency, indicating the potential of incorporating astrocyte-inspired dynamics into scalable sequence models.
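To make the segment-based recurrence described above more concrete, the following is a minimal, illustrative sketch and not the RMAAT implementation. It assumes a generic kernelized linear attention with an elu(x) + 1 feature map in place of the STP-inspired mechanism, and a fixed scalar `retention` in place of the LTP-derived retention factor. The function names `feature_map`, `linear_attention`, and `process_sequence` are hypothetical, learned projections are omitted, and AMRB training is not covered.

```python
import math


def feature_map(x):
    # Assumption: elu(x) + 1, a positive feature map often used in
    # kernelized linear attention; not the paper's STP-derived mechanism.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def linear_attention(queries, keys, values):
    """Non-causal kernelized attention.

    Keys and values are summarized once, so the cost grows linearly with
    the number of tokens instead of quadratically.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # sum over j of phi(k_j) v_j^T
    k_sum = [0.0] * d_k                     # sum over j of phi(k_j)
    for k, v in zip(keys, values):
        fk = feature_map(k)
        for a in range(d_k):
            k_sum[a] += fk[a]
            for b in range(d_v):
                kv[a][b] += fk[a] * v[b]

    out = []
    for q in queries:
        fq = feature_map(q)
        # The normalizer is strictly positive because phi is positive.
        norm = sum(fq[a] * k_sum[a] for a in range(d_k))
        out.append(
            [sum(fq[a] * kv[a][b] for a in range(d_k)) / norm for b in range(d_v)]
        )
    return out


def process_sequence(tokens, segment_len, num_memory, retention):
    """Process a long sequence segment by segment with persistent memory tokens.

    `retention` is a hypothetical fixed scalar in [0, 1] standing in for the
    adaptive, LTP-derived retention factor described in the text.
    """
    dim = len(tokens[0])
    memory = [[0.0] * dim for _ in range(num_memory)]
    outputs = []
    for start in range(0, len(tokens), segment_len):
        segment = tokens[start:start + segment_len]
        # Memory tokens are prepended so the segment can read carried context.
        block = memory + segment
        attended = linear_attention(block, block, block)
        candidate = attended[:num_memory]
        # Blend old memory with the newly attended memory (illustrative rule).
        memory = [
            [retention * old + (1.0 - retention) * new
             for old, new in zip(m_old, m_new)]
            for m_old, m_new in zip(memory, candidate)
        ]
        outputs.extend(attended[num_memory:])
    return outputs, memory
```

Calling `process_sequence(tokens, segment_len=4, num_memory=2, retention=0.5)` on a list of equal-length token vectors returns one output vector per input token together with the final memory tokens. Each segment touches only `segment_len + num_memory` tokens, so in this sketch the per-segment cost is independent of the total sequence length.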