LG · Feb 4

MirrorLA: Reflecting Feature Map for Vision Linear Attention

arXiv:2602.04346v1 · 1 citation · h-index: 2
Originality: Highly original
AI Analysis

This work addresses the degraded performance of linear attention on vision tasks, offering a more efficient alternative to softmax attention without sacrificing accuracy.

The paper tackles the performance gap between linear and softmax-based attention in Transformers by addressing the information loss from non-negativity constraints in kernel feature maps, and it achieves state-of-the-art results on standard benchmarks while maintaining linear computational efficiency.

Linear attention significantly reduces the computational complexity of Transformers from quadratic to linear, yet it consistently lags behind softmax-based attention in performance. We identify the root cause of this degradation as the non-negativity constraint imposed on kernel feature maps: standard projections like ReLU act as "passive truncation" operators, indiscriminately discarding semantic information residing in the negative domain. We propose MirrorLA, a geometric framework that substitutes passive truncation with active reorientation. By leveraging learnable Householder reflections, MirrorLA rotates the feature geometry into the non-negative orthant to maximize information retention. Our approach restores representational density through a cohesive, multi-scale design: it first optimizes local discriminability via block-wise isometries, then stabilizes long-context dynamics using variance-aware modulation to diversify activations, and finally integrates dispersed subspaces via cross-head reflections to induce global covariance mixing. MirrorLA achieves state-of-the-art performance across standard benchmarks, demonstrating that strictly linear efficiency can be achieved without compromising representational fidelity.
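The contrast between "passive truncation" and "active reorientation" can be illustrated with a small sketch. This is not the paper's implementation. It assumes a single hand-picked reflection vector `v` (in MirrorLA the reflections are learnable), a plain ReLU feature map, and a textbook kernelized linear attention. The names `householder_reflect`, `linear_attention`, and `eps` are hypothetical and chosen only for this illustration.

```python
import math

def householder_reflect(x, v):
    """Apply H = I - 2 v v^T / (v^T v) to x without forming H (an isometry)."""
    vv = sum(vi * vi for vi in v)
    if vv == 0.0:
        raise ValueError("reflection vector v must be non-zero")
    scale = 2.0 * sum(vi * xi for vi, xi in zip(v, x)) / vv
    return [xi - scale * vi for xi, vi in zip(x, v)]

def relu(x):
    """Non-negative feature map; zeroes every negative coordinate."""
    return [max(0.0, xi) for xi in x]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def linear_attention(queries, keys, values, phi, eps=1e-6):
    """Kernelized attention in O(N): accumulate S = sum_j phi(k_j) v_j^T and
    z = sum_j phi(k_j) once, then reuse them for every query."""
    d_k, d_v = len(phi(keys[0])), len(values[0])
    S = [[0.0] * d_v for _ in range(d_k)]
    z = [0.0] * d_k
    for k, val in zip(keys, values):
        fk = phi(k)
        for a in range(d_k):
            z[a] += fk[a]
            for b in range(d_v):
                S[a][b] += fk[a] * val[b]
    out = []
    for q in queries:
        fq = phi(q)
        denom = sum(fq[a] * z[a] for a in range(d_k)) + eps  # eps avoids 0/0
        out.append([sum(fq[a] * S[a][b] for a in range(d_k)) / denom
                    for b in range(d_v)])
    return out

# A feature that lies entirely in the negative domain.
x = [-3.0, -4.0]
v = [1.0, 1.0]  # assumed reflection direction, fixed by hand for this example

truncated = relu(x)                          # ReLU alone discards everything
reflected = relu(householder_reflect(x, v))  # reflect first, then ReLU

print(norm(x), norm(truncated), norm(reflected))

# Same token as query and key: ReLU alone yields a zero output,
# while the reflected feature map recovers the stored value.
plain = linear_attention([x], [x], [[1.0, 2.0]], relu)
mirrored = linear_attention([x], [x], [[1.0, 2.0]],
                            lambda t: relu(householder_reflect(t, v)))
print(plain, mirrored)
```

In this toy case the reflection maps `[-3, -4]` to `[4, 3]`, so the full norm survives the ReLU, whereas ReLU alone leaves nothing. The block-wise isometries, variance-aware modulation, and cross-head reflections described in the abstract are not modeled here.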

