PLS in the Mirror of Self-Attention
Provides a theoretical connection between PLS and self-attention, offering a new perspective for researchers in both fields, but the contribution is primarily conceptual and incremental.
The paper shows that partial least squares (PLS) can be viewed as a linearized self-attention mechanism, bridging classical statistics and neural networks. It suggests that self-attention inherently includes dimensionality normalization, potentially improving learning.
This note observes that partial least squares (PLS) can be cast as a linearized self-attention mechanism, so that PLS may be studied within the neural network paradigm. Conversely, the dimensionality reduction and predictor selection performed by PLS may indicate that self-attention includes a certain degree of dimensionality normalization toward improved learning.
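
One way to see the flavor of this connection is sketched below. It is an illustration under stated assumptions and not necessarily the construction used in the paper. For single-response PLS, the first weight vector is proportional to X^T y, so the first score vector is proportional to X X^T y. A softmax-free ("linearized") attention with queries and keys both set to X and values set to y computes the same product. The function names and the choice Q = K = X, V = y are hypothetical and introduced only for this example.

```python
import numpy as np

def pls_first_score(X, y):
    """First PLS1 score vector t = X w, with w = X^T y / ||X^T y||."""
    w = X.T @ y
    w = w / np.linalg.norm(w)
    return X @ w

def linear_self_attention(X, y):
    """Softmax-free attention with Q = K = X and V = y (assumed for illustration)."""
    scores = X @ X.T  # (n, n) sample-to-sample similarities, no softmax
    return scores @ y

# Synthetic, centered data; PLS is usually applied to centered variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 5))
X -= X.mean(axis=0)
y = rng.standard_normal(8)
y -= y.mean()

t = pls_first_score(X, y)
a = linear_self_attention(X, y)

# The two vectors differ only by the positive scale factor ||X^T y||.
cosine = float(t @ a / (np.linalg.norm(t) * np.linalg.norm(a)))
scale = float(np.linalg.norm(X.T @ y))
print(cosine)
```

The sketch covers only the first PLS component. Later components involve deflation of X, which this example does not attempt to express in attention form.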