LG · CL · SP · May 31

Beyond Sinusoids: A Morlet Wavelet Framework for Transformer Positional Encoding

arXiv:2606.01258 · 12.4
Predicted impact: top 89% in LG · last 90 days · Originality: Incremental advance
AI Analysis

For transformer practitioners, MoPE offers a principled way to learn locality in positional encoding, but the empirical gain is demonstrated only on a small character-level dataset, making the result incremental.

The paper proposes Morlet Positional Encoding (MoPE), which learns per-dimension frequency and locality bandwidth, unifying sinusoidal and rotary encodings as limiting cases. Combined with Energy-Gated Attention, it achieves a +0.119 improvement over standard attention on TinyShakespeare.

Standard positional encodings for transformers - sinusoidal and rotary (RoPE) - treat every position as equally local: they encode where a token is, but not how far its positional influence should extend. We propose that the Morlet wavelet, which simultaneously minimises uncertainty in position and frequency, is the natural basis for positional encoding, and introduce Morlet Positional Encoding (MoPE): each embedding dimension learns its own frequency and locality bandwidth from data. The main theoretical result is a unification: sinusoidal PE and the RoPE correlation kernel both emerge as limiting cases of MoPE when locality is switched off (sigma_i -> infinity). The phase of MoPE recovers the RoPE rotation angle exactly; the amplitude adds a learned Gaussian locality kernel that standard encodings lack. Empirically, MoPE combined with Energy-Gated Attention achieves +0.119 improvement over standard attention on TinyShakespeare, outperforming either component alone. Analysis of the learned parameters reveals that all 128 frequency-bandwidth pairs converge to the wavelet admissibility boundary - an empirical observation consistent with a companion result on energy gating, suggesting a reproducible property of character-level language signals that warrants further investigation.
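The unification claim in the abstract can be illustrated with a small sketch. It assumes the textbook complex Morlet form, a Gaussian window multiplied by a complex exponential, applied to a relative offset between two positions; the abstract does not give the paper's exact parameterisation, so the function names, the fixed (not learned) parameter values, and the omission of any admissibility correction term are all assumptions made for illustration.

```python
import cmath
import math


def morlet_kernel(delta, omega, sigma):
    """Complex Morlet-style kernel for a relative offset `delta` (hypothetical helper).

    Amplitude: Gaussian locality window with bandwidth `sigma`.
    Phase: omega * delta, the angle RoPE rotates by for the same offset.
    Passing sigma = math.inf switches locality off.
    """
    if math.isinf(sigma):
        amplitude = 1.0  # limiting case: no locality, pure rotation
    else:
        amplitude = math.exp(-(delta ** 2) / (2.0 * sigma ** 2))
    return amplitude * cmath.exp(1j * omega * delta)


def rope_frequencies(dim, base=10000.0):
    """Standard sinusoidal/RoPE frequency schedule, one frequency per 2-D pair."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


# Compare a localised kernel with its sigma -> infinity limit.
# In MoPE, omega and sigma would be learned per dimension; here they are fixed.
delta = 5
for omega in rope_frequencies(8):
    local = morlet_kernel(delta, omega, sigma=4.0)
    unbounded = morlet_kernel(delta, omega, sigma=math.inf)
    # The magnitudes differ (Gaussian decay vs. 1); the phase is shared.
    print(f"omega={omega:.4f}  |local|={abs(local):.4f}  "
          f"|unbounded|={abs(unbounded):.4f}")
```

In this sketch the phase is untouched by sigma, so the rotation angle is the same with or without locality, while the magnitude decays with distance only when sigma is finite. That mirrors the abstract's description of phase as the RoPE angle and amplitude as the added locality kernel.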

