CL · May 9, 2025

Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition

arXiv:2505.07862v1 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

The work addresses efficiency and interpretability in structured language tasks and is aimed at NLP researchers and practitioners. It appears incremental, since it modifies existing transformer architectures.

The paper targets the quadratic complexity bottleneck of dot-product self-attention in sequence-to-sequence models. It introduces the Graph Wavelet Transformer (GWT), which replaces self-attention with a learnable multi-scale wavelet transform built on graph Laplacians derived from syntactic or semantic parses. The authors report comparable performance at linear complexity.

Existing sequence-to-sequence models for structured language tasks rely heavily on the dot-product self-attention mechanism, which incurs quadratic complexity in both computation and memory for input length N. We introduce the Graph Wavelet Transformer (GWT), a novel architecture that replaces this bottleneck with a learnable, multi-scale wavelet transform defined over an explicit graph Laplacian derived from syntactic or semantic parses. Our analysis shows that multi-scale spectral decomposition offers an interpretable, efficient, and expressive alternative to quadratic self-attention for graph-structured sequence modeling.
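
To make the idea of a multi-scale wavelet transform over a parse-derived graph Laplacian concrete, here is a minimal NumPy sketch. It is not the paper's implementation. The function names (`normalized_laplacian`, `wavelet_mix`), the heat-kernel filter exp(-s·λ), the fixed scales, and the toy dependency tree are all assumptions made for illustration. The sketch also uses a dense eigendecomposition, which is cubic in N, so it shows the transform itself and not the linear-complexity claim. In a learnable version the scales would presumably be trainable parameters.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    nz = deg > 0
    inv_sqrt[nz] = deg[nz] ** -0.5  # isolated nodes keep a zero scaling factor
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def wavelet_mix(x, adj, scales):
    """Filter token features x (N x d) at several spectral scales.

    Assumption: a heat kernel exp(-s * lambda) stands in for the wavelet
    filter; the paper's actual filter family is not specified here.
    """
    lap = normalized_laplacian(adj)
    eigvals, eigvecs = np.linalg.eigh(lap)   # L = U diag(lambda) U^T
    coeffs = eigvecs.T @ x                   # graph Fourier transform of x
    outputs = []
    for s in scales:
        kernel = np.exp(-s * eigvals)        # one spectral filter per scale
        outputs.append(eigvecs @ (kernel[:, None] * coeffs))
    return np.stack(outputs)                 # shape: (num_scales, N, d)

# Hypothetical 4-token dependency tree with undirected edges 0-1, 1-2, 1-3.
edges = [(0, 1), (1, 2), (1, 3)]
adj = np.zeros((4, 4))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

x = np.random.default_rng(0).normal(size=(4, 8))  # toy token embeddings
out = wavelet_mix(x, adj, scales=[0.5, 2.0])
```

Small scales keep features close to the input, while larger scales diffuse information further along parse edges. In this sketch that diffusion plays the role attention would otherwise play in mixing tokens.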

