LG · AI · RT · May 11

Oversmoothing as Representation Degeneracy in Neural Sheaf Diffusion

arXiv:2605.11178 · 8.3
Predicted impact: top 93% in LG · last 90 days
Originality: Incremental advance
AI Analysis

For graph neural network researchers, the paper provides a new algebraic interpretation of oversmoothing and proposes regularizers to mitigate it, though the improvements are incremental and domain-specific.

The paper reframes oversmoothing in Neural Sheaf Diffusion as representation degeneration: learned sheaf geometries can collapse toward low-complexity summands whose global sections no longer preserve discriminative information. It introduces moment-map-inspired regularizers that bias restriction maps toward balanced geometries, and it shows that non-uniform stalk dimensions remove a structural obstruction present in equal-stalk architectures. Experiments on heterophilic benchmarks are consistent with this mechanism.

Neural Sheaf Diffusion (NSD) generalizes diffusion-based Graph Neural Networks by replacing scalar graph Laplacians with sheaf Laplacians whose learned restriction maps define a task-adapted geometry. While the diffusion limit of NSD is known to be the space of global sections, the representation-theoretic structure of this harmonic space remains largely implicit. We develop a quiver-theoretic interpretation of NSD by identifying cellular sheaves on graphs with representations of the associated incidence quiver. Under this correspondence, learned sheaf geometries become points in a finite-dimensional representation space. We show that direct-sum decompositions of the underlying incidence-quiver representation induce decompositions of the harmonic space reached in the diffusion limit. This gives an algebraic interpretation of oversmoothing as representation degeneration: learned sheaves may collapse toward low-complexity summands whose global sections fail to preserve discriminative information. Building on this viewpoint, we connect sheaf diffusion to stability and moment-map principles from Geometric Invariant Theory. We introduce moment-map-inspired regularizers that bias restriction maps toward balanced representation geometries, and identify a structural obstruction in equal-stalk architectures: when $d_v = d_e$, admissibility for learnable stability parameters forces the trivial all-object summand onto a stability wall. Non-uniform stalk dimensions remove this obstruction, making adaptive stability meaningful. Experiments on heterophilic benchmarks are consistent with this mechanism: breaking stalk symmetry can reduce variance or improve validation behavior, and adaptive stability becomes more effective in selected rectangular settings. Overall, our framework reframes oversmoothing as a degeneration phenomenon in the representation geometry underlying learned sheaf diffusion.
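The claim that direct-sum decompositions of the representation induce decompositions of the harmonic space can be checked by hand on a toy example. The sketch below is an illustration only, not the authors' code: the helper names (`coboundary`, `rank`, `num_global_sections`, `direct_sum`) and the two example sheaves on a triangle are assumptions made here. It builds the coboundary map $\delta$ with $(\delta x)_e = F_{v \trianglelefteq e} x_v - F_{u \trianglelefteq e} x_u$ for an edge $e = (u, v)$, and uses the standard fact that the kernel of the sheaf Laplacian $\delta^\top \delta$ equals $\ker \delta$, the space of global sections.

```python
from fractions import Fraction


def coboundary(num_vertices, d_v, d_e, edges):
    """Build delta as a list of rows.

    edges: list of (u, v, F_u, F_v), where F_u and F_v are d_e x d_v
    restriction maps (lists of rows) from the stalks at u and v to the
    stalk on the edge. Row block for the edge encodes F_v x_v - F_u x_u.
    """
    rows = []
    for u, v, F_u, F_v in edges:
        for r in range(d_e):
            row = [Fraction(0)] * (num_vertices * d_v)
            for c in range(d_v):
                row[u * d_v + c] -= Fraction(F_u[r][c])
                row[v * d_v + c] += Fraction(F_v[r][c])
            rows.append(row)
    return rows


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n_cols = len(m[0]) if m else 0
    rk = 0
    for col in range(n_cols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def num_global_sections(num_vertices, d_v, d_e, edges):
    """dim ker(delta), which is also the dimension of the harmonic space."""
    delta = coboundary(num_vertices, d_v, d_e, edges)
    return num_vertices * d_v - rank(delta)


def block_diag(A, B):
    """Block-diagonal matrix diag(A, B) as a list of rows."""
    top = [list(row) + [0] * len(B[0]) for row in A]
    bottom = [[0] * len(A[0]) + list(row) for row in B]
    return top + bottom


def direct_sum(edges_a, edges_b):
    """Direct sum of two sheaves given on the same edges in the same order."""
    return [
        (u, v, block_diag(Fu_a, Fu_b), block_diag(Fv_a, Fv_b))
        for (u, v, Fu_a, Fv_a), (_, _, Fu_b, Fv_b) in zip(edges_a, edges_b)
    ]


# Triangle graph with 1-dimensional stalks (hypothetical example data).
# Constant sheaf: every restriction map is the identity.
constant = [(0, 1, [[1]], [[1]]), (1, 2, [[1]], [[1]]), (2, 0, [[1]], [[1]])]
# Twisted sheaf: one sign flip, so transport around the cycle is -1.
twisted = [(0, 1, [[1]], [[1]]), (1, 2, [[1]], [[1]]), (2, 0, [[1]], [[-1]])]

print(num_global_sections(3, 1, 1, constant))                       # 1
print(num_global_sections(3, 1, 1, twisted))                        # 0
print(num_global_sections(3, 2, 2, direct_sum(constant, twisted)))  # 1
```

In this toy case the harmonic space of the direct sum has dimension 1 + 0 = 1, and the only surviving section lives in the constant summand, where it takes the same value at every node. That is one concrete reading of the degeneration picture in the abstract: if a learned sheaf drifts toward such a summand, the diffusion limit retains no node-discriminating signal.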

