LGAIDec 17, 2025

How Smoothing is N-simplicial Attention?

arXiv:2512.15600v1h-index: 1
Originality Incremental advance
AI Analysis

This work addresses a computational bottleneck in graph-based models for machine learning researchers, but it is incremental as it builds on existing attention paradigms.

The paper tackles the problem of extending attention mechanisms from pairwise to higher-order interactions by introducing N-simplicial attention with cost-effective simplex selection, and finds that it suffers from over-smoothing despite enabling more complex message-passing.

Going from pure Multilayer Perceptron (MLP) to a learnable graph message-passing mechanism at each layer has been foundational to state-of-the-art results, despite the computational trade-off (e.g. GATs or Transformers). To go a step further, in this work, we introduce N-simplicial attention, going from pairwise token similarity to higher-order interactions, and adapt it for Rotary Position Embeddings (RoPE). To help manage the increased complexity, we propose a cost-effective simplex selection enabling the model to focus its computation load onto the more task-sensitive interactions. Beyond these core mechanisms, we study how smoothing N-simplicial attention is by deriving a Lipschitz upper-bound and by demonstrating that by itself it also suffers from over-smoothing, despite opening the attention message-passing to higher-order interactions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes