LG · May 18

Function graph transformers universally approximate operators between function spaces

arXiv:2605.17968 · 14.6
Predicted impact: top 32% in LG · last 90 days · Originality: Highly original
AI Analysis

Provides a theoretical foundation for transformer-based operator learning, covering discretization invariance and broad classes of operators, with relevance to scientific computing and machine learning.

The paper introduces function graph transformers, a subclass of transformers that universally approximate nonlinear operators between function spaces, handling discretization invariance and negative-order Sobolev inputs. The approach lifts functions to graph measures and proves universal approximation via finite compositions of standard self-attention and MLPs.

We study the approximation of nonlinear operators between function spaces by transformers. Our approach is to lift functions to measures supported on their graphs and leverage a recently introduced measure-theoretic view of transformers. A function $h$ is represented by its graph measure $\gamma_h$, with finite token sets $\{(x_j,h(x_j))\}_{j=1}^N$ serving as its empirical approximations. We show that this framework models discretization refinement via convergence of measures and provides a natural setting for operator learning. Within this framework, we introduce function graph transformers, a graph-preserving subclass of measure-theoretic transformers that maps graph measures to graph measures, so that outputs remain single-valued functions. Crucially, this additional structure does not reduce generality: we prove that the resulting graph-preserving maps can be approximated by finite compositions of standard softmax self-attention layers and pointwise MLPs, yielding universal approximation results for broad classes of nonlinear operators. Unlike existing theoretical approaches to operator learning with transformers, the measure-theoretic framework also accommodates regularized negative-order Sobolev inputs, for which discretization invariance is particularly challenging, as well as query points on different output domains. Overall, function graph transformers provide a continuum viewpoint and mathematical toolkit for transformer-based operator learning, clarifying the roles of positional encodings, graph structure, and regularization, and ensuring consistency across discretizations.
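The tokenization and refinement idea in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a one-dimensional domain $[0,1]$, the test function $h = \sin$, uniformly spaced sample points, and identity projection matrices as placeholders for learned weights. The names `graph_tokens` and `attention_readout` are hypothetical. It builds the tokens $(x_j, h(x_j))$ and evaluates one softmax attention readout at a fixed query for increasing $N$.

```python
import numpy as np

def graph_tokens(h, xs):
    """Sample the graph of h: return an (N, 2) array of tokens (x_j, h(x_j))."""
    xs = np.asarray(xs, dtype=float)
    return np.stack([xs, h(xs)], axis=1)

def attention_readout(query, tokens, Wq, Wk, Wv):
    """Softmax attention of a single query token against all tokens.

    The softmax weights are nonnegative and sum to 1, so the output is a
    weighted average of the value vectors over the token set, i.e. an
    integral against a reweighted empirical measure.
    """
    q = Wq @ query                 # projected query, shape (d,)
    k = tokens @ Wk.T              # projected keys, shape (N, d)
    v = tokens @ Wv.T              # projected values, shape (N, d)
    scores = k @ q                 # one score per token, shape (N,)
    scores = scores - scores.max() # stabilise the exponentials
    w = np.exp(scores)
    w = w / w.sum()
    return w @ v

# Illustrative assumptions: h = sin on [0, 1], identity projections.
h = np.sin
Wq = Wk = Wv = np.eye(2)
query = np.array([0.5, h(0.5)])

# Refine the discretization and record the readout at the same query.
outputs = {
    N: attention_readout(query, graph_tokens(h, np.linspace(0.0, 1.0, N)), Wq, Wk, Wv)
    for N in (50, 500, 5000)
}

gap_coarse = np.linalg.norm(outputs[500] - outputs[50])
gap_fine = np.linalg.norm(outputs[5000] - outputs[500])
print(gap_coarse, gap_fine)
```

Because the attention output is a normalized average over tokens, refining the grid changes it less and less, which is the finite-sample analogue of the convergence of empirical graph measures that the abstract describes. The sketch covers only smooth inputs on a fixed grid family; it says nothing about the negative-order Sobolev case.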
