Todyformer: Towards Holistic Dynamic Graph Transformers with Structure-Aware Tokenization
Todyformer addresses performance bottlenecks in dynamic graph modeling for researchers and practitioners, though the contribution appears incremental, since it builds on existing Transformer and MPNN techniques.
The authors tackle limitations of Temporal Graph Neural Networks, such as over-squashing and over-smoothing, by introducing Todyformer, a Transformer-based model for dynamic graphs that unifies local and global encoding and consistently outperforms state-of-the-art methods on benchmark datasets.
Temporal Graph Neural Networks have garnered substantial attention for their capacity to model evolving structural and temporal patterns while exhibiting impressive performance. However, these architectures are known to suffer from issues that constrain their performance, such as over-squashing and over-smoothing. Meanwhile, Transformers have demonstrated exceptional computational capacity to address challenges related to long-range dependencies. Consequently, we introduce Todyformer, a novel Transformer-based neural network tailored for dynamic graphs. It unifies the local encoding capacity of Message-Passing Neural Networks (MPNNs) with the global encoding of Transformers through i) a novel patchifying paradigm for dynamic graphs to alleviate over-squashing, ii) a structure-aware parametric tokenization strategy leveraging MPNNs, iii) a Transformer with temporal positional encoding to capture long-range dependencies, and iv) an encoding architecture that alternates between local and global contextualization, mitigating over-smoothing in MPNNs. Experimental evaluations on public benchmark datasets demonstrate that Todyformer consistently outperforms state-of-the-art methods on downstream tasks. Furthermore, we illustrate how the proposed model captures extensive temporal dependencies in dynamic graphs.
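To make components i) and iii) concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that a patch is a contiguous, near-equal chunk of the time-ordered edge stream and that the temporal positional encoding is the standard sinusoidal encoding applied to the patch index; the abstract specifies neither detail. The names `Edge`, `patchify`, and `temporal_encoding` are hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical edge representation: (source node, destination node, timestamp).
Edge = Tuple[int, int, float]


def patchify(edges: List[Edge], num_patches: int) -> List[List[Edge]]:
    """Split an edge stream into contiguous, near-equal patches by time.

    Assumption: patches are chunks of the timestamp-sorted edge list.
    The abstract does not state the actual partitioning rule.
    """
    if num_patches < 1:
        raise ValueError("num_patches must be at least 1")
    ordered = sorted(edges, key=lambda e: e[2])  # sort by timestamp
    base, extra = divmod(len(ordered), num_patches)
    patches: List[List[Edge]] = []
    start = 0
    for i in range(num_patches):
        # The first `extra` patches take one additional edge each.
        size = base + (1 if i < extra else 0)
        patches.append(ordered[start:start + size])
        start += size
    return patches


def temporal_encoding(position: int, dim: int) -> List[float]:
    """Standard sinusoidal positional encoding of a patch index.

    Assumption: the patch index plays the role of the token position.
    """
    if dim < 2 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    encoding: List[float] = []
    for k in range(0, dim, 2):
        angle = position / (10000 ** (k / dim))
        encoding.extend([math.sin(angle), math.cos(angle)])
    return encoding


if __name__ == "__main__":
    stream = [(0, 1, 3.0), (1, 2, 1.0), (2, 3, 2.0), (0, 3, 5.0), (1, 3, 4.0)]
    for index, patch in enumerate(patchify(stream, num_patches=2)):
        print(index, patch, temporal_encoding(index, dim=4))
```

In a full model, each patch would presumably be encoded locally by an MPNN to produce node tokens, and a Transformer would then attend over those tokens across patches, with the two stages alternating as in component iv). That pipeline is omitted here because the abstract gives no further detail.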