LG · AI · Jan 23

E2Former-V2: On-the-Fly Equivariant Attention with Linear Activation Memory

arXiv:2601.16622v1 · 2 citations · h-index: 5 · Has Code
Originality: Highly original
AI Analysis

This work targets the computational cost of modeling 3D atomistic systems, a concern for researchers and practitioners in computational chemistry and materials science. It is an incremental improvement built on novel optimizations.

The paper tackles the scalability bottleneck in Equivariant Graph Neural Networks for 3D atomistic systems by introducing E2Former-V2, which combines algebraic sparsity with hardware-aware execution. Its fused attention kernel achieves a 20× improvement in TFLOPS over standard implementations, and the model maintains comparable predictive performance on the SPICE and OMol25 datasets.
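One well-known source of algebraic sparsity in equivariant networks is axis alignment: once an edge direction is rotated onto the z-axis, its spherical harmonics vanish for every order except m = 0. The following is a minimal NumPy sketch of that fact for degree l = 2, not the paper's method. The helper names `rotation_to_z` and `real_sh_l2` are hypothetical, and normalization constants of the harmonics are omitted.

```python
import numpy as np

def rotation_to_z(r):
    """Return a rotation matrix R such that R @ (r / |r|) = (0, 0, 1)."""
    u = r / np.linalg.norm(r)
    z = np.array([0.0, 0.0, 1.0])
    c = u @ z                      # cosine of the angle between u and z
    if np.isclose(c, -1.0):
        # u points along -z: rotate by 180 degrees about the x-axis.
        return np.diag([1.0, -1.0, -1.0])
    v = np.cross(u, z)             # rotation axis scaled by sin(angle)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues formula for the rotation taking u onto z.
    return np.eye(3) + vx + vx @ vx / (1.0 + c)

def real_sh_l2(u):
    """Unnormalized real l = 2 harmonics of a unit vector, ordered m = -2..2."""
    x, y, z = u
    return np.array([x * y, y * z, 3.0 * z * z - 1.0, x * z, x * x - y * y])

# Usage: a generic edge direction has five nonzero l = 2 components,
# while the axis-aligned direction keeps only the m = 0 entry.
edge = np.array([0.3, -1.2, 0.8])
unit = edge / np.linalg.norm(edge)
aligned = rotation_to_z(edge) @ unit
print(real_sh_l2(unit))
print(np.round(real_sh_l2(aligned), 12))
```

Because only the m = 0 component survives in the aligned frame, a tensor product against the edge harmonics couples far fewer index pairs than in the generic frame, which is the kind of saving an SO(3) to SO(2) change of basis exploits.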

Equivariant Graph Neural Networks (EGNNs) have become a widely used approach for modeling 3D atomistic systems. However, mainstream architectures face critical scalability bottlenecks due to the explicit construction of geometric features or dense tensor products on every edge. To overcome this, we introduce E2Former-V2, a scalable architecture that integrates algebraic sparsity with hardware-aware execution. We first propose Equivariant Axis-Aligned Sparsification (EAAS). EAAS builds on Wigner-$6j$ convolution by exploiting an $\mathrm{SO}(3) \rightarrow \mathrm{SO}(2)$ change of basis to transform computationally expensive dense tensor contractions into efficient, sparse parity re-indexing operations. Building on this representation, we introduce On-the-Fly Equivariant Attention, a fully node-centric mechanism implemented via a custom fused Triton kernel. By eliminating materialized edge tensors and maximizing SRAM utilization, our kernel achieves a 20× improvement in TFLOPS compared to standard implementations. Extensive experiments on the SPICE and OMol25 datasets demonstrate that E2Former-V2 maintains comparable predictive performance while notably accelerating inference. This work demonstrates that large equivariant transformers can be trained efficiently using widely accessible GPU platforms. The code is available at https://github.com/IQuestLab/UBio-MolFM/tree/e2formerv2.
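The idea of avoiding materialized edge tensors can be illustrated with a simplified sketch. It assumes plain scalar (non-equivariant) features and NumPy rather than a fused Triton kernel, so it is not the paper's implementation. The function names `materialized_edge_attention` and `on_the_fly_attention` and the `block` parameter are hypothetical. The first function gathers `[E, d]` edge tensors; the second streams over each node's neighbors in small blocks with a running softmax, keeping only per-node state.

```python
import numpy as np

def materialized_edge_attention(q, k, v, dst, src):
    """Reference: gathers [E, d] key/value tensors for all edges src -> dst."""
    n, d = q.shape
    k_e, v_e = k[src], v[src]                          # edge tensors, O(E * d) memory
    scores = np.einsum("ed,ed->e", q[dst], k_e) / np.sqrt(d)
    smax = np.full(n, -np.inf)
    np.maximum.at(smax, dst, scores)                   # per-node max for stability
    w = np.exp(scores - smax[dst])
    denom = np.zeros(n)
    np.add.at(denom, dst, w)                           # per-node softmax normalizer
    out = np.zeros_like(v)
    np.add.at(out, dst, (w / denom[dst])[:, None] * v_e)
    return out

def on_the_fly_attention(q, k, v, neighbors, block=4):
    """Node-centric: scores are computed per neighbor block and discarded."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        m, s, acc = -np.inf, 0.0, np.zeros(d)          # running max, sum, output
        nbrs = neighbors[i]
        for start in range(0, len(nbrs), block):
            j = nbrs[start:start + block]
            scores = k[j] @ q[i] / np.sqrt(d)
            m_new = max(m, scores.max())
            scale = np.exp(m - m_new)                  # rescales earlier blocks; 0 on the first
            p = np.exp(scores - m_new)
            s = s * scale + p.sum()
            acc = acc * scale + p @ v[j]
            m = m_new
        if s > 0.0:
            out[i] = acc / s
    return out

# Usage on a small fully connected graph without self-loops.
rng = np.random.default_rng(0)
n, d = 6, 8
q, k, v = rng.normal(size=(3, n, d))
neighbors = [np.array([j for j in range(n) if j != i]) for i in range(n)]
dst = np.concatenate([np.full(len(nb), i) for i, nb in enumerate(neighbors)])
src = np.concatenate(neighbors)
print(np.allclose(materialized_edge_attention(q, k, v, dst, src),
                  on_the_fly_attention(q, k, v, neighbors, block=2)))
```

Both functions compute the same softmax-weighted sum over neighbors. The streaming version holds at most `block` scores at a time, so its extra memory does not grow with the number of edges.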

