LG, AI, CV | Feb 24, 2025

Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

arXiv:2502.17019v2 | 20 citations | h-index: 10 | ICML
Originality: Highly original
AI Analysis

Erwin addresses computational bottlenecks faced by researchers and practitioners in physics and engineering who work with large-scale systems such as cosmology and fluid dynamics. It is a novel method rather than an incremental improvement.

The paper tackles the scalability challenge of deep learning for large-scale physical systems on irregular grids by introducing Erwin, a hierarchical transformer that combines tree-based algorithms with attention mechanisms, achieving linear-time attention and outperforming baseline methods in accuracy and computational efficiency across multiple domains.

Large-scale physical systems defined on irregular grids pose significant scalability challenges for deep learning methods, especially in the presence of long-range interactions and multi-scale coupling. Traditional approaches that compute all pairwise interactions, such as attention, become computationally prohibitive as they scale quadratically with the number of nodes. We present Erwin, a hierarchical transformer inspired by methods from computational many-body physics, which combines the efficiency of tree-based algorithms with the expressivity of attention mechanisms. Erwin employs ball tree partitioning to organize computation, which enables linear-time attention by processing nodes in parallel within local neighborhoods of fixed size. Through progressive coarsening and refinement of the ball tree structure, complemented by a novel cross-ball interaction mechanism, it captures both fine-grained local details and global features. We demonstrate Erwin's effectiveness across multiple domains, including cosmology, molecular dynamics, PDE solving, and particle fluid dynamics, where it consistently outperforms baseline methods both in accuracy and computational efficiency.
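
The idea of attention restricted to fixed-size balls can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the functions `ball_tree_order` and `ball_attention` are hypothetical names, the tree is built with a simple median split along the widest axis (a stand-in for a real ball tree construction), queries, keys, and values are the raw features with no learned projections, and the number of points is assumed to equal the ball size times a power of two. Coarsening, refinement, and cross-ball interaction are omitted.

```python
import numpy as np


def ball_tree_order(points, ball_size):
    """Return a permutation that places spatially close points in
    contiguous leaves of exactly `ball_size` points each.

    Assumption: a median split along the widest axis stands in for a
    proper ball tree construction.
    """
    n = len(points)
    ratio = n // ball_size
    if n % ball_size or ratio & (ratio - 1):
        raise ValueError("need len(points) == ball_size * 2**k")

    def split(idx):
        if len(idx) <= ball_size:
            return [idx]
        pts = points[idx]
        # Split along the coordinate with the largest spread.
        axis = np.argmax(pts.max(axis=0) - pts.min(axis=0))
        order = idx[np.argsort(pts[:, axis], kind="stable")]
        half = len(order) // 2
        return split(order[:half]) + split(order[half:])

    return np.concatenate(split(np.arange(n)))


def ball_attention(features, order, ball_size):
    """Softmax self-attention computed independently inside each leaf ball.

    Cost is O(N * ball_size * d): linear in N for a fixed ball size.
    """
    n, d = features.shape
    # Group features by ball: (num_balls, ball_size, d).
    x = features[order].reshape(n // ball_size, ball_size, d)
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    out = (weights @ x).reshape(n, d)
    # Scatter results back to the original point ordering.
    result = np.empty_like(out)
    result[order] = out
    return result
```

Because every ball holds the same number of points, the per-ball attention is a single batched matrix product, which is what makes the parallel, linear-time evaluation described above possible. Information exchange between balls is left out of this sketch.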

Code Implementations: 2 repos