A Cosmic-Scale Benchmark for Symmetry-Preserving Data Processing
This work addresses efficient, symmetry-preserving processing of point cloud data, a problem relevant to researchers in fields such as graphics and atomistic modeling. Its contribution is incremental in that it benchmarks existing methods on new data.
The paper tackles the challenge of processing structured point cloud data while preserving multiscale information by benchmarking graph neural networks on a simulated galaxy dataset. It shows that Euclidean symmetry-preserving networks outperform non-equivariant counterparts and domain-specific techniques in downstream performance and simulation-efficiency, but that they capture information from long-range correlations less effectively than domain-specific baselines.
Efficiently processing structured point cloud data while preserving multiscale information is a key challenge across domains, from graphics to atomistic modeling. Using a curated dataset of simulated galaxy positions and properties, represented as point clouds, we benchmark the ability of graph neural networks to simultaneously capture local clustering environments and long-range correlations. Given the homogeneous and isotropic nature of the Universe, the data exhibits a high degree of symmetry. We therefore focus on evaluating the performance of Euclidean symmetry-preserving ($E(3)$-equivariant) graph neural networks, showing that they can outperform non-equivariant counterparts and domain-specific information extraction techniques in downstream performance as well as simulation-efficiency. However, we find that current architectures fail to capture information from long-range correlations as effectively as domain-specific baselines, motivating future work on architectures better suited for extracting long-range information.
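To make the symmetry argument concrete, the following is a minimal sketch of turning a point cloud into a graph and checking that a simple edge feature is unchanged by a rigid motion. It is not the pipeline used in this work: the k-nearest-neighbour construction, the helper names (`knn_edges`, `edge_lengths`, `rotate_z_then_shift`), the uniform random points standing in for galaxy positions, and the choice of k = 5 are all assumptions made for illustration. It also tests only a rotation about one axis plus a translation, which is a subset of $E(3)$ (reflections are omitted), and it ignores any periodic boundary conditions a simulation box might have.

```python
import math
import random


def knn_edges(points, k):
    """Return directed edges (i, j) linking each point i to its k nearest neighbours."""
    edges = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to point i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        edges.extend((i, j) for j in others[:k])
    return edges


def edge_lengths(points, edges):
    """Edge lengths depend only on relative positions, so rigid motions leave them unchanged."""
    return [math.dist(points[i], points[j]) for i, j in edges]


def rotate_z_then_shift(points, angle, shift):
    """Rotate every point about the z-axis, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


# Hypothetical stand-in for galaxy positions: 50 uniform random points in a box.
random.seed(0)
points = [tuple(random.uniform(0.0, 100.0) for _ in range(3)) for _ in range(50)]
edges = knn_edges(points, k=5)

# Apply a rigid motion and compare edge lengths on the same set of edges.
moved = rotate_z_then_shift(points, angle=0.7, shift=(10.0, -3.0, 42.0))
max_diff = max(
    abs(a - b)
    for a, b in zip(edge_lengths(points, edges), edge_lengths(moved, edges))
)
print(f"edges: {len(edges)}, max change in edge length: {max_diff:.2e}")
```

The change in edge length should be at the level of floating-point round-off. A network whose messages depend on such invariant quantities does not have to learn this symmetry from data, which is one plausible reading of the simulation-efficiency advantage reported above. A fixed small neighbourhood also links only nearby points, which hints at why long-range correlations may be harder to capture with purely local message passing.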