GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts
This work fills a gap for GDL researchers and domain practitioners by providing a benchmark for assessing model robustness under distribution shifts in scientific settings. Its contribution is incremental, however, since it benchmarks existing approaches rather than proposing new methods.
The authors address the lack of benchmarks for evaluating geometric deep learning (GDL) models under distribution shifts in scientific applications. They propose GeSS, a comprehensive benchmark that covers diverse scientific domains and shift types, yielding 30 experiment settings, each evaluated with 3 GDL backbones and 11 learning algorithms.
Geometric deep learning (GDL) has gained significant attention in scientific fields for its proficiency in modeling data with intricate geometric structures. However, very few works have examined its ability to handle distribution shift, a prevalent challenge in many applications. To bridge this gap, we propose GeSS, a comprehensive benchmark designed for evaluating the performance of GDL models in scientific scenarios with distribution shifts. Our evaluation datasets cover diverse scientific domains, from particle physics and materials science to biochemistry, and encompass a broad spectrum of distribution shifts, including conditional, covariate, and concept shifts. Furthermore, we study three levels of information access to the out-of-distribution (OOD) test data: no OOD information, only unlabeled OOD data, and OOD data with a few labels. Overall, our benchmark comprises 30 different experiment settings and evaluates 3 GDL backbones and 11 learning algorithms in each setting. We provide a thorough analysis of the evaluation results, intended to offer insights for GDL researchers and domain practitioners who plan to use GDL in their applications.
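For readers unfamiliar with the three shift types named in the abstract, the following is a sketch of how they are commonly formalized. The notation is an assumption introduced here for illustration (S for the source or training distribution, T for the target or OOD test distribution, X for inputs, Y for labels), and it follows common textbook usage; the benchmark's own definitions may differ in detail.

```latex
% S = source (training) distribution, T = target (OOD test) distribution.
% Assumed textbook-style definitions, not quoted from the benchmark.
\begin{align*}
\text{Covariate shift:}   &\quad P_S(X) \neq P_T(X), \quad P_S(Y \mid X) = P_T(Y \mid X) \\
\text{Conditional shift:} &\quad P_S(X \mid Y) \neq P_T(X \mid Y), \quad P_S(Y) = P_T(Y) \\
\text{Concept shift:}     &\quad P_S(Y \mid X) \neq P_T(Y \mid X)
\end{align*}
```

Under these assumed definitions, covariate shift changes only which inputs are seen, conditional shift changes how inputs look for a given label, and concept shift changes the input-to-label relationship itself.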