Polyatomic Complexes: A topologically-informed learning representation for atomistic systems
This work addresses the problem of representing chemical structures for machine learning models in computational chemistry. The contribution appears incremental, since the reported performance matches rather than surpasses existing methods.
The authors address the challenge of developing robust chemical structure representations by introducing a topologically-informed learning representation for atomistic systems. They prove that it satisfies structural, geometric, efficiency, and generalizability constraints, and they report performance comparable to state-of-the-art methods on numerous tasks.
Developing robust representations of chemical structures that enable models to learn topological inductive biases is challenging. In this manuscript, we present a representation of atomistic systems. We begin by proving that our representation satisfies all structural, geometric, efficiency, and generalizability constraints. We then provide a general algorithm to encode any atomistic system. Finally, we report performance comparable to state-of-the-art methods on numerous tasks. We open-source all code and datasets at https://github.com/rahulkhorana/PolyatomicComplexes.