Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures
This work addresses the need for scalable, tailored learning algorithms for 3D protein data, a need that is central to biological research, although its contribution appears to be an incremental improvement over existing methods.
The paper tackles the problem of processing 3D protein structures by proposing new convolution and pooling operations that consider both intrinsic and extrinsic structural aspects, achieving state-of-the-art performance on large-scale datasets for protein analysis tasks.
Proteins perform a large variety of functions in living organisms and thus play a key role in biology. Currently available learning algorithms for protein data do not account for several particularities of such data, do not scale well to large protein conformations, or both. To fill this gap, we propose two new learning operations that enable deep 3D analysis of large-scale protein data. First, we introduce a novel convolution operator that considers both the intrinsic (invariant under protein folding) and the extrinsic (invariant under bonding) structure, by using $n$-D convolutions defined on both the Euclidean distance and multiple geodesic distances between atoms in a multi-graph. Second, we enable multi-scale protein analysis by introducing hierarchical pooling operators, exploiting the fact that proteins are a recombination of a finite set of amino acids, which can be pooled using shared pooling matrices. Lastly, we evaluate the accuracy of our algorithms on several large-scale datasets for common protein analysis tasks, where we outperform state-of-the-art methods.
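To make the distinction between extrinsic and intrinsic distances concrete, the following minimal sketch computes, for each atom near a chosen center, the Euclidean distance together with one geodesic distance per bond graph. It is an illustration under stated assumptions, not the paper's implementation: the geodesic distance is assumed here to be the number of bonds on the shortest path, the multi-graph is represented as a list of edge lists, and the names `hop_distances` and `kernel_inputs`, the toy coordinates, and the choice of a second graph with one extra bond are all hypothetical.

```python
import math
from collections import deque


def hop_distances(num_atoms, bonds, source):
    """Shortest-path length in bonds from `source` to every atom (BFS).

    Assumption: geodesic distance is measured as a hop count.
    Atoms that cannot be reached keep the value math.inf.
    """
    adjacency = [[] for _ in range(num_atoms)]
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)

    dist = [math.inf] * num_atoms
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if dist[v] == math.inf:  # not visited yet
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def kernel_inputs(coords, bond_graphs, center, radius):
    """Collect the distances a kernel could be evaluated on.

    For every atom within Euclidean `radius` of `center`, return a tuple
    (atom index, Euclidean distance, geodesic distance in each bond graph).
    """
    per_graph = [hop_distances(len(coords), bonds, center) for bonds in bond_graphs]
    rows = []
    for j, position in enumerate(coords):
        euclidean = math.dist(coords[center], position)
        if euclidean <= radius:
            rows.append((j, euclidean, tuple(g[j] for g in per_graph)))
    return rows


# Toy example: a 4-atom chain folded into a square, so the chain ends
# (atoms 0 and 3) are close in space but far apart along the chain.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
covalent = [(0, 1), (1, 2), (2, 3)]
# Hypothetical second graph: the same bonds plus one extra bond between the ends.
covalent_plus_extra = covalent + [(0, 3)]

for row in kernel_inputs(coords, [covalent, covalent_plus_extra], center=0, radius=1.5):
    print(row)
```

In this toy input, atom 3 lies at Euclidean distance 1.0 from atom 0 but is 3 hops away in the first graph and 1 hop away in the second. Refolding the chain would change the Euclidean value and leave the hop counts untouched, while adding or removing a bond does the opposite, which is the sense in which the two kinds of distance carry complementary information.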