KShapeNet: Riemannian network on Kendall shape space for Skeleton based Action Recognition
This work provides an incremental improvement in skeleton-based action recognition for computer vision researchers by leveraging geometric properties of skeleton data.
The paper addresses skeleton-based action recognition by modeling skeleton sequences as trajectories on Kendall's shape space and then mapping them to a linear tangent space. This representation is fed to a deep learning architecture consisting of a layer that optimizes over rigid and non-rigid transformations, followed by a CNN-LSTM network. The approach outperforms existing geometric deep learning methods and is competitive with recently published approaches on the NTU-RGB+D and NTU-RGB+D 120 datasets.
Deep learning architectures, albeit successful in most computer vision tasks, were designed for data with an underlying Euclidean structure, an assumption that is often not fulfilled because pre-processed data may lie on a non-linear space. In this paper, we propose a geometry-aware deep learning approach for skeleton-based action recognition. Skeleton sequences are first modeled as trajectories on Kendall's shape space and then mapped to the linear tangent space. The resulting structured data are then fed to a deep learning architecture, which includes a layer that optimizes over rigid and non-rigid transformations of the 3D skeletons, followed by a CNN-LSTM network. The assessment on two large-scale skeleton datasets, namely NTU-RGB+D and NTU-RGB+D 120, shows that the proposed approach outperforms existing geometric deep learning methods and is competitive with recently published approaches.
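To make the first two steps concrete, the following is a minimal NumPy sketch of one standard way to place skeletons on Kendall's pre-shape sphere and map them to a tangent space. It is an illustration under stated assumptions, not the paper's implementation: the function names (`to_preshape`, `align_rotation`, `log_map`, `sequence_to_tangent`), the choice of reference shape, and the use of Procrustes alignment followed by the sphere logarithm map are all assumptions made here for the example.

```python
import numpy as np


def to_preshape(x):
    """Remove translation and scale from a (joints, 3) skeleton.

    The result is centered and has unit Frobenius norm, i.e. it lies
    on the pre-shape sphere.
    """
    centered = x - x.mean(axis=0, keepdims=True)
    return centered / np.linalg.norm(centered)


def align_rotation(p, q):
    """Rotate pre-shape q to best match pre-shape p (Procrustes, det = +1)."""
    u, _, vt = np.linalg.svd(q.T @ p)
    # Flip the last singular direction if needed so that the result is a
    # proper rotation rather than a reflection.
    u[:, -1] *= np.sign(np.linalg.det(u @ vt))
    return q @ (u @ vt)


def log_map(p, q, eps=1e-12):
    """Sphere logarithm map: tangent vector at p pointing toward q."""
    inner = np.clip(np.sum(p * q), -1.0, 1.0)
    theta = np.arccos(inner)  # geodesic distance between p and q
    if theta < eps:
        return np.zeros_like(p)
    return theta / np.sin(theta) * (q - inner * p)


def sequence_to_tangent(seq, ref):
    """Map a (frames, joints, 3) sequence to the tangent space at ref.

    ref is assumed to already be a pre-shape (hypothetical reference
    choice; any fixed pre-shape works for this sketch).
    """
    out = np.empty_like(seq, dtype=float)
    for t, frame in enumerate(seq):
        q = align_rotation(ref, to_preshape(frame))
        out[t] = log_map(ref, q)
    return out
```

Under these assumptions, each output frame is orthogonal to the reference shape and is unchanged when the input skeleton is translated, uniformly scaled, or rotated, so the resulting array can be treated as ordinary Euclidean input by a downstream network.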