3D Human Action Recognition with Siamese-LSTM Based Deep Metric Learning
This work addresses action recognition for computer vision applications, but it appears incremental as it builds on existing deep learning and metric learning approaches.
The paper tackles 3D human action recognition by proposing a two-phase system using Siamese-LSTM networks for deep metric learning and a multiclass classification module, with initial results reported as promising on standard and new datasets.
This paper proposes a new 3D human action recognition system organized in two phases: (1) a Deep Metric Learning Module, which learns a similarity metric between two 3D joint sequences using Siamese-LSTM networks; and (2) a Multiclass Classification Module, which uses the output of the first module to produce the final recognition output. This model has several advantages. The first module is trained on a larger set of data because it uses many combinations of sequence pairs. Our deep metric learning module can also be trained independently of the datasets, which makes our system modular and generalizable. We tested the proposed system on standard and newly introduced datasets, and the initial results are promising. We will continue developing this system by adding more sophisticated LSTM blocks and by cross-training between different datasets.
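
To illustrate why pairing enlarges the training set for the metric learning module, the following minimal sketch enumerates every unordered pair of sequences and marks each pair as similar (same action) or dissimilar. The helper name `make_pairs` and the toy labels are assumptions made for illustration; the paper does not specify how pairs are sampled, so this may differ from the authors' procedure.

```python
from itertools import combinations


def make_pairs(labels):
    """Return (i, j, same_class) for every unordered pair of sequence indices.

    `labels` is a list with one action label per 3D joint sequence.
    Hypothetical helper, not taken from the paper.
    """
    indices = range(len(labels))
    return [(i, j, labels[i] == labels[j]) for i, j in combinations(indices, 2)]


# Hypothetical toy dataset: 6 sequences covering 3 action classes.
labels = ["wave", "wave", "kick", "kick", "jump", "jump"]
pairs = make_pairs(labels)
n_similar = sum(1 for _, _, same in pairs if same)

print(f"{len(labels)} sequences -> {len(pairs)} pairs ({n_similar} similar)")
```

With N sequences, exhaustive pairing yields N(N-1)/2 training pairs, so the number of pair examples grows quadratically while the number of labeled sequences grows only linearly. In practice the dissimilar pairs far outnumber the similar ones, so some form of balancing would probably be needed.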