Localized Trajectories for 2D and 3D Action Recognition
This work addresses action recognition in scenes with noise and background motion, improving on Dense Trajectories by localizing motion trajectories around human body joints.
The paper tackles the problem of irrelevant motion trajectories in action recognition by proposing Localized Trajectories, which cluster trajectories around human body joints provided by RGB-D cameras and encode them with local Bag-of-Words. The resulting representation is more discriminative than Dense Trajectories, and the approach is evaluated in extensive experiments on five datasets.
The Dense Trajectories concept is one of the most successful approaches in action recognition, suitable for scenarios involving a significant amount of motion. However, due to noise and background motion, many generated trajectories are irrelevant to the actual human activity and can degrade performance. In this paper, we propose Localized Trajectories as an improved version of Dense Trajectories in which motion trajectories are clustered around human body joints provided by RGB-D cameras and then encoded by local Bag-of-Words. As a result, the Localized Trajectories concept provides a more discriminative representation of actions than Dense Trajectories. Moreover, we generalize Localized Trajectories to 3D by using the modalities offered by RGB-D cameras. One of the main advantages of using RGB-D data to generate trajectories is that the resulting trajectories include radial displacements that are perpendicular to the image plane. Extensive experiments and analysis are carried out on five different datasets.
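The pipeline described in the abstract (group trajectories around body joints, then encode each group with its own Bag-of-Words) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names, the nearest-joint rule with a distance threshold `radius`, the normalized-displacement descriptor, and the per-joint codebooks (assumed to be learned beforehand, for example with k-means) are all hypothetical. The arrays may hold 2D or 3D coordinates, so the same code covers the image-plane case and the RGB-D case with displacements along the depth axis.

```python
import numpy as np


def assign_to_joints(trajs, joints, radius):
    """Assign each trajectory to its nearest joint, or -1 if none is close.

    trajs:  (N, T, D) trajectory points over T frames (D = 2 or 3).
    joints: (J, T, D) joint positions over the same T frames.
    radius: hypothetical threshold on the mean trajectory-to-joint distance.
    """
    # Distance from every trajectory to every joint at each frame: (N, J, T).
    dists = np.linalg.norm(trajs[:, None, :, :] - joints[None, :, :, :], axis=-1)
    mean_d = dists.mean(axis=-1)                      # (N, J)
    nearest = mean_d.argmin(axis=1)                   # (N,)
    nearest_d = mean_d[np.arange(len(trajs)), nearest]
    # Trajectories far from every joint are treated as irrelevant motion.
    return np.where(nearest_d <= radius, nearest, -1)


def displacement_descriptor(trajs):
    """Frame-to-frame displacements, normalized by total path length."""
    disp = np.diff(trajs, axis=1)                     # (N, T-1, D)
    flat = disp.reshape(len(trajs), -1)               # (N, (T-1)*D)
    total = np.linalg.norm(disp, axis=-1).sum(axis=1, keepdims=True)
    return flat / np.maximum(total, 1e-8)             # guard static trajectories


def local_bow(trajs, joints, codebooks, radius):
    """Concatenate one normalized codeword histogram per joint.

    codebooks: list of J arrays, each (K_j, (T-1)*D), assumed precomputed.
    """
    labels = assign_to_joints(trajs, joints, radius)
    desc = displacement_descriptor(trajs)
    hists = []
    for j, codebook in enumerate(codebooks):
        hist = np.zeros(len(codebook))
        local = desc[labels == j]
        if len(local):
            # Hard-assign each descriptor to its nearest codeword.
            d = np.linalg.norm(local[:, None, :] - codebook[None, :, :], axis=-1)
            words = d.argmin(axis=1)
            hist = np.bincount(words, minlength=len(codebook)).astype(float)
            hist /= hist.sum()
        hists.append(hist)
    return np.concatenate(hists)
```

In this sketch, trajectories that fall outside `radius` of every joint contribute to no histogram, which mirrors the idea of suppressing background motion. The resulting feature vector has one block per joint, so a classifier can weight motion around, say, the hands differently from motion around the torso.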