3D Convolutional Networks for Action Recognition: Application to Sport Gesture Recognition
This work addresses action recognition in sports videos, but it appears incremental: it applies an existing 3D convnet method to a specific domain without major innovations.
The paper tackles the classification of continuous video sequences containing repeatable actions, specifically table tennis strokes. It uses 3D convolutional networks with window-based approaches to segment and classify videos recorded in a markerless environment, and presents 3D convnets as an efficient tool for both tasks.
3D convolutional networks are a good means of performing tasks such as segmenting video into coherent spatio-temporal chunks and classifying those chunks with regard to a target taxonomy. In this chapter we are interested in the classification of continuous video takes with repeatable actions, such as table tennis strokes. Filmed in a markerless, ecological environment, these videos represent a challenge from both the segmentation and the classification points of view. 3D convnets are an efficient tool for solving these problems with window-based approaches.
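
As a rough illustration of what a window-based approach can look like, the sketch below slides a fixed-length temporal window over a frame sequence, classifies each window, and merges consecutive windows with the same label into segments. It is a minimal sketch under stated assumptions, not the chapter's method: the window length, the stride, the merging rule, the label names, and the function names `sliding_windows`, `segment_video`, and `toy_classifier` are all hypothetical, and `toy_classifier` merely stands in for a trained 3D convnet that would score a clip of shape (frames, height, width, channels).

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[int, int, str]  # (start frame, end frame exclusive, label)


def sliding_windows(num_frames: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of every full window; end is exclusive."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [(s, s + window) for s in range(0, num_frames - window + 1, stride)]


def segment_video(
    frames: Sequence,
    classify_clip: Callable[[Sequence], str],
    window: int,
    stride: int,
) -> List[Segment]:
    """Classify each window, then merge consecutive windows that share a label."""
    segments: List[Segment] = []
    for start, end in sliding_windows(len(frames), window, stride):
        label = classify_clip(frames[start:end])
        if segments and segments[-1][2] == label:
            # Same label as the previous window: extend that segment.
            prev_start, _, _ = segments[-1]
            segments[-1] = (prev_start, end, label)
        else:
            segments.append((start, end, label))
    return segments


def toy_classifier(clip: Sequence[float]) -> str:
    """Hypothetical stand-in for a 3D convnet: thresholds the mean 'motion' value."""
    return "stroke" if sum(clip) / len(clip) > 0.5 else "no_stroke"


if __name__ == "__main__":
    # Each number stands in for one frame; 1 marks frames inside a stroke.
    frames = [0] * 8 + [1] * 8 + [0] * 8
    print(segment_video(frames, toy_classifier, window=4, stride=4))
```

With a stride smaller than the window, neighbouring windows overlap, so segments with different labels can overlap as well; a real pipeline would need an explicit rule (for example, voting per frame) to resolve those boundaries.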