Learning Class Regularized Features for Action Recognition
The paper addresses a real issue in action recognition: subtle differences between similar classes are modeled in the same way as large differences between dissimilar classes. The contribution is incremental, however, since it builds on existing CNN architectures.
The paper tackles class-agnostic feature extraction in CNNs for action recognition by introducing Class Regularization, which performs class-based regularization of layer activations and yields systematic improvements of 1.8%, 1.2%, and 1.4% on the Kinetics, UCF-101, and HMDB-51 datasets, respectively.
Training Deep Convolutional Neural Networks (CNNs) is based on the notion of using multiple kernels and non-linearities in their subsequent activations to extract useful features. The kernels are used as general feature extractors without specific correspondence to the target class. As a result, the extracted features do not correspond to specific classes, and subtle differences between similar classes are modeled in the same way as large differences between dissimilar classes. To overcome the class-agnostic use of kernels in CNNs, we introduce a novel method named Class Regularization that performs class-based regularization of layer activations. We demonstrate that this not only improves feature search during training, but also allows an explicit assignment of features per class during each stage of the feature extraction process. We show that using Class Regularization blocks in state-of-the-art CNN architectures for action recognition leads to systematic gains of 1.8%, 1.2%, and 1.4% on the Kinetics, UCF-101, and HMDB-51 datasets, respectively.
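The abstract does not spell out how a Class Regularization block works, so the following is only a minimal sketch of what "class-based regularization of layer activations" could look like, not the authors' implementation. Everything in it is an assumption made for illustration: the function name `class_regularize`, the `beta` parameter, the use of a per-class weight vector over channels, the choice of the top-scoring class, and the rescaling of channels into the range [1 - beta, 1 + beta].

```python
import numpy as np


def class_regularize(activations, class_weights, beta=0.1):
    """Hypothetical class-based regularization of one layer's activations.

    activations:   array of shape (C, T, H, W), one clip's feature volume.
    class_weights: array of shape (num_classes, C); assumed here to hold a
                   per-class weight vector over the C channels.
    beta:          assumed hyperparameter bounding the channel rescaling.
    """
    # Summarize each channel by its mean over time and space.
    pooled = activations.mean(axis=(1, 2, 3))            # shape (C,)

    # Score every class from the pooled channel descriptor.
    logits = class_weights @ pooled                       # shape (num_classes,)
    top_class = int(np.argmax(logits))

    # Map the top class's channel weights into [1 - beta, 1 + beta].
    w = class_weights[top_class]
    span = w.max() - w.min()
    if span == 0:
        # All channels weighted equally: leave the activations unchanged.
        scale = np.ones_like(w, dtype=float)
    else:
        scale = 1.0 - beta + 2.0 * beta * (w - w.min()) / span

    # Amplify channels the class relies on and attenuate the others.
    return activations * scale[:, None, None, None], top_class


# Usage with random data: 8 channels, 4 frames, 7x7 spatial grid, 5 classes.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 7, 7))
weights = rng.standard_normal((5, 8))
regularized, predicted = class_regularize(feats, weights)
```

In this sketch the output keeps the shape of the input, so such a block could in principle sit between existing layers of a CNN without changing the surrounding architecture, which matches the abstract's description of adding blocks to existing networks.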