Channel DropBlock: An Improved Regularization Method for Fine-Grained Visual Classification
This work addresses fine-grained visual classification, the task of distinguishing sub-categories within the same super-category, and proposes a lightweight regularization method as an improvement over existing approaches.
The paper tackles the problem of fine-grained visual classification by proposing Channel DropBlock, a regularization method that randomly masks correlated channels during training to enhance feature representations, achieving improved performance on three benchmark datasets.
Fine-grained visual classification (FGVC), i.e., classifying the sub-categories of an object from the same super-category (e.g., bird), relies heavily on mining multiple discriminative features. Existing approaches mainly tackle this problem by introducing attention mechanisms to locate discriminative parts or feature encoding approaches to extract highly parameterized features in a weakly-supervised fashion. In this work, we propose a lightweight yet effective regularization method named Channel DropBlock (CDB), in combination with two alternative correlation metrics, to address this problem. The key idea is to randomly mask out a group of correlated channels during training to break feature co-adaptation and thus enhance feature representations. Extensive experiments on three benchmark FGVC datasets show that CDB effectively improves performance.
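The key idea of masking a group of correlated channels can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the two correlation metrics, so the absolute Pearson correlation between flattened channel responses is used here as a stand-in assumption. The function name `channel_dropblock`, the `group_size` parameter, and the choice to pick a random seed channel and drop its most correlated neighbours are likewise hypothetical.

```python
import numpy as np


def channel_dropblock(features, group_size, rng, training=True):
    """Zero out one randomly chosen group of correlated channels.

    features:   array of shape (C, H, W) for a single image.
    group_size: number of channels to mask (hypothetical hyperparameter,
                assumed to be at most C).
    rng:        a numpy.random.Generator used to pick the seed channel.
    """
    if not training:
        # Like other dropout-style regularizers, do nothing at inference.
        return features

    num_channels = features.shape[0]
    flat = features.reshape(num_channels, -1)

    # Assumed correlation metric (not taken from the paper): absolute
    # Pearson correlation between flattened channel responses.
    # Constant channels yield NaN, which is mapped to zero correlation.
    corr = np.nan_to_num(np.abs(np.corrcoef(flat)))

    # Pick a random seed channel and make sure it ranks first in its group.
    seed = rng.integers(num_channels)
    scores = corr[seed].copy()
    scores[seed] = np.inf

    # The group is the seed plus the channels most correlated with it.
    group = np.argsort(scores)[-group_size:]

    mask = np.ones(num_channels, dtype=features.dtype)
    mask[group] = 0
    return features * mask[:, None, None]
```

Calling `channel_dropblock(x, 3, np.random.default_rng(0))` on an array `x` of shape `(8, 4, 4)` returns an array of the same shape with three channels set to zero. The sketch does not rescale the surviving channels; whether and how to do that is a design choice the abstract does not address.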