Mining Discriminative Triplets of Patches for Fine-Grained Classification
This paper addresses the problem of distinguishing similar sub-categories in fine-grained classification for computer vision applications, and represents an incremental improvement over prior work.
The paper tackles the challenge of accurately localizing discriminative regions for fine-grained classification by introducing a patch-based framework with geometrically-constrained triplets, achieving state-of-the-art or comparable performance on four public datasets.
Fine-grained classification involves distinguishing between similar sub-categories based on subtle differences in highly localized regions; therefore, accurate localization of discriminative regions remains a major challenge. We describe a patch-based framework to address this problem. We introduce triplets of patches with geometric constraints to improve the accuracy of patch localization, and automatically mine discriminative geometrically-constrained triplets for classification. The resulting approach only requires object bounding boxes. Its effectiveness is demonstrated using four publicly available fine-grained datasets, on which it outperforms or achieves comparable performance to the state-of-the-art in classification.
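To illustrate what a geometric constraint on a triplet of patches could look like, the following is a minimal sketch and not the paper's implementation. The abstract does not spell out the constraints, so the sketch assumes two plausible ones: the three patch centers must keep the same rotational order as a reference triplet, and the triangle they form must have similar interior angles. The function names and the 30-degree tolerance are hypothetical choices made for this example.

```python
import math

def orientation(a, b, c):
    """Sign of the signed area of triangle abc: +1, -1, or 0 if degenerate."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)

def interior_angles(a, b, c):
    """Interior angles (radians) at vertices a, b, c of a non-degenerate triangle."""
    def angle_at(p, q, r):
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        norm = math.hypot(*v1) * math.hypot(*v2)
        # Clamp to guard against floating-point drift outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, dot / norm)))
    return (angle_at(a, b, c), angle_at(b, a, c), angle_at(c, a, b))

def satisfies_constraints(reference, candidate, max_angle_diff=math.radians(30)):
    """Check an assumed order constraint and an assumed shape constraint.

    reference, candidate: triplets of (x, y) patch centers, in patch order.
    max_angle_diff: hypothetical tolerance on each interior angle.
    """
    ref_sign = orientation(*reference)
    # Order constraint: same rotational order, and neither triplet degenerate.
    if ref_sign == 0 or orientation(*candidate) != ref_sign:
        return False
    # Shape constraint: corresponding interior angles stay close.
    ref_angles = interior_angles(*reference)
    cand_angles = interior_angles(*candidate)
    return all(abs(r - c) <= max_angle_diff
               for r, c in zip(ref_angles, cand_angles))

if __name__ == "__main__":
    reference = ((0, 0), (4, 0), (0, 3))
    shifted_scaled = ((10, 10), (18, 10), (10, 16))  # same shape, moved and doubled
    mirrored = ((0, 0), (0, 3), (4, 0))              # last two patches swapped
    print(satisfies_constraints(reference, shifted_scaled))
    print(satisfies_constraints(reference, mirrored))
```

Under these assumptions, a candidate triplet that is only translated and scaled passes both checks, while one whose patches appear in a flipped arrangement is rejected. This is the sense in which geometric constraints can filter out spurious patch detections that match in appearance alone.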