Learning with Hierarchical Complement Objective
This work addresses a gap relevant to computer vision researchers: standard classification and segmentation losses ignore dependencies among labels. The contribution is incremental, since it builds on existing loss functions.
The paper tackles the problem of label hierarchies in vision tasks by proposing Hierarchical Complement Objective Training (HCOT), which explicitly uses hierarchy information to improve model performance, achieving state-of-the-art results on datasets like CIFAR-100, ImageNet-2012, and PASCAL-Context.
Label hierarchies are common in vision problems, ranging from explicit label hierarchies in image classification to latent label hierarchies in semantic segmentation. Nevertheless, state-of-the-art methods often use cross-entropy loss, which implicitly assumes that class labels are mutually exclusive and thus independent of each other. Motivated by the fact that classes from the same parent category usually share certain similarities, we design a new training paradigm called Hierarchical Complement Objective Training (HCOT) that leverages the information from the label hierarchy. HCOT maximizes the probability of the ground-truth class and, at the same time, neutralizes the probabilities of the rest of the classes in a hierarchical fashion, so that the model takes advantage of the label hierarchy explicitly. The proposed HCOT is evaluated on both image classification and semantic segmentation tasks. Experimental results confirm that HCOT outperforms state-of-the-art models on CIFAR-100, ImageNet-2012, and PASCAL-Context. The study further demonstrates that HCOT can be applied to tasks with latent label hierarchies, which are a common characteristic of many machine learning tasks.
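The abstract does not state the loss formula, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "neutralizing" the non-ground-truth classes means flattening their predicted probabilities, measured here as the entropy of the renormalized distribution over those classes, once over all classes and once over the classes sharing the ground-truth class's parent. The names `hcot_style_loss`, `complement_entropy`, `parent_of`, and the weight `alpha` are hypothetical, and folding both goals into a single scalar is a simplification made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def complement_entropy(probs, target, group):
    """Entropy of `probs` renormalized over `group` with `target` removed.

    Higher values mean the non-target classes in `group` are flatter.
    """
    others = [k for k in group if k != target]
    mass = sum(probs[k] for k in others)
    if not others or mass <= 0.0:
        return 0.0
    entropy = 0.0
    for k in others:
        q = probs[k] / mass
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy


def hcot_style_loss(logits, target, parent_of, alpha=1.0):
    """Cross-entropy minus a weighted two-level complement entropy.

    Assumption: `parent_of[k]` is the parent category of class k, and
    `alpha` (hypothetical) trades off the two terms.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    all_classes = list(range(len(probs)))
    siblings = [k for k in all_classes if parent_of[k] == parent_of[target]]
    flatness = (complement_entropy(probs, target, all_classes)
                + complement_entropy(probs, target, siblings))
    # Minimizing this raises the target probability and flattens the rest.
    return cross_entropy - alpha * flatness


if __name__ == "__main__":
    # Six fine classes grouped under two parent categories (toy hierarchy).
    parent_of = [0, 0, 0, 1, 1, 1]
    logits = [2.0, 1.0, 0.5, 0.0, -0.5, -1.0]
    print(hcot_style_loss(logits, target=0, parent_of=parent_of))
```

Under this reading, setting `alpha=0.0` recovers plain cross-entropy, which makes the hierarchical term easy to ablate.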