SuperDisco: Super-Class Discovery Improves Visual Recognition for the Long-Tail
This work addresses the long-tailed recognition challenge in computer vision, which is crucial for real-world applications where data is imbalanced, but it appears incremental because it builds on existing graph-based methods.
The paper tackles the problem of poor performance on tail classes in long-tailed image recognition by proposing SuperDisco, an algorithm that discovers super-class representations using a graph model, resulting in state-of-the-art performance on benchmarks like CIFAR-100 and ImageNet.
Modern image classifiers perform well on populated classes but degrade considerably on tail classes with only a few instances. Humans, by contrast, effortlessly handle the long-tailed recognition challenge, since they can learn the tail representation based on different levels of semantic abstraction, making the learned tail features more discriminative. This phenomenon motivated us to propose SuperDisco, an algorithm that discovers super-class representations for long-tailed recognition using a graph model. We learn to construct the super-class graph to guide representation learning under long-tailed distributions. Through message passing on the super-class graph, image representations are rectified and refined by attending to the most relevant entities based on the semantic similarity among their super-classes. Moreover, we propose to meta-learn the super-class graph under the supervision of a prototype graph constructed from a small amount of imbalanced data. By doing so, we obtain a more robust super-class graph that further improves long-tailed recognition performance. Consistent state-of-the-art results on the long-tailed CIFAR-100, ImageNet, Places, and iNaturalist benchmarks demonstrate the benefit of the discovered super-class graph for dealing with long-tailed distributions.
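The message-passing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single level of super-class nodes, cosine similarity as the semantic similarity, a softmax attention with a hypothetical `temperature` parameter, and a simple residual update. The function name `refine_with_super_classes` and all array shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def refine_with_super_classes(features, super_nodes, temperature=1.0):
    """Refine image features by attending over super-class nodes.

    features:    (n_images, dim) image representations
    super_nodes: (n_super, dim) super-class representations (assumed given here;
                 the paper learns them, which this sketch does not attempt)
    """
    # Cosine similarity between every image and every super-class node.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    s = super_nodes / np.linalg.norm(super_nodes, axis=1, keepdims=True)
    attn = softmax(f @ s.T / temperature, axis=1)  # (n_images, n_super)

    # Message from the graph: attention-weighted sum of super-class nodes.
    message = attn @ super_nodes                   # (n_images, dim)

    # Residual update (an assumption of this sketch): original feature plus message.
    return features + message, attn

# Toy usage with random data: 4 images, 3 super-classes, 8-dimensional features.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))
super_nodes = rng.normal(size=(3, 8))
refined, attn = refine_with_super_classes(features, super_nodes)
```

Each row of `attn` sums to one, so every refined feature is the original feature plus a convex combination of super-class nodes. A lower `temperature` concentrates attention on the most similar super-class.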