LG, CV | Jun 15, 2022

Differentiable Top-k Classification Learning

Felix Petersen, Hilde Kuehne, Christian Borgelt, Oliver Deussen

IBM, MIT

arXiv:2206.07290v1 | 17.349 citations | h-index: 54 | Has Code

Originality: Incremental advance

AI Analysis

This work targets top-k classification accuracy, a core metric in machine learning, and improves it for state-of-the-art models. The advance is incremental because it builds on existing differentiable sorting and ranking techniques.

The paper tackles the problem of optimizing top-k classification accuracy. Instead of training for a single fixed k, it proposes a differentiable top-k cross-entropy loss that considers multiple values of k simultaneously. This improves both top-1 and top-5 accuracy and yields a new state-of-the-art when fine-tuning publicly available ImageNet models.

The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a differentiable top-k cross-entropy classification loss. This allows training the network while not only considering the top-1 prediction, but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed loss function for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k does not only produce better top-5 accuracies, but also leads to top-1 accuracy improvements. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.
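To make the idea of a loss over multiple k concrete, here is a minimal forward-pass sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which differentiable sorting operator is used or give the exact form of the loss. The sketch assumes a SoftSort-style relaxed permutation matrix, and the names `softsort`, `topk_cross_entropy`, `k_weights`, and `tau` are hypothetical.

```python
import math


def softsort(scores, tau=1.0):
    """SoftSort-style relaxed permutation matrix (an assumed operator).

    Row i is a probability distribution over which class holds rank i,
    where rank 0 is the highest score. Smaller tau gives a sharper matrix.
    """
    sorted_desc = sorted(scores, reverse=True)
    rows = []
    for s_rank in sorted_desc:
        # Classes whose score is close to the rank-i score get high weight.
        logits = [-abs(s_rank - s) / tau for s in scores]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        rows.append([e / z for e in exps])
    return rows


def topk_cross_entropy(scores, label, k_weights, tau=1.0, eps=1e-12):
    """Weighted sum over k of -log P(label is among the top k).

    k_weights is a hypothetical dict {k: weight}; the weights should sum
    to 1 and express how much each k contributes to the objective.
    """
    perm = softsort(scores, tau)
    loss = 0.0
    for k, w in k_weights.items():
        # Relaxed probability that the true class occupies one of ranks 0..k-1.
        p_in_topk = sum(perm[i][label] for i in range(k))
        # Rows are normalized but columns are not, so clamp into (0, 1].
        p_in_topk = min(max(p_in_topk, eps), 1.0)
        loss -= w * math.log(p_in_topk)
    return loss


scores = [2.0, 1.0, 0.1, -1.0]
# Equal weight on top-1 and top-2; the true class is the highest-scoring one.
print(topk_cross_entropy(scores, label=0, k_weights={1: 0.5, 2: 0.5}))
```

Putting all the weight on k = 1 gives a pure top-1 objective, while spreading it over several k rewards the model for placing the true class near the top even when it is not ranked first. In practice the same computation would be written in an autodiff framework so that gradients flow back to the scores.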
