CV · Nov 22, 2019

Orderless Recurrent Models for Multi-label Classification

Vacit Oguz Yazici, Abel Gonzalez-Garcia, Arnau Ramisa, Bartlomiej Twardowski, Joost van de Weijer

arXiv:1911.09996v3 · 17.3 · 108 citations · Has Code

Originality: Incremental advance

AI Analysis

The work addresses the need for more efficient and accurate multi-label classification in computer vision. It is an incremental advance because it builds on existing CNN-RNN models.

The paper tackles the problem of label ordering in recurrent neural networks for multi-label classification. It proposes dynamically aligning the ground-truth labels with the predicted label sequence during training, which yields state-of-the-art results on MS-COCO, WIDER Attribute and PA-100K and competitive results on NUS-WIDE.

Recurrent neural networks (RNNs) are popular for many computer vision tasks, including multi-label classification. Since RNNs produce sequential outputs, labels need to be ordered for the multi-label classification task. Current approaches sort labels according to their frequency, typically in either rare-first or frequent-first order. These imposed orderings do not take into account that the natural order in which to generate the labels can change for each image, e.g., first the dominant object before summing up the smaller objects in the image. Therefore, in this paper, we propose ways to dynamically order the ground truth labels with the predicted label sequence. This enables faster training of more optimal LSTM models for multi-label classification. Our analysis shows that our method does not suffer from duplicate generation, a common problem for other models. Furthermore, it outperforms other CNN-RNN models, and we show that a standard architecture of an image encoder and language decoder trained with our proposed loss obtains state-of-the-art results on the challenging MS-COCO, WIDER Attribute and PA-100K datasets and competitive results on NUS-WIDE.
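The contrast between a fixed frequency ordering and a dynamic ordering can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sequence_loss`, `frequency_order`, `min_loss_order`), the toy probabilities and the label counts are all hypothetical, the end-of-sequence token is ignored, and the per-step probabilities are treated as fixed, whereas in a real LSTM decoder they would depend on the labels fed in at earlier steps. The sketch assumes the dynamic order is the permutation of the ground-truth set with the lowest cross-entropy under the current predictions, found here by brute force.

```python
import math
from itertools import permutations


def sequence_loss(step_probs, order):
    """Cross-entropy of emitting the labels in `order`, one label per time step."""
    return -sum(math.log(step_probs[t][label]) for t, label in enumerate(order))


def frequency_order(labels, label_counts, rare_first=False):
    """Fixed baseline: sort the labels by their (assumed) dataset frequency."""
    return sorted(labels, key=lambda l: label_counts[l], reverse=not rare_first)


def min_loss_order(step_probs, labels):
    """Dynamic ordering (illustrative): choose the permutation of the
    ground-truth labels that minimizes the loss under the current predictions.
    Brute force is only practical for the small label sets of a toy example."""
    return min(permutations(labels), key=lambda order: sequence_loss(step_probs, order))


# Toy predictions for one image: 2 time steps, 3 classes (hypothetical values).
step_probs = [
    [0.1, 0.2, 0.7],  # step 0: the model favors class 2
    [0.6, 0.3, 0.1],  # step 1: the model favors class 0
]
labels = [0, 2]                          # ground-truth label set
label_counts = {0: 500, 1: 50, 2: 100}   # hypothetical dataset frequencies

fixed = frequency_order(labels, label_counts)        # frequent-first target order
dynamic = list(min_loss_order(step_probs, labels))   # order aligned with predictions

print("fixed  ", fixed, round(sequence_loss(step_probs, fixed), 3))
print("dynamic", dynamic, round(sequence_loss(step_probs, dynamic), 3))
```

In this toy case the frequent-first target penalizes the model for predicting the correct labels in a different order, while the dynamically chosen target does not. For larger label sets, an assignment solver such as `scipy.optimize.linear_sum_assignment` would be a more practical replacement for the brute-force search.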
