Affinity Graph Supervision for Visual Recognition
This work addresses a gap in deep learning for visual recognition: affinity weights are rarely learned under direct supervision, so relationships in the data go underexploited. The contribution is incremental, since it builds on existing graph and attention methods.
The paper proposes a method to directly supervise the affinity weights learned in deep architectures. The method improves object relationship recovery without manual relationship labels, boosts scene categorization, and yields consistent improvements in image classification across various architectures and datasets.
Affinity graphs are widely used in deep architectures, including graph convolutional neural networks and attention networks. Thus far, the literature has focused on abstracting features from such graphs, while the learning of the affinities themselves has been overlooked. Here we propose a principled method to directly supervise the learning of weights in affinity graphs, to exploit meaningful connections between entities in the data source. Applied to a visual attention network, our affinity supervision improves relationship recovery between objects, even without the use of manually annotated relationship labels. We further show that affinity learning between objects boosts scene categorization performance, and that the supervision of affinity can also be applied to graphs built from mini-batches for neural network training. In an image classification task, we demonstrate consistent improvement over the baseline with diverse network architectures and datasets.
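To make the idea of supervising affinity weights in a mini-batch graph concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the formulation from the paper: the function name `affinity_mass_loss`, the softmax normalization over the whole matrix, the choice of "same class label" as the target relationship, and the negative-log form of the loss are all hypothetical choices made for this example.

```python
import math


def affinity_mass_loss(logits, labels):
    """Hypothetical affinity supervision loss for one mini-batch.

    logits: n x n list of raw pairwise affinity scores.
    labels: n class labels; pairs of distinct items that share a label
            are treated as target connections (an assumption).
    Returns (loss, mass), where mass is the share of normalized affinity
    that falls on target pairs and loss = -log(mass).
    """
    n = len(labels)
    # Numerically stable softmax over every entry of the affinity matrix.
    peak = max(max(row) for row in logits)
    exp = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in exp)
    # Sum the normalized affinity on target pairs only.
    mass = sum(
        exp[i][j]
        for i in range(n)
        for j in range(n)
        if i != j and labels[i] == labels[j]
    ) / total
    # Clamp to avoid log(0) when no affinity lands on target pairs.
    return -math.log(max(mass, 1e-12)), mass


labels = [0, 0, 1, 1]
uniform = [[0.0] * 4 for _ in range(4)]
aligned = [
    [5.0 if i != j and labels[i] == labels[j] else 0.0 for j in range(4)]
    for i in range(4)
]
loss_uniform, mass_uniform = affinity_mass_loss(uniform, labels)
loss_aligned, mass_aligned = affinity_mass_loss(aligned, labels)
```

In this sketch, affinities that concentrate on same-label pairs (`aligned`) receive a lower loss than uninformative ones (`uniform`), which is the behavior a direct supervision signal on the graph weights is meant to encourage.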