Self-informed neural network structure learning
This work addresses the challenge of efficiently improving already high-performing visual recognition models for applications such as object detection.
The paper tackles the problem of large-scale multi-label visual recognition by augmenting a trained neural network with auxiliary pathways that connect only to groups of visually similar labels, achieving a significant improvement in mean average precision while increasing the number of multiply-adds by less than 3%.
We study the problem of large-scale, multi-label visual recognition with a large number of possible classes. We propose a method for augmenting a trained neural network classifier with auxiliary capacity in a manner designed to significantly improve upon an already well-performing model, while minimally impacting its computational footprint. Using the predictions of the network itself as a descriptor for assessing visual similarity, we define a partitioning of the label space into groups of visually similar entities. We then augment the network with auxiliary hidden layer pathways with connectivity only to these groups of label units. We report a significant improvement in mean average precision on a large-scale object recognition task with the augmented model, while increasing the number of multiply-adds by less than 3%.
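The two steps described above (partitioning the labels by the network's own predictions, then adding group-restricted pathways) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the clustering here is a plain k-means with farthest-point initialisation chosen for simplicity, the auxiliary pathways are assumed to branch from a shared feature vector, and every name (`label_descriptors`, `partition_labels`, `augmented_logits`, `preds`) as well as the toy data is hypothetical.

```python
import numpy as np

def label_descriptors(preds):
    """Describe label j by its column of `preds` (examples x labels), L2-normalised."""
    cols = preds.T.astype(float)
    norms = np.linalg.norm(cols, axis=1, keepdims=True)
    return cols / np.maximum(norms, 1e-12)

def partition_labels(preds, n_groups, n_iters=20):
    """Group labels with k-means over their descriptors (an assumed stand-in clustering)."""
    desc = label_descriptors(preds)
    # Deterministic farthest-point initialisation of the cluster centres.
    centers = [desc[0]]
    for _ in range(1, n_groups):
        d = np.min([((desc - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(desc[d.argmax()])
    centers = np.stack(centers)
    for _ in range(n_iters):
        dists = ((desc[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for g in range(n_groups):
            if np.any(assign == g):
                centers[g] = desc[assign == g].mean(axis=0)
    return assign

def augmented_logits(features, base_W, aux_params, assign):
    """Base logits plus one small hidden pathway per group.

    Each pathway (W_in, W_out) writes only to the label units of its own group,
    so the extra cost scales with the group size, not the full label space.
    """
    logits = features @ base_W
    for g, (W_in, W_out) in enumerate(aux_params):
        idx = np.flatnonzero(assign == g)
        hidden = np.maximum(features @ W_in, 0.0)  # ReLU hidden layer
        logits[:, idx] += hidden @ W_out           # touches this group's labels only
    return logits

# Toy predictions: labels 0 and 1 fire together, as do labels 2 and 3.
preds = np.array([
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
    [0.9, 0.7, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.1, 0.7, 0.9],
])
assign = partition_labels(preds, n_groups=2)
```

In this sketch each output matrix `W_out` has shape (hidden units, labels in the group), which is what keeps the added multiply-adds small relative to a full-width layer; the actual savings depend on group sizes and hidden widths that the abstract does not specify.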