Improving One-Shot Learning through Fusing Side Information
This work addresses the challenge of limited labeled data for machine learning practitioners, but it is incremental, since it builds on existing attentional regression networks.
The paper tackles the problem of one-shot learning in deep neural networks by fusing side information into data representation learning, resulting in improved performance over traditional and state-of-the-art methods on one-shot recognition tasks.
Deep Neural Networks (DNNs) often struggle with one-shot learning, where only one or a few labeled training examples are available per category. In this paper, we argue that side information can compensate for the information missing across classes. We introduce two statistical approaches for fusing side information into data representation learning to improve one-shot learning. First, we propose to enforce statistical dependency between data representations and multiple types of side information. Second, we introduce an attention mechanism that efficiently treats examples belonging to the 'lots-of-examples' classes as quasi-samples (additional training samples) for 'one-example' classes. We empirically show that our learning architecture improves over traditional softmax regression networks as well as state-of-the-art attentional regression networks on one-shot recognition tasks.
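The two ideas above can be illustrated with a minimal NumPy sketch. It rests on assumptions that the abstract does not state: dependency is measured here with a biased empirical Hilbert-Schmidt Independence Criterion (HSIC) under a Gaussian kernel, and the attention is a plain dot-product softmax over support representations. The function names `rbf_kernel`, `hsic`, and `quasi_sample_attention`, and the parameter `gamma`, are hypothetical and chosen only for illustration; this is not presented as the architecture evaluated in the paper.

```python
import numpy as np

def rbf_kernel(x, gamma=1.0):
    # Pairwise squared Euclidean distances, then a Gaussian kernel.
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def hsic(x, y, gamma=1.0):
    # Assumed dependency measure: biased empirical HSIC,
    # trace(K H L H) / (n - 1)^2, with H the centering matrix.
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    k = rbf_kernel(x, gamma)
    l = rbf_kernel(y, gamma)
    return float(np.trace(k @ h @ l @ h) / (n - 1) ** 2)

def quasi_sample_attention(query, support, support_labels, num_classes):
    # Assumed attention form: softmax over dot-product similarities
    # between one query representation and all support representations.
    scores = support @ query
    scores = scores - scores.max()  # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    # Each support example votes for its own class, weighted by attention,
    # so examples from data-rich classes act as soft extra evidence.
    one_hot = np.eye(num_classes)[support_labels]
    return weights @ one_hot

# Usage: a dependency term that could be maximized during training,
# and a class distribution for one query.
rng = np.random.default_rng(0)
reps = rng.normal(size=(200, 1))   # stand-in for learned representations
side = reps.copy()                 # side information fully dependent on reps
noise = rng.normal(size=(200, 1))  # side information independent of reps
dep_term = hsic(reps, side, gamma=0.5)
indep_term = hsic(reps, noise, gamma=0.5)

support = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
labels = np.array([0, 0, 1, 2])
probs = quasi_sample_attention(np.array([1.0, 0.0]), support, labels, 3)
```

In a training loop, a term such as `dep_term` would typically enter the loss with a negative sign so that the representations become more dependent on the side information, while `probs` would feed a standard cross-entropy loss; both choices are part of the assumed setup rather than details taken from the abstract.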