Transductive Unbiased Embedding for Zero-Shot Learning
The paper addresses the bias problem in ZSL for computer vision and reports a significant improvement over existing methods, although the contribution is incremental in that it builds on established transductive learning approaches.
The paper tackles the bias problem in Zero-Shot Learning (ZSL), where instances of unseen classes are often misclassified as seen classes. It proposes Quasi-Fully Supervised Learning (QFSL), a transductive method that maps labeled source images to fixed points in the semantic space specified by the seen classes and pushes unlabeled target images toward the points specified by the unseen classes. Reported gains over prior state-of-the-art methods are 9.3-24.5% in generalized ZSL and 0.2-16.2% in conventional ZSL on the AwA2, CUB, and SUN datasets.
Most existing Zero-Shot Learning (ZSL) methods suffer from a strong bias problem, in which instances of unseen (target) classes tend to be categorized as one of the seen (source) classes. As a result, they perform poorly when deployed in generalized ZSL settings. In this paper, we propose a straightforward yet effective method named Quasi-Fully Supervised Learning (QFSL) to alleviate the bias problem. Our method follows the transductive learning setting, which assumes that both the labeled source images and the unlabeled target images are available for training. In the semantic embedding space, the labeled source images are mapped to several fixed points specified by the source categories, and the unlabeled target images are forced to be mapped to other points specified by the target categories. Experiments conducted on the AwA2, CUB, and SUN datasets demonstrate that our method outperforms existing state-of-the-art approaches by a huge margin of 9.3-24.5% under generalized ZSL settings, and by a large margin of 0.2-16.2% under conventional ZSL settings.
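The training objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the loss, so the code below assumes one plausible form, namely a softmax over all class prototypes (source and target), a cross-entropy term for labeled source images, and a bias term that rewards unlabeled target images for placing probability mass on the target classes. All names (`class_scores`, `source_loss`, `target_bias_loss`, `bias_weight`) are hypothetical, and image embeddings are taken as given rather than produced by a network.

```python
import numpy as np

def softmax(scores):
    # Numerically stable softmax over the class axis.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def class_scores(embeddings, prototypes):
    # Compatibility of each image embedding with each fixed class point
    # (assumption: inner product in the semantic space).
    return embeddings @ prototypes.T

def source_loss(embeddings, labels, prototypes):
    # Cross-entropy over ALL classes (source and target) for labeled source images.
    probs = softmax(class_scores(embeddings, prototypes))
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

def target_bias_loss(embeddings, prototypes, target_class_ids):
    # Assumed bias term: unlabeled target images are penalized when little
    # probability mass falls on the target (unseen) classes.
    probs = softmax(class_scores(embeddings, prototypes))
    target_mass = probs[:, target_class_ids].sum(axis=1)
    return -np.log(target_mass).mean()

def total_loss(src_emb, src_labels, tgt_emb, prototypes, target_class_ids,
               bias_weight=1.0):
    # bias_weight is a hypothetical trade-off hyperparameter.
    return (source_loss(src_emb, src_labels, prototypes)
            + bias_weight * target_bias_loss(tgt_emb, prototypes, target_class_ids))

if __name__ == "__main__":
    # Toy setup: 4 classes with one-hot prototypes; classes 0-1 are source, 2-3 target.
    prototypes = np.eye(4)
    target_ids = np.array([2, 3])
    src_emb = 5.0 * prototypes[[0, 1]]
    src_labels = np.array([0, 1])
    tgt_near_target = 5.0 * prototypes[[2, 3]]
    tgt_near_source = 5.0 * prototypes[[0, 1]]
    print(total_loss(src_emb, src_labels, tgt_near_target, prototypes, target_ids))
    print(total_loss(src_emb, src_labels, tgt_near_source, prototypes, target_ids))
```

In the toy run, the second total should be the larger of the two, because the unlabeled images sit near source prototypes and the assumed bias term penalizes that. In a real system the embeddings would come from a trainable visual model and the prototypes from class attributes or word vectors.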