Semi-Supervised Learning of Class Balance under Class-Prior Change by Distribution Matching
This paper addresses bias correction in semi-supervised learning for real-world applications where the test class ratio is unknown, but the contribution appears incremental because it builds on existing distribution matching techniques.
The paper tackles the problem of class-prior change between training and test datasets in classification, which causes estimation bias, by proposing a method to estimate the test class ratio through distribution matching, and demonstrates its utility in experiments.
In real-world classification problems, the class balance in the training dataset does not necessarily reflect that of the test dataset, which can cause significant estimation bias. If the class ratio of the test dataset is known, instance re-weighting or resampling allows systematic bias correction. However, learning the class ratio of the test dataset is challenging when no labeled data is available from the test domain. In this paper, we propose to estimate the class ratio in the test dataset by matching probability distributions of training and test input data. We demonstrate the utility of the proposed approach through experiments.
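To make the idea concrete, here is a minimal sketch of class-ratio estimation by distribution matching for the binary case. It is not the paper's estimator: the abstract does not say which matching criterion is used, so the sketch assumes a squared maximum mean discrepancy (MMD) with a Gaussian kernel as one possible choice. The function names, the kernel width `sigma`, and the synthetic data are all hypothetical. The test input distribution is modeled as a mixture of the two class-conditional training distributions, and the mixing weight that best matches the unlabeled test inputs is taken as the estimated test class ratio. The estimate is then turned into the instance weights mentioned above.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))


def estimate_test_class_ratio(x_train, y_train, x_test, sigma=1.0):
    """Estimate p_test(y=1) by matching pi*p(x|y=1) + (1-pi)*p(x|y=0) to p_test(x).

    Assumption (not from the source): the mismatch is measured by squared MMD,
    which is a convex quadratic in pi and therefore has a closed-form minimizer.
    """
    pos = x_train[y_train == 1]
    neg = x_train[y_train == 0]
    k_pp = gaussian_kernel(pos, pos, sigma).mean()
    k_nn = gaussian_kernel(neg, neg, sigma).mean()
    k_pn = gaussian_kernel(pos, neg, sigma).mean()
    k_pt = gaussian_kernel(pos, x_test, sigma).mean()
    k_nt = gaussian_kernel(neg, x_test, sigma).mean()
    # Setting the derivative of MMD^2(pi) to zero gives this ratio.
    denom = k_pp - 2.0 * k_pn + k_nn
    pi = (k_nn - k_pn + k_pt - k_nt) / denom
    # Clipping the unconstrained minimizer of a convex quadratic to [0, 1]
    # yields the constrained minimizer.
    return float(np.clip(pi, 0.0, 1.0))


def class_prior_weights(y_train, pi_test):
    """Instance weights p_test(y) / p_train(y) for each training label."""
    pi_train = np.mean(y_train == 1)
    return np.where(
        y_train == 1, pi_test / pi_train, (1.0 - pi_test) / (1.0 - pi_train)
    )


# Synthetic illustration: balanced training set, test set with 80% positives.
rng = np.random.default_rng(0)
x_train = np.vstack([rng.normal(-2, 1, (300, 2)), rng.normal(2, 1, (300, 2))])
y_train = np.repeat([0, 1], 300)
x_test = np.vstack([rng.normal(-2, 1, (120, 2)), rng.normal(2, 1, (480, 2))])

pi_hat = estimate_test_class_ratio(x_train, y_train, x_test)
weights = class_prior_weights(y_train, pi_hat)
```

The resulting `weights` could be passed to any learner that accepts per-instance weights. The sketch relies on the class-prior change assumption, namely that the class-conditional densities p(x|y) are the same in training and test data and only p(y) differs. In practice the kernel width would also need to be chosen with care, for example by cross-validation.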