Sparse joint shift in multinomial classification
This work provides incremental theoretical insights into dataset shift for machine learning researchers, with the aim of improving prediction accuracy when the data distribution changes between source and target datasets.
The paper addresses dataset shift in multinomial classification by analyzing the sparse joint shift (SJS) model. It presents new theoretical results on the transmission of SJS to larger feature sets, its identifiability, and its relationship to covariate shift, and it points out inconsistencies in existing estimation algorithms.
Sparse joint shift (SJS) was recently proposed as a tractable model for general dataset shift, which may change the marginal distributions of features and labels as well as the posterior probabilities and the class-conditional feature distributions. Fitting SJS to a target dataset without label observations may produce valid predictions of labels and estimates of class prior probabilities. We present new results on the transmission of SJS from sets of features to larger sets of features, a conditional correction formula for the class posterior probabilities under the target distribution, the identifiability of SJS, and the relationship between SJS and covariate shift. In addition, we point out inconsistencies in the algorithms that were proposed for estimating the characteristics of SJS, which could hamper the search for optimal solutions, and we suggest potential improvements.
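The abstract does not state the conditional correction formula itself. As a rough illustration of what a posterior correction does, the sketch below implements the well-known correction for prior probability (label) shift, a simpler setting in which only the class priors differ between source and target. It is not the SJS formula from the paper. The function name `correct_posteriors` and all numbers are hypothetical, chosen only for illustration.

```python
import numpy as np


def correct_posteriors(source_posteriors, source_priors, target_priors):
    """Adjust source-domain class posteriors to new class priors.

    Assumes prior probability (label) shift only: the class-conditional
    feature distributions are the same in source and target. This is an
    illustrative special case, not the SJS correction formula.
    """
    source_posteriors = np.asarray(source_posteriors, dtype=float)
    # Per-class ratio of target prior to source prior.
    weights = np.asarray(target_priors, dtype=float) / np.asarray(
        source_priors, dtype=float
    )
    # Reweight each class posterior, then renormalise each row to sum to 1.
    unnormalised = source_posteriors * weights
    return unnormalised / unnormalised.sum(axis=1, keepdims=True)


# Hypothetical three-class example: two instances scored by a source model.
posteriors = np.array([[0.5, 0.3, 0.2],
                       [0.1, 0.1, 0.8]])
source_priors = np.array([1 / 3, 1 / 3, 1 / 3])
target_priors = np.array([0.6, 0.3, 0.1])

corrected = correct_posteriors(posteriors, source_priors, target_priors)
print(corrected)
```

Under these assumptions, each corrected row is a valid probability vector, and the correction leaves the posteriors unchanged when source and target priors coincide. In practice the target priors are unknown and must be estimated from unlabeled target data, which is the harder part of the problem.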