Confidence intervals for class prevalences under prior probability shift
This work addresses methodological gaps in uncertainty quantification for class prevalence estimation under data set shift. The contribution is incremental: it builds on existing research into point estimation.
Using a simulation study, the paper examines how to construct confidence and prediction intervals for class prevalence estimates under prior probability shift. It asks whether the two kinds of interval need to be distinguished in practice, and whether the discriminatory power of the underlying classifier affects the accuracy of the estimates.
Point estimation of class prevalences in the presence of data set shift has been a popular research topic for more than two decades. Less attention has been paid to the construction of confidence and prediction intervals for estimates of class prevalences. One little-considered question is whether it is necessary for practical purposes to distinguish between confidence and prediction intervals. Another question, not yet conclusively answered, is whether the discriminatory power of the classifier or score underlying an estimation method matters for the accuracy of the class prevalence estimates. This paper presents a simulation study aimed at shedding some light on these and other related questions.