ML LG ST
Jul 17, 2021

Minimising quantifier variance under prior probability shift

arXiv:2107.08209v4
10 citations

Originality: Incremental advance

AI Analysis

This work addresses a specific statistical estimation problem in machine learning, the variance of class prevalence estimates, and offers an incremental improvement for researchers working on quantification under distribution shift.

The paper studies binary prevalence quantification under prior probability shift. It derives the asymptotic variance of the maximum likelihood estimator and finds that this variance is a function of the Brier score of the classifier under the test data distribution. It then points out training criteria for the base classifier that optimise the Brier score on both the training and the test data sets, which helps to reduce the variance of the quantifier.

For the binary prevalence quantification problem under prior probability shift, we determine the asymptotic variance of the maximum likelihood estimator. We find that it is a function of the Brier score for the regression of the class label on the features under the test data set distribution. This observation suggests that optimising the accuracy of a base classifier, as measured by the Brier score, on the training data set helps to reduce the variance of the related quantifier on the test data set. Therefore, we also point out training criteria for the base classifier that imply optimisation of both of the Brier scores on the training and the test data sets.
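To make the setting concrete, here is a minimal sketch of a maximum likelihood prevalence estimate under prior probability shift, together with a Brier score helper. It is not taken from the paper: it uses the standard EM iteration for this likelihood, which is concave in the prevalence, so the EM fixed point is the maximum likelihood estimate. The function names (`adjust_posterior`, `em_prevalence`, `brier_score`, `simulate_posteriors`) and the Gaussian toy data are assumptions made for illustration only.

```python
import math
import random


def adjust_posterior(e, p_train, q):
    """Re-weight a training posterior e = P_train(Y=1 | x) to a new prior q.

    Valid under prior probability shift, where the class-conditional
    feature distributions are the same in training and test data.
    """
    pos = q * e / p_train
    neg = (1.0 - q) * (1.0 - e) / (1.0 - p_train)
    return pos / (pos + neg)


def em_prevalence(eta, p_train, n_iter=500, tol=1e-12):
    """EM iteration for the test prevalence (hypothetical helper).

    eta     : training posteriors evaluated on the test features
    p_train : positive-class prevalence in the training data, 0 < p_train < 1
    """
    q = p_train  # start from the training prevalence
    for _ in range(n_iter):
        # E-step: posteriors under the current prevalence guess
        post = [adjust_posterior(e, p_train, q) for e in eta]
        # M-step: the new prevalence is the mean adjusted posterior
        q_new = sum(post) / len(post)
        if abs(q_new - q) < tol:
            return q_new
        q = q_new
    return q


def brier_score(prob, y):
    """Mean squared difference between predicted probabilities and labels."""
    return sum((p - t) ** 2 for p, t in zip(prob, y)) / len(y)


def simulate_posteriors(n, q_test, p_train, seed=0):
    """Toy test sample (assumption): x ~ N(0,1) for class 0, N(2,1) for class 1.

    Returns the exact training posteriors on the test features and the
    test labels, which a quantifier would not observe in practice.
    """
    rng = random.Random(seed)
    prior_logit = math.log(p_train / (1.0 - p_train))
    eta, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < q_test else 0
        x = rng.gauss(2.0 if y == 1 else 0.0, 1.0)
        # log density ratio of N(2,1) to N(0,1) at x is 2x - 2
        eta.append(1.0 / (1.0 + math.exp(-(2.0 * x - 2.0 + prior_logit))))
        labels.append(y)
    return eta, labels


if __name__ == "__main__":
    eta, labels = simulate_posteriors(n=20000, q_test=0.3, p_train=0.5)
    q_hat = em_prevalence(eta, p_train=0.5)
    test_post = [adjust_posterior(e, 0.5, q_hat) for e in eta]
    print("estimated test prevalence:", q_hat)
    print("Brier score on test data:", brier_score(test_post, labels))
```

In this toy setup the posteriors are exact, so the only error in the prevalence estimate comes from the finite test sample. Replacing the means 0 and 2 with closer values makes the classes harder to separate, which raises the Brier score and should visibly widen the spread of the estimate across seeds, in line with the relationship the abstract describes.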
