Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach
This work addresses scenarios where labeled data is scarce by enabling accuracy estimation without ground truth, though the contribution is incremental: it extends existing methods with logical constraints.
The paper tackles the problem of estimating classifier accuracy using only unlabeled data by leveraging logical constraints among multiple classification problems, achieving estimates within a few percent of true accuracy in experiments on four real-world datasets.
We propose an efficient method to estimate the accuracy of classifiers using only unlabeled data. We consider a setting with multiple classification problems where the target classes may be tied together through logical constraints. For example, a set of classes may be mutually exclusive, meaning that a data instance can belong to at most one of them. The proposed method is based on two intuitions: (i) when classifiers agree, they are more likely to be correct, and (ii) when classifiers make predictions that violate the constraints, at least one of them must be making an error. Experiments on four real-world data sets produce accuracy estimates within a few percent of the true accuracy, using solely unlabeled data. Our models also outperform existing state-of-the-art solutions both in estimating accuracies and in combining multiple classifier outputs. The results emphasize the utility of logical constraints in estimating accuracy, thus validating our intuition.
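The two intuitions in the abstract can be illustrated with a small sketch. This is not the paper's probabilistic logic model. It swaps in a much simpler agreement-rate calculation that assumes three binary classifiers with independent, better-than-chance errors on a single problem, plus a count of mutual-exclusion violations. Under that independence assumption, if a_ij is the agreement rate of classifiers i and j and e_i is the error rate of classifier i, then 2*a_ij - 1 = (1 - 2*e_i)(1 - 2*e_j), which can be solved for each accuracy from the three pairwise agreement rates. All function names below are hypothetical.

```python
import math


def agreement_rate(preds_a, preds_b):
    """Fraction of instances on which two classifiers give the same label."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and equal length")
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def accuracies_from_agreements(a12, a13, a23):
    """Estimate three accuracies from pairwise agreement rates.

    Assumption (not from the paper): errors are independent and every
    classifier is better than chance, so 2*a_ij - 1 = c_i * c_j with
    c_i = 1 - 2*e_i > 0.
    """
    c12, c13, c23 = 2 * a12 - 1, 2 * a13 - 1, 2 * a23 - 1
    if min(c12, c13, c23) <= 0:
        raise ValueError("agreement rates must exceed 0.5 under these assumptions")
    c1 = math.sqrt(c12 * c13 / c23)
    c2 = math.sqrt(c12 * c23 / c13)
    c3 = math.sqrt(c13 * c23 / c12)
    # accuracy = 1 - e_i = (1 + c_i) / 2
    return tuple((1 + c) / 2 for c in (c1, c2, c3))


def mutual_exclusion_violation_rate(class_preds):
    """Fraction of instances predicted positive for more than one class.

    class_preds holds one 0/1 prediction list per mutually exclusive class.
    Every violating instance implies at least one classifier is wrong.
    """
    rows = list(zip(*class_preds))
    return sum(1 for row in rows if sum(row) > 1) / len(rows)


if __name__ == "__main__":
    # Agreement rates consistent with accuracies of 0.9, 0.8 and 0.7.
    print(accuracies_from_agreements(0.74, 0.66, 0.62))
    # Three mutually exclusive classes, four instances; the first instance
    # is labeled positive by two classifiers.
    print(mutual_exclusion_violation_rate([[1, 0, 1, 0],
                                           [1, 0, 0, 0],
                                           [0, 1, 0, 0]]))
```

The independence assumption is restrictive and rarely holds exactly in practice, so this should be read only as a way to see why agreement and constraint violations carry information about accuracy without any labels.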