ML LG AP | Feb 21, 2020

Adaptive Covariate Acquisition for Minimizing Total Cost of Classification

arXiv:2002.09162v12.71 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses cost-sensitive classification in domains like medicine where feature acquisition is expensive, offering an incremental improvement over existing methods.

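The paper tackles the problem of minimizing the total cost of classification, meaning covariate acquisition costs plus misclassification costs. It formalizes the goal with the conditional Bayes risk and proposes a computationally efficient solution under two assumptions: a generalized additive model as the classifier, and candidate covariate sets restricted to a sequence of sets of increasing size. On several medical datasets, the method achieves the lowest total cost in most situations compared to previous methods. When the user specifies a target recall instead of all misclassification costs, it meets that recall while reducing the false discovery rate and acquisition costs more than previous methods.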

In some applications, acquiring covariates comes at a non-negligible cost. For example, in the medical domain, measuring glucose tolerance in order to classify whether a patient has diabetes can be expensive. Assuming that the user can specify the cost of each covariate and the cost of misclassification, our goal is to minimize the (expected) total cost of classification, i.e., the cost of misclassification plus the cost of the acquired covariates. We formalize this optimization goal using the (conditional) Bayes risk and describe the optimal solution using a recursive procedure. Since the procedure is computationally infeasible, we introduce two assumptions: (1) the optimal classifier can be represented by a generalized additive model, and (2) the optimal sets of covariates are limited to a sequence of sets of increasing size. We show that under these two assumptions, a computationally efficient solution exists. Furthermore, on several medical datasets, we show that the proposed method achieves the lowest total costs in most situations when compared to various previous methods. Finally, we weaken the requirement that the user specify all misclassification costs by allowing the user to instead specify the minimally acceptable recall (target recall). Our experiments confirm that the proposed method achieves the target recall while minimizing the false discovery rate and the covariate acquisition costs better than previous methods.
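To make the cost structure in the abstract concrete, here is a minimal Python sketch. It is not the paper's procedure. It assumes binary labels, that the posterior probability of the positive class after each prefix of a fixed covariate ordering is already available, and hypothetical names throughout (`bayes_decision`, `total_cost`, `acquire_sequentially`, `cost_fn`, `cost_fp`). The stopping rule is a simple sufficient condition: if the expected misclassification cost of deciding now is already no larger than the price of the next covariate, further acquisition along the sequence cannot lower the expected total cost. When that condition does not hold, the sketch keeps acquiring, which may overspend; the paper's recursive solution handles that case properly.

```python
from typing import Sequence, Tuple


def bayes_decision(p_pos: float, cost_fn: float, cost_fp: float) -> Tuple[int, float]:
    """Pick the label with the smaller expected misclassification cost.

    p_pos is the posterior probability of the positive class given the
    covariates acquired so far. Returns (label, expected cost).
    """
    risk_if_predict_neg = cost_fn * p_pos          # wrong when truly positive
    risk_if_predict_pos = cost_fp * (1.0 - p_pos)  # wrong when truly negative
    if risk_if_predict_pos <= risk_if_predict_neg:
        return 1, risk_if_predict_pos
    return 0, risk_if_predict_neg


def total_cost(acquired_costs: Sequence[float], predicted: int, truth: int,
               cost_fn: float, cost_fp: float) -> float:
    """Realized total cost: acquisition cost plus misclassification cost."""
    acquisition = sum(acquired_costs)
    if predicted == truth:
        return acquisition
    return acquisition + (cost_fn if truth == 1 else cost_fp)


def acquire_sequentially(posteriors: Sequence[float],
                         covariate_costs: Sequence[float],
                         cost_fn: float, cost_fp: float) -> Tuple[int, int, float]:
    """Walk through nested covariate sets of increasing size.

    posteriors[k] is P(y = 1 | first k covariates), for k = 0..d, so it has
    one more entry than covariate_costs. Returns (label, number of covariates
    acquired, acquisition cost spent plus expected misclassification cost).
    """
    spent = 0.0
    for k in range(len(covariate_costs) + 1):
        label, risk = bayes_decision(posteriors[k], cost_fn, cost_fp)
        no_more_covariates = k == len(covariate_costs)
        # Sufficient stopping condition (an assumption of this sketch).
        if no_more_covariates or risk <= covariate_costs[k]:
            return label, k, spent + risk
        spent += covariate_costs[k]
    raise ValueError("posteriors must have len(covariate_costs) + 1 entries")


# Illustrative numbers only, not taken from the paper.
result = acquire_sequentially(
    posteriors=[0.25, 0.75, 1.0],
    covariate_costs=[1.0, 5.0],
    cost_fn=20.0,
    cost_fp=10.0,
)
print(result)
```

With these made-up numbers, the sketch buys the first covariate and then stops, because the remaining expected misclassification cost is lower than the price of the second covariate.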
