LG, ML | Aug 14, 2020

Binarised Regression with Instance-Varying Costs: Evaluation using Impact Curves

arXiv:2008.07349v1
Originality: Synthesis-oriented
AI Analysis

This work addresses evaluation challenges in domains such as mining and healthcare, where misclassification costs vary per instance. It is incremental, however, because it builds on existing binarised regression methods.

The paper tackles the problem of evaluating binarised regression models with instance-varying costs, proposing impact curves to optimize binary decisions across utilities and quantitatively compare models.

Many evaluation methods exist, each for a particular prediction task, and a number of prediction tasks are commonly performed, including classification and regression. In binarised regression, binary decisions are generated from a learned regression model (or real-valued dependent variable), which is useful when the division between instances that should be predicted positive or negative depends on the utility. For example, in mining, the boundary between a valuable rock and a waste rock depends on the market price of various metals, which varies with time. This paper proposes impact curves to evaluate binarised regression with instance-varying costs, where misclassifying some instances as positive (or negative) is much worse than misclassifying others; e.g., it is much worse to throw away a high-grade gold rock than a medium-grade copper-ore rock, even if the mine wishes to keep both because both are profitable. We show how to construct an impact curve for a variety of domains, including examples from healthcare, mining, and entertainment. Impact curves optimize binary decisions across all utilities of the chosen utility function, identify the conditions where one model may be favoured over another, and quantitatively assess improvement between competing models.
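
To make the setting concrete, the following is a minimal sketch of binarised regression with instance-varying costs. It is not the paper's impact curve construction. It only illustrates the underlying idea under stated assumptions: a regression output is thresholded at a utility-dependent boundary, and a misclassified instance is assumed to cost the absolute distance between its true value and that boundary. The cost function, the function names, and the ore-grade numbers are all hypothetical.

```python
import numpy as np


def binarise(values, threshold):
    """Return True where a real-valued score is at or above the threshold."""
    return np.asarray(values, dtype=float) >= threshold


def total_misclassification_cost(y_true, y_pred, threshold):
    """Sum instance-varying costs of wrong binary decisions at one threshold.

    Assumption (not taken from the paper): a misclassified instance costs
    |y_true - threshold|, so errors far from the boundary are more expensive.
    """
    y_true = np.asarray(y_true, dtype=float)
    wrong = binarise(y_true, threshold) != binarise(y_pred, threshold)
    return float(np.sum(np.abs(y_true - threshold)[wrong]))


def cost_over_thresholds(y_true, y_pred, thresholds):
    """Evaluate the total cost at each candidate decision boundary."""
    return np.array(
        [total_misclassification_cost(y_true, y_pred, t) for t in thresholds]
    )


# Hypothetical true ore grades and the predictions of two regression models.
y_true = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
model_a = np.array([0.6, 1.4, 1.8, 3.5, 7.0])
model_b = np.array([0.4, 0.9, 2.6, 5.0, 3.0])

# Candidate boundaries, e.g. the grade above which a rock is worth keeping
# at different market prices.
thresholds = np.array([1.2, 3.0, 6.0])

cost_a = cost_over_thresholds(y_true, model_a, thresholds)
cost_b = cost_over_thresholds(y_true, model_b, thresholds)

for t, a, b in zip(thresholds, cost_a, cost_b):
    print(f"threshold={t:.1f}  cost A={a:.2f}  cost B={b:.2f}")
```

Comparing the two cost arrays threshold by threshold shows the kind of question the abstract raises: which model is preferable can change as the decision boundary moves with the utility.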
