LG
Jun 24, 2016

Satisfying Real-world Goals with Dataset Constraints

arXiv:1606.07558v2
216 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for machine learning models to meet practical constraints beyond standard accuracy, which matters for real-world concerns such as fairness and deployment.

The paper tackles the problem of satisfying multiple real-world goals (such as fairness requirements, recall targets, or deployment stability) that are defined on different datasets. It proposes training with dataset constraints, using ramp penalties and an efficient algorithm to approximately optimize the resulting non-convex problem. Experiments on benchmark and industry datasets demonstrate the approach's effectiveness.

The goal of minimizing misclassification error on a training set is often just one of several real-world goals that might be defined on different datasets. For example, one may require a classifier to also make positive predictions at some specified rate for some subpopulation (fairness), or to achieve a specified empirical recall. Other real-world goals include reducing churn with respect to a previously deployed model, or stabilizing online training. In this paper we propose handling multiple goals on multiple datasets by training with dataset constraints, using the ramp penalty to accurately quantify costs, and present an efficient algorithm to approximately optimize the resulting non-convex constrained optimization problem. Experiments on both benchmark and real-world industry datasets demonstrate the effectiveness of our approach.
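
To make the idea of a dataset constraint concrete, here is a minimal Python sketch of a ramp penalty used as a bounded stand-in for the 0/1 indicator when measuring a positive prediction rate on a subpopulation. The specific ramp form `max(0, min(1, 0.5 + z))`, the function names, and the example scores are assumptions chosen for illustration; this is not the paper's optimization algorithm, only the quantity such a constraint would measure.

```python
def ramp(z):
    """Assumed ramp form: piecewise-linear, clipped to [0, 1], equal to 0.5 at z = 0.

    Unlike the hinge loss, it is bounded, so it tracks the 0/1 indicator more closely.
    """
    return max(0.0, min(1.0, 0.5 + z))


def indicator(z):
    """Exact 0/1 count of a positive prediction (score above zero)."""
    return 1.0 if z > 0 else 0.0


def positive_rate(scores, penalty):
    """Average of `penalty` over a dataset's scores: the (approximate) positive rate."""
    return sum(penalty(s) for s in scores) / len(scores)


def constraint_violation(scores, min_rate, penalty=ramp):
    """How far the constraint 'positive rate >= min_rate' is from being met (0 if met)."""
    return max(0.0, min_rate - positive_rate(scores, penalty))


# Hypothetical classifier scores on a subpopulation (illustrative values only).
subgroup_scores = [-2.0, -0.25, 0.25, 1.5]

exact_rate = positive_rate(subgroup_scores, indicator)
ramp_rate = positive_rate(subgroup_scores, ramp)
violation = constraint_violation(subgroup_scores, min_rate=0.75)
```

In a training loop, a term like `violation` would be computed on each constrained dataset and driven toward zero alongside the usual training error; how that is done efficiently is the subject of the paper itself.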

