ML · LG · Dec 8, 2019

Contrast Trees and Distribution Boosting

arXiv:1912.03785v1 · 19 citations
Originality: Highly original
AI Analysis

The paper addresses the need to check the veracity of ML outputs used in decision-making, offering a novel approach to accuracy assessment where standard (cross-)validation does not apply.

The paper tackles the problem of assessing the accuracy of machine learning estimates when standard validation methods are not applicable. It introduces contrast trees to detect inaccuracies and boosted contrast trees to improve performance where inaccuracies are found. A special case, distribution boosting, estimates the full probability distribution of an outcome without distributional assumptions.

Often machine learning methods are applied and results reported in cases where there is little to no information concerning the accuracy of the output. Simply because a computer program returns a result does not ensure its validity. If decisions are to be made based on such results, it is important to have some notion of their veracity. Contrast trees represent a new approach for assessing the accuracy of many types of machine learning estimates that are not amenable to standard (cross) validation methods. In situations where inaccuracies are detected, boosted contrast trees can often improve performance. A special case, distribution boosting, provides an assumption-free method for estimating the full probability distribution of an outcome variable given any set of joint input predictor variable values.
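To make the idea of "detecting inaccuracies" concrete, here is a minimal sketch of a single contrast-style split. It is a simplified illustration, not the paper's algorithm: the function name `best_contrast_split`, the mean-difference discrepancy, and the `min_leaf` parameter are all assumptions chosen for the example, and the paper's actual split criterion and discrepancy measures may differ. The sketch searches one predictor for the threshold at which one of the two resulting regions shows the largest gap between observed outcomes `y` and a model's estimates `z`.

```python
import numpy as np

def best_contrast_split(x, y, z, min_leaf=20):
    """Search one feature x for the threshold whose left or right region
    shows the largest absolute mean difference between y and z.
    Simplified, hypothetical criterion (not taken from the paper)."""
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    order = np.argsort(x)
    x, y, z = x[order], y[order], z[order]
    n = len(x)
    csum = np.cumsum(y - z)          # running sum of (outcome - estimate)
    total = csum[-1]
    best_threshold, best_score = None, 0.0
    # i is the number of points that fall in the left region
    for i in range(min_leaf, n - min_leaf + 1):
        if x[i - 1] == x[i]:
            continue                 # cannot split between tied values
        d_left = abs(csum[i - 1] / i)
        d_right = abs((total - csum[i - 1]) / (n - i))
        score = max(d_left, d_right)
        if score > best_score:
            best_threshold = 0.5 * (x[i - 1] + x[i])
            best_score = score
    return best_threshold, best_score

# Synthetic data (an assumption for illustration): the estimate z is
# accurate for x <= 0.7 and biased upward by 1.0 for x > 0.7.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(size=n)
y = 0.1 * rng.normal(size=n)
z = np.where(x > 0.7, 1.0, 0.0)

threshold, score = best_contrast_split(x, y, z)
print(f"split at x = {threshold:.3f}, discrepancy = {score:.3f}")
```

On this synthetic data the search should isolate a region above the point where the estimates become biased. A full contrast tree would apply such a search recursively across all predictors and would typically penalize very small regions, which this sketch only guards against through `min_leaf`.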
