LG | ML | Nov 20, 2018

Contingency Training

arXiv:1811.08214v1 | 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses the irrelevant variables that remain in high-dimensional data after feature selection, a practical concern for machine learning practitioners, and offers an incremental improvement in classifier robustness.

The paper tackles the problem of classifiers being affected by irrelevant variables that survive feature selection. It introduces Contingency Training, a classifier-independent method that subsamples and removes information from each sample, then uses the resulting constraints to assign feature importance weights, improving both accuracy and robustness. In the reported experiments it outperforms unmodified training on datasets with irrelevant variables and slightly outperforms it on datasets with few or none.

When applied to high-dimensional datasets, feature selection algorithms may still leave dozens of irrelevant variables in the dataset. Therefore, even after feature selection has been applied, classifiers must be prepared for the presence of irrelevant variables. This paper investigates a new training method called Contingency Training, which increases accuracy as well as robustness against irrelevant attributes. Contingency Training is classifier independent. By subsampling and removing information from each sample, it creates a set of constraints. These constraints help the method automatically find proper importance weights for the dataset's features. Experiments are conducted with Contingency Training applied to neural networks on traditional datasets as well as on datasets with additional irrelevant variables. In all of the tests, Contingency Training surpassed unmodified training on datasets with irrelevant variables, and it slightly outperformed unmodified training even when only a few or no irrelevant variables were present.
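The abstract does not say how information is removed from a sample or how the constraints are built, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes that "removing information" means blanking a random subset of features in copies of each sample, and that the "datasets with additional irrelevant variables" are built by appending noise columns. The function names `add_irrelevant_features` and `masked_copies`, and the parameters `n_copies`, `n_drop`, and `fill`, are hypothetical.

```python
import random


def add_irrelevant_features(rows, n_irrelevant, rng):
    """Append n_irrelevant Gaussian-noise columns that carry no label information.

    Assumption: this mimics a benchmark with extra irrelevant variables.
    """
    return [row + [rng.gauss(0.0, 1.0) for _ in range(n_irrelevant)] for row in rows]


def masked_copies(rows, labels, n_copies, n_drop, rng, fill=0.0):
    """Return n_copies versions of every sample, each with n_drop random features blanked.

    Assumption: every copy keeps the label of its source sample, so a classifier
    trained on the copies is pushed to give the same answer with and without the
    blanked features. This stands in for the "constraints" the abstract mentions.
    """
    out_rows, out_labels = [], []
    for row, label in zip(rows, labels):
        for _ in range(n_copies):
            # Pick which feature positions to blank in this copy.
            dropped = set(rng.sample(range(len(row)), n_drop))
            out_rows.append([fill if j in dropped else v for j, v in enumerate(row)])
            out_labels.append(label)
    return out_rows, out_labels


# Tiny usage example with made-up data.
rng = random.Random(0)
X = [[1.0, 2.0], [3.0, 4.0]]
y = [0, 1]
X_noisy = add_irrelevant_features(X, 3, rng)  # 2 real + 3 noise features
X_aug, y_aug = masked_copies(X_noisy, y, n_copies=4, n_drop=2, rng=rng)
```

The augmented pair `X_aug`, `y_aug` can be passed to any classifier's usual training routine, which is consistent with the classifier-independent framing above. How the paper turns such constraints into feature importance weights is not covered by this sketch.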

