LG · CV · Apr 9, 2022

The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization

CMU
arXiv:2204.04384v1 · 31 citations · h-index: 46
Originality: Incremental advance
AI Analysis

This paper addresses robustness in machine learning for settings that require generalization across distributions, but it appears incremental because it builds on existing "hard-to-learn" concepts.

The paper aims to improve out-of-domain generalization by merging the sample and feature dimensions of worst-case training into a single heuristic method, W2D, which the authors report performs well on standard benchmarks.

Training with an emphasis on "hard-to-learn" components of the data has been shown to be an effective way to improve the generalization of machine learning models, especially in settings where robustness (e.g., generalization across distributions) is valued. The existing literature on this "hard-to-learn" concept has mainly developed along either the dimension of the samples or the dimension of the features. In this paper, we introduce a simple view that merges these two dimensions, leading to a new, simple yet effective heuristic for training machine learning models by emphasizing the worst cases along both the sample and the feature dimensions. We name our method W2D, following the concept of "Worst-case along Two Dimensions". We validate the idea and demonstrate its empirical strength on standard benchmarks.
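As a rough illustration of what "worst-case along two dimensions" could look like in a single training step, the sketch below picks the highest-loss samples in a batch and, for each sample, withholds the features the model currently relies on most. This is not the authors' implementation. The function names, the `sample_frac` and `feature_frac` parameters, and the reading of "worst-case on the feature dimension" as masking the most salient features are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def worst_case_sample_mask(losses, sample_frac=0.5):
    """Boolean mask marking the highest-loss samples in a batch (hypothetical helper)."""
    k = max(1, int(round(sample_frac * len(losses))))
    hardest = np.argsort(losses)[-k:]          # indices of the k largest losses
    mask = np.zeros(len(losses), dtype=bool)
    mask[hardest] = True
    return mask

def worst_case_feature_mask(saliency, feature_frac=0.25):
    """Per-sample keep-mask that drops the most salient features (assumed reading)."""
    n, d = saliency.shape
    k = max(1, int(round(feature_frac * d)))
    top = np.argsort(saliency, axis=1)[:, -k:]  # k most salient features per row
    keep = np.ones((n, d), dtype=bool)
    np.put_along_axis(keep, top, False, axis=1)
    return keep

def two_dimension_batch(features, losses, saliency,
                        sample_frac=0.5, feature_frac=0.25):
    """Combine both selections: masked features, restricted to the hard samples."""
    keep = worst_case_feature_mask(saliency, feature_frac)
    hard = worst_case_sample_mask(losses, sample_frac)
    return (features * keep)[hard], hard

# Toy batch: 4 samples with 4 features each.
features = np.ones((4, 4))
losses = np.array([0.1, 2.0, 0.5, 1.5])
saliency = np.array([[0.1, 0.9, 0.3, 0.2],
                     [0.8, 0.1, 0.2, 0.3],
                     [0.2, 0.3, 0.7, 0.1],
                     [0.4, 0.1, 0.2, 0.6]])
sub_batch, hard = two_dimension_batch(features, losses, saliency)
```

In a real training loop, the per-sample losses and a saliency score (for example, a gradient magnitude) would come from the model itself, and the resulting sub-batch would be used for the parameter update.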

Code Implementations: 1 repo
