LGAIJun 4, 2021

SAND-mask: An Enhanced Gradient Masking Strategy for the Discovery of Invariances in Domain Generalization

arXiv:2106.02266v270 citations
Originality Incremental advance
AI Analysis

This addresses the failure of machine learning models to generalize to unseen domains, which is a critical issue for real-world applications, though it appears incremental as it builds on existing gradient-based methods.

The paper tackles the problem of domain generalization by proposing SAND-mask, a gradient masking strategy to discover invariant features across training domains, resulting in significant state-of-the-art accuracy improvements on the Colored MNIST dataset and competitive results on other benchmarks.

A major bottleneck in the real-world applications of machine learning models is their failure in generalizing to unseen domains whose data distribution is not i.i.d to the training domains. This failure often stems from learning non-generalizable features in the training domains that are spuriously correlated with the label of data. To address this shortcoming, there has been a growing surge of interest in learning good explanations that are hard to vary, which is studied under the notion of Out-of-Distribution (OOD) Generalization. The search for good explanations that are \textit{invariant} across different domains can be seen as finding local (global) minimas in the loss landscape that hold true across all of the training domains. In this paper, we propose a masking strategy, which determines a continuous weight based on the agreement of gradients that flow in each edge of network, in order to control the amount of update received by the edge in each step of optimization. Particularly, our proposed technique referred to as "Smoothed-AND (SAND)-masking", not only validates the agreement in the direction of gradients but also promotes the agreement among their magnitudes to further ensure the discovery of invariances across training domains. SAND-mask is validated over the Domainbed benchmark for domain generalization and significantly improves the state-of-the-art accuracy on the Colored MNIST dataset while providing competitive results on other domain generalization datasets.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes