Reappraising Domain Generalization in Neural Networks
This work addresses out-of-distribution generalization for machine learning practitioners, offering incremental insights by questioning existing Domain Generalization (DG) methods and introducing a harder benchmark.
The paper investigates Domain Generalization (DG) in neural networks, showing that state-of-the-art results can be achieved using ideas from IID generalization, and proposes a new ClassWise DG benchmark where a novel iterative domain feature masking method attains top performance.
Neural networks generalize unreasonably well in the IID setting (exhibiting benign overfitting and improving as parameters are added), so out-of-distribution (OOD) generalization, where they consistently fail, offers a useful lens for understanding how they learn. This paper focuses on Domain Generalization (DG), often regarded as the most visible face of OOD generalization. We find that the presence of multiple domains incentivizes domain-agnostic learning and is the primary reason for generalization in Traditional DG (TDG). We show that state-of-the-art results can be obtained by borrowing ideas from IID generalization, and that DG-tailored methods fail to add any performance gain. Furthermore, we explore beyond the TDG formulation and propose a novel ClassWise DG (CWDG) benchmark in which, for each class, we randomly select one of the domains and hold it out for testing. Even though the model is exposed to all domains during training, CWDG is more challenging than TDG evaluation. We propose a novel iterative domain feature masking approach that achieves state-of-the-art results on the CWDG benchmark. Overall, in explaining these observations, our work furthers insight into the learning mechanisms of neural networks.
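The CWDG split described above (one randomly chosen domain held out per class) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `classwise_dg_split`, the `(item, class_label, domain_label)` tuple layout, the `seed` parameter, and the toy class and domain names are all assumptions made for illustration, and the paper's exact protocol may differ in details not stated here.

```python
import random
from collections import defaultdict


def classwise_dg_split(samples, seed=0):
    """Hypothetical CWDG-style split.

    samples: list of (item, class_label, domain_label) tuples (assumed layout).
    Returns (train, test, held_out), where held_out maps class -> test domain.
    """
    rng = random.Random(seed)

    # Collect the set of domains in which each class appears.
    domains_by_class = defaultdict(set)
    for _, cls, dom in samples:
        domains_by_class[cls].add(dom)

    # For each class, randomly pick one of its domains to hold out for testing.
    # Sorting keeps the choice reproducible for a given seed.
    held_out = {
        cls: rng.choice(sorted(domains_by_class[cls]))
        for cls in sorted(domains_by_class)
    }

    # A sample goes to test only if its domain is the one held out for its class.
    train = [s for s in samples if s[2] != held_out[s[1]]]
    test = [s for s in samples if s[2] == held_out[s[1]]]
    return train, test, held_out


# Toy data (hypothetical): 3 classes x 4 domains x 2 items each.
classes = ["dog", "cat", "horse"]
domains = ["photo", "sketch", "cartoon", "art"]
samples = [(f"{c}_{d}_{i}", c, d) for c in classes for d in domains for i in range(2)]

train, test, held_out = classwise_dg_split(samples, seed=0)
print("held-out domain per class:", held_out)
print("train size:", len(train), "test size:", len(test))
```

Because the held-out domain is chosen independently per class, a domain withheld for one class typically still appears in training through other classes, which is consistent with the observation that the model sees all domains during training while each test (class, domain) pair remains unseen.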