LG AI · Aug 7, 2022

Learning Modular Structures That Generalize Out-of-Distribution

Arjun Ashok, Chaitanya Devaguptapu, Vineeth Balasubramanian

arXiv:2208.03753v13.31 citations · h-index: 37

Originality: Synthesis-oriented

AI Analysis

The paper addresses a key challenge for real-world ML systems, but it appears incremental because it builds on existing regularization and masking techniques.

The paper tackles out-of-distribution generalization by training models to preserve reusable features across domains, resulting in a modular sub-network that improves O.O.D. performance on two benchmark datasets.

Out-of-distribution (O.O.D.) generalization remains a key challenge for real-world machine learning systems. We describe a method for O.O.D. generalization that, through training, encourages models to preserve only those features in the network that are well reused across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network to extract a modular sub-network that achieves better O.O.D. performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
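
The abstract does not say how the probabilistic differentiable binary mask is parameterized, so the following is only a minimal sketch of one common way to build such a mask: a binary-concrete (Gumbel-sigmoid) relaxation with one learnable logit per neuron. The function names, the temperature, and the 0.5 threshold are illustrative assumptions, not the authors' implementation, and the two neuron-level regularizers are omitted because this summary does not define them.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_relaxed_mask(logits, temperature=0.5, rng=random):
    """Draw one relaxed mask value in [0, 1] per neuron.

    Assumption: a binary-concrete relaxation. Logistic noise is added to each
    logit and the result is squashed by a tempered sigmoid, so the sample is
    a smooth function of the logit and could be trained by gradient descent.
    """
    mask = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 to keep the logs finite.
        u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)
        noise = math.log(u) - math.log(1.0 - u)
        mask.append(sigmoid((logit + noise) / temperature))
    return mask


def extract_subnetwork(logits, threshold=0.5):
    """Hard mask: keep a neuron when its keep-probability exceeds the threshold."""
    return [1 if sigmoid(logit) > threshold else 0 for logit in logits]


def apply_mask(activations, mask):
    """Element-wise product of neuron activations and mask values."""
    return [a * m for a, m in zip(activations, mask)]


# Hypothetical logits for three neurons, purely for illustration.
logits = [2.0, -3.0, 0.1]
soft_mask = sample_relaxed_mask(logits, rng=random.Random(0))
hard_mask = extract_subnetwork(logits)
pruned = apply_mask([1.5, 2.0, -0.5], hard_mask)
```

In a sketch like this, the relaxed samples would be used during training and the thresholded mask would define the extracted sub-network afterward; whether the paper follows the same split is not stated here.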
