OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses
This work addresses generalization failures in deep learning and is relevant to both researchers and practitioners, though the contribution may appear incremental because it builds on prior debiasing methods.
The paper tackles dataset bias and spurious correlations, which impair generalization in deep neural networks. It proposes OccamNets, which modify the network architecture to favor simpler hypotheses and which outperform or rival state-of-the-art methods.
Dataset bias and spurious correlations can significantly impair generalization in deep neural networks. Many prior efforts have addressed this problem using either alternative loss functions or sampling strategies that focus on rare patterns. We propose a new direction: modifying the network architecture to impose inductive biases that make the network robust to dataset bias. Specifically, we propose OccamNets, which are biased to favor simpler solutions by design. OccamNets have two inductive biases. First, they are biased to use as little network depth as needed for an individual example. Second, they are biased toward using fewer image locations for prediction. While OccamNets are biased toward simpler hypotheses, they can learn more complex hypotheses if necessary. In experiments, OccamNets outperform or rival state-of-the-art methods run on architectures that do not incorporate these inductive biases. Furthermore, we demonstrate that when state-of-the-art debiasing methods are combined with OccamNets, results improve further.
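To make the two inductive biases concrete, the following is a minimal sketch and not the authors' implementation. It assumes a network with several exits of increasing depth and illustrates (1) a confidence-thresholded early exit that stops at the shallowest exit that is sufficiently sure, and (2) a simple measure of how many image locations a spatial activation map relies on. The names `earliest_confident_exit` and `active_location_fraction`, along with the thresholds `threshold` and `rel_threshold`, are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a flat sequence of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def earliest_confident_exit(
    exit_logits: Sequence[Sequence[float]], threshold: float = 0.9
) -> Tuple[int, int]:
    """Return (exit_index, predicted_class) for the shallowest exit whose
    top class probability reaches `threshold` (an assumed hyperparameter).
    Falls back to the deepest exit when no exit is confident enough."""
    chosen = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        if max(softmax(logits)) >= threshold:
            chosen = i
            break
    probs = softmax(exit_logits[chosen])
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return chosen, predicted


def active_location_fraction(
    activation_map: Sequence[Sequence[float]], rel_threshold: float = 0.5
) -> float:
    """Fraction of spatial cells whose activation is at least
    `rel_threshold` times the map's peak value. Lower values mean the
    prediction leans on fewer image locations."""
    cells = [v for row in activation_map for v in row]
    peak = max(cells)
    if peak <= 0:
        return 0.0
    cutoff = rel_threshold * peak
    return sum(1 for v in cells if v >= cutoff) / len(cells)


# Three exits of increasing depth, two classes each (illustrative values).
logits_per_exit = [[0.1, 0.2], [5.0, 0.0], [0.0, 9.0]]
exit_index, predicted_class = earliest_confident_exit(logits_per_exit)

# A 2x2 activation map in which two cells dominate (illustrative values).
fraction = active_location_fraction([[0.0, 0.1], [0.9, 1.0]])
```

In this sketch the first exit is not confident enough, so the example is resolved at the second exit, and deeper layers are consulted only for inputs that shallower exits cannot settle. The location measure is one possible proxy for spatial sparsity; a training-time version would need a differentiable penalty instead of a hard threshold.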