Counterfactuals uncover the modular structure of deep generative models
This work addresses the problem of interpretable and controllable transformations in generative models. It is aimed at researchers and practitioners in AI, and its approach builds incrementally on prior disentanglement methods.
The paper tackles the challenge of manipulating latent representations in deep generative models to obtain controllable data transformations. It proposes a non-statistical framework that uses counterfactual manipulations to uncover modular, disentangled groups of internal variables. Experiments on complex image datasets show that these modules enable targeted interventions, with applications such as computationally efficient style transfer and automated robustness assessment.
Deep generative models can emulate the perceptual properties of complex image datasets, providing a latent representation of the data. However, manipulating such a representation to perform meaningful and controllable transformations in the data space remains challenging without some form of supervision. While previous work has focused on exploiting statistical independence to disentangle latent factors, we argue that such a requirement is too restrictive and propose instead a non-statistical framework that relies on counterfactual manipulations to uncover a modular structure of the network composed of disentangled groups of internal variables. Experiments with a variety of generative models trained on complex image datasets show that the obtained modules can be used to design targeted interventions. This opens the way to applications such as computationally efficient style transfer and the automated assessment of robustness to contextual changes in pattern recognition systems.
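To make the idea of a counterfactual manipulation on internal variables concrete, here is a minimal sketch on a toy two-layer generator. It is an illustration under stated assumptions, not the paper's implementation: the network, its random weights, and the names `hidden`, `decode`, `hybridize`, and `influence` are all hypothetical. The sketch replaces a chosen group of hidden units in one sample with the values they take for a second sample, then measures how much each output coordinate changes on average.

```python
import random

# Hypothetical toy generator (an assumption, not the paper's model):
# latent z -> hidden units h = relu(W1 z) -> output x = W2 h.
rng = random.Random(0)
LATENT_DIM, HIDDEN_DIM, OUTPUT_DIM = 4, 8, 16
W1 = [[rng.gauss(0, 1) for _ in range(LATENT_DIM)] for _ in range(HIDDEN_DIM)]
W2 = [[rng.gauss(0, 1) for _ in range(HIDDEN_DIM)] for _ in range(OUTPUT_DIM)]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def hidden(z):
    """Internal variables of the generator for latent input z."""
    return [max(a, 0.0) for a in matvec(W1, z)]


def decode(h):
    """Map internal variables to the output space."""
    return matvec(W2, h)


def hybridize(z_a, z_b, group):
    """Counterfactual output: sample A, with the hidden units listed in
    `group` set to the values they take for sample B."""
    h_a, h_b = hidden(z_a), hidden(z_b)
    h = [h_b[i] if i in group else h_a[i] for i in range(HIDDEN_DIM)]
    return decode(h)


def influence(group, n_pairs=200):
    """Mean absolute change of each output coordinate when `group` is
    intervened on, averaged over random pairs of latent samples."""
    total = [0.0] * OUTPUT_DIM
    for _ in range(n_pairs):
        z_a = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]
        z_b = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]
        original = decode(hidden(z_a))
        hybrid = hybridize(z_a, z_b, group)
        for k in range(OUTPUT_DIM):
            total[k] += abs(hybrid[k] - original[k])
    return [t / n_pairs for t in total]
```

Calling `influence({0, 1})` returns one non-negative score per output coordinate. In an image model, the analogous scores would indicate which parts of the image a group of internal variables affects. Groups whose scores concentrate on distinct parts of the output would be candidates for separate modules.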