Adversarial Disentanglement with Grouped Observations
This work addresses a specific issue in disentanglement from grouped observations on image datasets, namely content information leaking into the style representations, and offers an incremental improvement over existing methods.
The paper tackles the problem of disentangling content and style attributes in grouped observations using Variational Autoencoders. It introduces an adversarial method that minimizes the mutual information between the style representations and the content features, resulting in efficient separation and generalization to unseen data.
We consider the disentanglement of the representations of the relevant attributes of the data (content) from all other factors of variation (style) using Variational Autoencoders. Some recent works have addressed this problem by utilizing grouped observations, where the content attributes are assumed to be common within each group, while no supervised information is available on the style factors. In many cases, however, these methods fail to prevent the models from using the style variables to encode content-related features as well. This work supplements these algorithms with a method that eliminates the content information in the style representations. For that purpose, the training objective is augmented to minimize an appropriately defined mutual information term in an adversarial way. Experimental results and comparisons on image datasets show that the resulting method can efficiently separate the content- and style-related attributes and generalizes to unseen data.
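
To make the adversarial idea concrete, the following is a minimal sketch of one common way to penalize mutual information with a discriminator, the density-ratio trick. It is not taken from the paper: the abstract only states that an "appropriately defined" mutual information term is minimized adversarially, so the pairing scheme, the function names (make_pairs, discriminator_loss, mi_penalty, encoder_objective) and the weight lam are all illustrative assumptions. A discriminator is trained to tell true (style, content) pairs from re-paired ones. Its mean logit on the true pairs then serves as a rough estimate of the mutual information, which the encoder is trained to reduce.

```python
import math
import random


def make_pairs(styles, contents, rng):
    """Build discriminator inputs (hypothetical pairing scheme).

    Joint pairs keep each style code with its own content code.
    Marginal pairs re-pair each style code with the content code of a
    randomly permuted sample, approximating the product of marginals.
    """
    perm = list(range(len(contents)))
    rng.shuffle(perm)
    joint = list(zip(styles, contents))
    marginal = [(s, contents[j]) for s, j in zip(styles, perm)]
    return joint, marginal


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(logits_joint, logits_marginal):
    """Binary cross-entropy with joint pairs labelled 1, marginal pairs 0."""
    loss_joint = sum(softplus(-l) for l in logits_joint) / len(logits_joint)
    loss_marg = sum(softplus(l) for l in logits_marginal) / len(logits_marginal)
    return loss_joint + loss_marg


def mi_penalty(logits_joint):
    """Density-ratio estimate of I(style; content): mean logit on joint pairs."""
    return sum(logits_joint) / len(logits_joint)


def encoder_objective(vae_loss, logits_joint, lam):
    """Assumed augmented objective: VAE loss plus a weighted MI penalty."""
    return vae_loss + lam * mi_penalty(logits_joint)
```

In a training loop of this kind, the discriminator would be updated to lower discriminator_loss, and the encoder would be updated to lower encoder_objective, alternating between the two. The logits here are plain floats standing in for the outputs of a learned discriminator network, which the sketch does not implement.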