Revisiting Factorizing Aggregated Posterior in Learning Disentangled Representations
This work addresses a theoretical blind spot in disentangled representation learning; the contribution is incremental but important for improving existing methods.
The paper identifies a disparity between sampled and mean latent representations in disentangled learning, showing that low total correlation in sampled representations does not guarantee it in mean representations, and proposes a method to reduce this disparity, achieving lower total correlation in mean representations.
In learning disentangled representations, one promising method is to factorize the aggregated posterior by penalizing the total correlation of the sampled latent variables. However, this well-motivated strategy has a blind spot: there is a disparity between the sampled latent representation and its corresponding mean representation. In this paper, we provide a theoretical explanation that low total correlation of the sampled representation cannot guarantee low total correlation of the mean representation. Indeed, we prove that for multivariate normal distributions, a mean representation with arbitrarily high total correlation can have a corresponding sampled representation with bounded total correlation. We also propose a method to eliminate this disparity. Experiments show that our model can learn a mean representation with much lower total correlation, hence a factorized mean representation. Moreover, we offer a detailed explanation of a limitation of factorizing the aggregated posterior: factor disintegration. Our work indicates a potential direction for future research on disentangled learning.
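The multivariate normal claim can be illustrated numerically. The sketch below is not the paper's proof or method. It assumes a two-dimensional, zero-mean Gaussian mean representation with unit variances and correlation rho, and an encoder noise variance that is constant across inputs, so that the sampled representation is also Gaussian with covariance equal to the mean covariance plus noise_var times the identity. The function names and the parameters rho and noise_var are hypothetical. It uses the closed form for the total correlation of a Gaussian, TC = 0.5 * (sum_i log Sigma_ii - log det Sigma).

```python
import numpy as np


def gaussian_total_correlation(cov):
    """Total correlation (in nats) of a zero-mean Gaussian with covariance `cov`."""
    cov = np.asarray(cov, dtype=float)
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    # Sum of marginal entropies minus joint entropy; the constants cancel.
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)


def tc_mean_and_sampled(rho, noise_var=1.0):
    """Return (TC of mean representation, TC of sampled representation).

    Assumption: the encoder variance is the same for every input, so the
    sampled representation is mean + independent Gaussian noise.
    """
    cov_mean = np.array([[1.0, rho], [rho, 1.0]])
    cov_sampled = cov_mean + noise_var * np.eye(2)
    return (
        gaussian_total_correlation(cov_mean),
        gaussian_total_correlation(cov_sampled),
    )


if __name__ == "__main__":
    for rho in (0.9, 0.99, 0.9999):
        tc_mean, tc_sampled = tc_mean_and_sampled(rho)
        print(f"rho={rho}: TC(mean)={tc_mean:.4f}  TC(sampled)={tc_sampled:.4f}")
```

Under these assumptions, the total correlation of the mean representation is -0.5 * ln(1 - rho^2), which grows without bound as rho approaches 1, while that of the sampled representation with noise_var = 1 is -0.5 * ln(1 - rho^2 / 4), which stays below 0.5 * ln(4/3), about 0.144 nats. A penalty on the sampled total correlation would therefore see a nearly factorized code even when the mean code is almost perfectly correlated.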