Identifying through Flows for Recovering Latent Representations
This work addresses the fundamental issue of identifiability in representation learning, which matters to both researchers and practitioners in machine learning. It offers a novel approach to improving latent recovery, though the contribution is incremental in that it builds on prior work such as iVAE.
The paper tackles the problem of recovering the true latent representations in deep generative models. It proposes an identifiable flow-based model (iFlow) that directly maximizes the marginal likelihood, which dispenses with variational approximations and provides theoretical guarantees on identifiability. Simulations on synthetic data validate its correctness and its effectiveness over existing methods.
Identifiability, or recovery of the true latent representations from which the observed data originates, is a de facto fundamental goal of representation learning. Yet most deep generative models do not address the question of identifiability, and thus fail to deliver on the promise of recovering the true latent sources that generate the observations. Recent work proposed identifiable generative modelling using variational autoencoders (iVAE), together with a theory of identifiability. Because the KL divergence between the variational approximate posterior and the true posterior is intractable, however, iVAE has to maximize the evidence lower bound (ELBO) on the marginal likelihood, which leads to suboptimal solutions in both theory and practice. In contrast, we propose an identifiable framework for estimating latent representations using a flow-based model (iFlow). Our approach directly maximizes the marginal likelihood, which dispenses with variational approximations and allows for theoretical guarantees on identifiability. We derive its optimization objective in analytical form, making it possible to train iFlow in an end-to-end manner. Simulations on synthetic data validate the correctness and effectiveness of our proposed method and demonstrate its practical advantages over existing methods.
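
To illustrate why a flow-based model can maximize the marginal likelihood directly, the following minimal sketch evaluates the exact log-likelihood of a toy invertible map through the change-of-variables formula, log p(x) = log p_Z(f(x)) + log |det J_f(x)|. It is not the iFlow architecture or objective: the elementwise affine map, the standard normal base density, and the names `flow_log_likelihood`, `scale`, and `shift` are all assumptions chosen for illustration.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of a factorized standard normal at the point z."""
    return sum(-0.5 * (zi * zi + math.log(2.0 * math.pi)) for zi in z)


def flow_log_likelihood(x, scale, shift):
    """Exact log p(x) for the toy flow z_i = scale_i * x_i + shift_i.

    Hypothetical illustration only: the Jacobian of this map is diagonal,
    so log|det J| is the sum of log|scale_i|. No lower bound is involved.
    """
    z = [a * xi + b for xi, a, b in zip(x, scale, shift)]
    log_abs_det = sum(math.log(abs(a)) for a in scale)
    return standard_normal_logpdf(z) + log_abs_det


# Assumed toy parameters; every scale must be nonzero for invertibility.
scale = [2.0, -0.5]
shift = [0.3, -1.0]
x = [0.7, 1.2]

log_px = flow_log_likelihood(x, scale, shift)
print(log_px)
```

Because the map is invertible and its Jacobian determinant is available in closed form, the quantity computed here is the log-likelihood itself rather than a bound on it, which is the property that separates likelihood-based flow training from ELBO-based training.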