CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
The work addresses a limitation of causal representation learning in real-world applications such as visual perception, where invertibility assumptions are often violated. The contribution appears incremental, however, because it builds on prior methods by relaxing those assumptions.
The paper tackles the problem of identifying time-delayed latent causal processes in sequential data when the generation process is non-invertible. It proposes CaRiNG, a method with identifiability guarantees, and reports experiments in which it reliably identifies the causal process and improves temporal understanding.
Identifying the underlying time-delayed latent causal processes in sequential data is vital for grasping temporal dynamics and for downstream reasoning. While some recent methods can robustly identify these latent causal variables, they rely on the strict assumption that the generation process from latent variables to observed data is invertible. This assumption is often hard to satisfy in real-world applications that involve information loss. For instance, visual perception projects a 3D space onto 2D images, and persistence of vision mixes historical data into current perceptions. To address this challenge, we establish an identifiability theory that allows for the recovery of independent latent components even when they come from a nonlinear and non-invertible mixing. Using this theory as a foundation, we propose a principled approach, CaRiNG, to learn the CAusal RepresentatIon of Non-invertible Generative temporal data with identifiability guarantees. Specifically, we utilize temporal context to recover lost latent information and apply the conditions in our theory to guide the training process. Through experiments on synthetic datasets, we validate that CaRiNG reliably identifies the causal process even when the generation process is non-invertible. Moreover, we demonstrate that our approach considerably improves temporal understanding and reasoning in practical applications.
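The idea that temporal context can recover information lost by a non-invertible generation process can be illustrated with a toy sketch. This is not the CaRiNG model: it assumes a two-dimensional latent process with linear rotational dynamics, an observation that simply drops the second latent (a stand-in for a 3D-to-2D projection), and ordinary least squares in place of a learned encoder. All names and parameter values (`simulate`, `lagged`, `r2_linear`, `theta`, `decay`, `n_lags`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(T=5000, theta=1.0, decay=0.95, noise_std=0.1, seed=0):
    """Toy time-delayed latent process with a non-invertible observation.

    Assumption: linear dynamics z_t = A z_{t-1} + noise, where A is a
    damped rotation. The observation keeps only the first latent, so a
    single frame x_t cannot determine z_t.
    """
    rng = np.random.default_rng(seed)
    A = decay * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
    z = np.zeros((T, 2))
    for t in range(1, T):
        z[t] = A @ z[t - 1] + noise_std * rng.standard_normal(2)
    x = z[:, 0]  # information loss: the second latent is dropped
    return z, x


def lagged(x, n_lags):
    """Rows are [x_t, x_{t-1}, ..., x_{t-n_lags}] for t = n_lags .. T-1."""
    T = len(x)
    return np.column_stack([x[n_lags - k: T - k] for k in range(n_lags + 1)])


def r2_linear(features, target):
    """In-sample R^2 of a least-squares fit with an intercept."""
    X = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return 1.0 - resid.var() / target.var()


z, x = simulate()
n_lags = 3
hidden = z[n_lags:, 1]  # the latent that the observation discards

# Predict the hidden latent from the current frame only ...
r2_single = r2_linear(lagged(x, 0)[n_lags:], hidden)
# ... and from the current frame plus a short history of past frames.
r2_context = r2_linear(lagged(x, n_lags), hidden)

print(f"R^2 from x_t alone:         {r2_single:.3f}")
print(f"R^2 from x_t and {n_lags} past frames: {r2_context:.3f}")
```

In this sketch the current observation alone carries almost no information about the dropped latent, while a short window of past observations recovers most of it, because the latent dynamics couple the hidden coordinate to the observed one over time. The paper's setting is nonlinear and comes with identifiability conditions that this linear toy does not attempt to reproduce.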