LG · ML · Jun 2, 2022

Weakly Supervised Representation Learning with Sparse Perturbations

Kartik Ahuja, Jason Hartford, Yoshua Bengio

arXiv:2206.01101v1 · 26.977 citations · h-index: 212 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses weakly supervised representation learning and reduces reliance on strong distributional assumptions about the latent variables. The method could improve robustness in domains such as reinforcement learning, though the contribution is rated incremental because it builds on prior weak-supervision approaches.

The paper tackles provable identification of latent variables with minimal supervision. It shows that sparse perturbations of the latents, such as actions in a reinforcement learning environment, enable identification under unknown continuous latent distributions. Latents are identified up to blocks when perturbations act on mutually exclusive blocks, and up to permutation and scaling of individual latents when overlapping blocks intersect in a single latent.

The theory of representation learning aims to build methods that provably invert the data generating process with minimal domain knowledge or any source of supervision. Most prior approaches require strong distributional assumptions on the latent variables and weak supervision (auxiliary information such as timestamps) to provide provable identification guarantees. In this work, we show that if one has weak supervision from observations generated by sparse perturbations of the latent variables--e.g. images in a reinforcement learning environment where actions move individual sprites--identification is achievable under unknown continuous latent distributions. We show that if the perturbations are applied only on mutually exclusive blocks of latents, we identify the latents up to those blocks. We also show that if these perturbation blocks overlap, we identify latents up to the smallest blocks shared across perturbations. Consequently, if there are blocks that intersect in one latent variable only, then such latents are identified up to permutation and scaling. We propose a natural estimation procedure based on this theory and illustrate it on low-dimensional synthetic and image-based experiments.
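
To make the "identified up to permutation and scaling" claim concrete, here is a minimal sketch in a deliberately simplified setting. It assumes a linear mixing `x = A @ z`, one perturbation per latent coordinate, and that paired observations before and after each perturbation are available. These are illustrative assumptions, not the paper's setup: the abstract describes a general estimation procedure for unknown data-generating processes, and the names `A`, `magnitudes`, and `W` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# Assumption: a linear mixing x = A @ z stands in for the unknown generator.
A = rng.normal(size=(d, d))  # a Gaussian matrix is invertible almost surely

z = rng.normal(size=d)       # one latent sample
x = A @ z                    # its observation

# One-sparse perturbations: action k shifts only latent k, by an unknown amount.
magnitudes = rng.uniform(0.5, 2.0, size=d)
deltas = np.diag(magnitudes)      # row k is the perturbation for action k
x_tilde = (z + deltas) @ A.T      # row k is the observation of z + deltas[k]

# Observed differences: column k equals magnitudes[k] * A[:, k].
D = (x_tilde - x).T

# Choose a linear encoder that maps each observed difference to a basis vector,
# i.e. one that makes every perturbation one-sparse in the encoded space.
W = np.linalg.inv(D)

M = W @ A        # encoder composed with the mixing
z_hat = W @ x    # estimated latents

print(np.round(M, 6))  # diagonal: each latent recovered up to its own scale
```

In this toy case the encoder composed with the mixing is a diagonal matrix, so each estimated latent is a rescaled copy of one true latent. The scale stays unknown because the perturbation magnitudes are never observed. A permutation would also appear if the action order did not match the latent order.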
