LG, ML | Jul 25, 2021

Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning

arXiv:2107.11717v1
Originality: Incremental advance
AI Analysis

This work addresses a known limitation of variational autoencoders (VAEs) in computer vision tasks, though the contribution appears incremental because it builds on existing VAE frameworks.

The paper addresses the failure of VAEs to learn invariant and equivariant clusters in latent space. It proposes disentangling equivariance feature maps on a Lie group manifold through group-invariant learning, separating the semantic and equivariant variables of the latent representation, and modifying the ELBO with a mixture-model density, such as a Gaussian mixture, for the invariant cluster embeddings. The authors report significant improvements in the learning rate, along with better image recognition and canonical state reconstruction than the best current deep learning models.

Variational Autoencoders (VAEs) have been shown to be remarkably effective in recovering model latent spaces for several computer vision tasks. However, currently trained VAEs, for a number of reasons, seem to fall short in learning invariant and equivariant clusters in latent space. Our work focuses on providing solutions to this problem and presents an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning. While simultaneously implementing a novel separation of the semantic and equivariant variables of the latent space representation, we formulate a modified Evidence Lower BOund (ELBO) that uses a mixture-model density, such as a Gaussian mixture, for invariant cluster embeddings, which allows superior unsupervised variational clustering. Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations, with significant improvements in the learning rate and observably superior image recognition and canonical state reconstruction compared to the currently best deep learning models.
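
The abstract does not spell out the modified ELBO, so the following is only a generic sketch of the ingredient it names: a KL term between a diagonal-Gaussian posterior over the semantic (invariant) latent and a Gaussian mixture prior, estimated from a single sample. The function name mixture_kl_estimate, the parameter names (pi, mu, log_var) and the diagonal-covariance assumption are all hypothetical and are not taken from the paper. How the equivariant variable and the Lie group structure enter the objective is not covered here.

```python
import numpy as np


def log_normal_diag(x, mean, log_var):
    """Log density of a diagonal Gaussian, summed over the last axis."""
    return -0.5 * np.sum(
        np.log(2.0 * np.pi) + log_var + (x - mean) ** 2 / np.exp(log_var),
        axis=-1,
    )


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def mixture_kl_estimate(z, q_mean, q_log_var, pi, mu, log_var):
    """Single-sample estimate of KL(q(z|x) || p(z)), with p(z) a Gaussian mixture.

    Assumed shapes (hypothetical, for illustration only):
      z, q_mean, q_log_var: (B, D)  sample and parameters of the posterior q
      pi: (K,)                      mixture weights, summing to 1
      mu, log_var: (K, D)           per-cluster means and log-variances
    Returns the KL estimate (B,) and the cluster responsibilities (B, K).
    """
    log_q = log_normal_diag(z, q_mean, q_log_var)  # (B,)
    # log pi_k + log N(z | mu_k, diag(exp(log_var_k))) for every cluster k
    log_comp = np.log(pi)[None, :] + log_normal_diag(
        z[:, None, :], mu[None, :, :], log_var[None, :, :]
    )  # (B, K)
    log_p = logsumexp(log_comp, axis=1)  # log of the mixture density, (B,)
    resp = np.exp(log_comp - log_p[:, None])  # posterior cluster probabilities
    return log_q - log_p, resp
```

In a sketch like this, the responsibilities double as soft cluster assignments, which is what would make unsupervised clustering possible directly in latent space. A full ELBO would subtract this KL estimate from a reconstruction log-likelihood term.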

