LG, AI, CV, IV | Mar 6, 2023

Towards Composable Distributions of Latent Space Augmentations

arXiv:2303.03462v1 | h-index: 39
AI Analysis

This work addresses researchers' and practitioners' need for greater control and interpretability in image augmentation. It appears incremental: it builds on existing VAE methods, and its novelty lies in the composability approach.

The paper tackles the problem of combining multiple image augmentations. It proposes a composable latent space augmentation framework based on Variational Autoencoders, in which augmentations can be easily combined and inverted. Initial results on the MNIST dataset compare the approach against a standard VAE and a Conditional VAE.

We propose a composable framework for latent space image augmentation that allows for easy combination of multiple augmentations. Image augmentation has been shown to be an effective technique for improving the performance of a wide variety of image classification and generation tasks. Our framework is based on the Variational Autoencoder architecture and uses a novel approach for augmentation via linear transformation within the latent space itself. We explore losses and augmentation latent geometry to enforce the transformations to be composable and involutory (self-inverse), thus allowing the transformations to be readily combined or inverted. Finally, we show that these properties perform better with certain pairs of augmentations, but that we can transfer the latent space to other sets of augmentations to modify performance, effectively constraining the VAE's bottleneck to preserve the variance of the specific augmentations and image features we care about. We demonstrate the effectiveness of our approach with initial results on the MNIST dataset against both a standard VAE and a Conditional VAE. This latent augmentation method allows for much greater control and geometric interpretability of the latent space, making it a valuable tool for researchers and practitioners in the field.
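
To make the two properties named in the abstract concrete, the following is a minimal sketch of what "composable and self-inverse linear transformations in latent space" can mean. It is not the authors' method: the paper learns its transformations through losses on a VAE, whereas this sketch swaps in fixed Householder reflections as a simple, well-known family of self-inverse matrices. The names `householder`, `A_flip`, and `A_rot`, the latent dimension, and the random vector standing in for an encoder output are all assumptions made for illustration.

```python
import numpy as np


def householder(v):
    """Return the reflection I - 2 v v^T / (v^T v), a matrix that is its own inverse."""
    v = np.asarray(v, dtype=float)
    return np.eye(v.size) - 2.0 * np.outer(v, v) / np.dot(v, v)


latent_dim = 4

# Hypothetical latent-space operators for two augmentations.
# The reflection axes are orthogonal, so the two operators commute.
A_flip = householder([1.0, 0.0, 0.0, 0.0])
A_rot = householder([0.0, 1.0, 0.0, 0.0])

# Stand-in for an encoder output z = encode(x); no VAE is trained here.
rng = np.random.default_rng(0)
z = rng.normal(size=latent_dim)

# Apply one augmentation, then both, purely by matrix multiplication.
z_flip = A_flip @ z
z_both = A_rot @ A_flip @ z

# Self-inverse: applying the same operator twice recovers the original code.
involutory = np.allclose(A_flip @ z_flip, z)

# Composable: for these commuting operators the order does not matter,
# and the composed operator is again its own inverse.
commutes = np.allclose(A_rot @ A_flip, A_flip @ A_rot)
composed_involutory = np.allclose(A_rot @ A_flip @ z_both, z)

print(involutory, commutes, composed_involutory)
```

In a full pipeline the transformed code would be passed to the decoder to produce the augmented image. The orthogonal reflection axes are a design choice of this sketch: two self-inverse matrices that do not commute generally compose into a matrix that is not self-inverse, which may be one reason a framework of this kind has to constrain the latent geometry.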
