Variable Length Embeddings
This work addresses the need for flexible latent representations in deep learning, but the contribution is incremental: it builds on existing autoregressive and VAE methods.
The authors tackle the problem of generating variable-length latent representations with an autoregressive model, achieving reconstruction results comparable to those of a state-of-the-art VAE while using less than a tenth of the parameters.
In this work, we introduce a novel deep learning architecture, Variable Length Embeddings (VLEs), an autoregressive model that can produce a latent representation composed of an arbitrary number of tokens. As a proof of concept, we demonstrate the capabilities of VLEs on reconstruction and image decomposition tasks. We run our experiments on a mix of the iNaturalist and ImageNet datasets and find that VLEs achieve reconstruction results comparable to those of a state-of-the-art VAE while using less than a tenth of the parameters.
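To make the idea of a variable-length latent concrete, the following toy sketch emits tokens one at a time, each conditioned on what the earlier tokens left unexplained, and stops once the residual is small. It is not the VLE architecture: the learned autoregressive network is replaced by a greedy residual coder, and the token format `(position, value)`, the function names `encode` and `decode`, and the parameters `tol` and `max_tokens` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical token format for this sketch: (position, value).
Token = Tuple[int, float]


def encode(signal: List[float], tol: float = 1e-6, max_tokens: int = 64) -> List[Token]:
    """Emit tokens one at a time until the residual is small enough.

    A greedy stand-in for a learned autoregressive encoder: the number of
    tokens depends on the input, not on a fixed latent size.
    """
    residual = list(signal)
    tokens: List[Token] = []
    if not residual:
        return tokens
    while len(tokens) < max_tokens:
        # Each new token depends on what earlier tokens left unexplained.
        idx = max(range(len(residual)), key=lambda i: abs(residual[i]))
        if abs(residual[idx]) <= tol:
            break  # nothing significant left to encode
        tokens.append((idx, residual[idx]))
        residual[idx] = 0.0
    return tokens


def decode(tokens: List[Token], length: int) -> List[float]:
    """Reconstruct the signal by summing the contribution of every token."""
    out = [0.0] * length
    for idx, value in tokens:
        out[idx] += value
    return out
```

In this sketch a simple input such as `[0.0, 3.0, 0.0, 0.0]` needs a single token, while a denser input needs more. Capping `max_tokens` trades reconstruction quality for a shorter representation, which is the kind of flexibility a variable-length latent is meant to offer.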