Adversarial Autoencoders with Constant-Curvature Latent Manifolds
This work addresses the problem of embedding data with non-Euclidean properties, such as hierarchy and circularity, for applications in graph-based learning and chemistry. It presents a unified framework that handles manifolds of different curvatures, although it builds incrementally on existing adversarial autoencoder and manifold learning methods.
The paper tackles the problem of representing data distributions on constant-curvature Riemannian manifolds (CCMs). It introduces the CCM adversarial autoencoder (CCM-AAE), a probabilistic generative model that matches the aggregated posterior with a distribution defined on a CCM and explicitly imposes a geometric constraint on the embeddings. The model is reported to outperform other autoencoders on semi-supervised classification, link prediction, and molecule generation.
Constant-curvature Riemannian manifolds (CCMs) have been shown to be ideal embedding spaces in many application domains, as their non-Euclidean geometry can naturally account for some relevant properties of data, like hierarchy and circularity. In this work, we introduce the CCM adversarial autoencoder (CCM-AAE), a probabilistic generative model trained to represent a data distribution on a CCM. Our method works by matching the aggregated posterior of the CCM-AAE with a probability distribution defined on a CCM, so that the encoder implicitly learns to represent data on the CCM in order to fool the discriminator network. The geometric constraint is also explicitly imposed by jointly training the CCM-AAE to maximise the membership degree of the embeddings to the CCM. While a few works in recent literature make use of either hyperspherical or hyperbolic manifolds for different learning tasks, ours is the first unified framework to seamlessly deal with CCMs of different curvatures. We show the effectiveness of our model on three tasks involving data with non-trivial geometry: semi-supervised classification on MNIST, link prediction on two popular citation datasets, and graph-based molecule generation using the QM9 chemical database. Results show that our method improves upon other autoencoders based on Euclidean and non-Euclidean geometries on all tasks taken into account.
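To make the two geometric ingredients above more concrete, the following NumPy sketch shows one way a prior on a CCM and a membership degree could be written. It is an illustration under stated assumptions, not the implementation used in this work. The assumptions are: a CCM of curvature kappa is modelled as the set of points z in the ambient space with <z, z> = 1/kappa, using the Euclidean scalar product for kappa > 0 and a pseudo-Euclidean one (last coordinate negative) for kappa < 0; the membership degree is taken to be a Gaussian kernel of the deviation from that constraint; the spherical prior is uniform and the hyperbolic prior is a Gaussian pushed through the exponential map at the origin. The function names `ccm_inner`, `membership`, and `sample_ccm_prior`, the width `sigma`, and the choice of priors are all hypothetical, and the flat case kappa = 0 is left out.

```python
import numpy as np


def ccm_inner(x, y, kappa):
    """Ambient scalar product: Euclidean for kappa > 0,
    pseudo-Euclidean (last coordinate negative) for kappa < 0."""
    prod = x * y
    if kappa < 0:
        return prod[..., :-1].sum(axis=-1) - prod[..., -1]
    return prod.sum(axis=-1)


def membership(z, kappa, sigma=1.0):
    """Assumed membership degree in (0, 1]: equals 1 exactly when
    <z, z> = 1 / kappa, and decays as z leaves the manifold."""
    gap = ccm_inner(z, z, kappa) - 1.0 / kappa
    return np.exp(-gap**2 / (2.0 * sigma**2))


def sample_ccm_prior(n, d, kappa, rng):
    """Draw n points on a d-dimensional CCM embedded in R^(d+1).
    Assumes kappa != 0."""
    r = 1.0 / np.sqrt(abs(kappa))  # radius of the manifold
    if kappa > 0:
        # Uniform on the hypersphere: normalise isotropic Gaussian draws.
        x = rng.normal(size=(n, d + 1))
        return r * x / np.linalg.norm(x, axis=-1, keepdims=True)
    # Hyperbolic case: Gaussian tangent vectors at the origin (0, ..., 0, r),
    # mapped onto the hyperboloid with the exponential map.
    v = rng.normal(size=(n, d))
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    space = r * np.sinh(norm / r) * v / norm
    time = r * np.cosh(norm / r)
    return np.concatenate([space, time], axis=-1)


rng = np.random.default_rng(0)
z_sphere = sample_ccm_prior(5, 2, kappa=1.0, rng=rng)
z_hyper = sample_ccm_prior(5, 2, kappa=-1.0, rng=rng)
print(membership(z_sphere, 1.0))   # values close to 1: points lie on the sphere
print(membership(z_hyper, -1.0))   # values close to 1: points lie on the hyperboloid
```

In an adversarial autoencoder along these lines, samples from `sample_ccm_prior` would play the role of the "real" inputs to the discriminator, while the mean of `membership` over a batch of encoder outputs would be a term the encoder is trained to maximise.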