GroupEnc: encoder with group loss for global structure preservation
This work addresses the need for more accurate low-dimensional embeddings for downstream biological tasks such as clustering and trajectory inference. It is an incremental improvement over existing dimensionality reduction methods.
The authors address global structure distortion in variational autoencoder (VAE) embeddings by introducing GroupEnc, an encoder trained with a group loss function that better preserves global structure while keeping the model parametric and the architecture flexible. They report improved structure preservation on single-cell transcriptomic datasets, as measured by RNX curves.
Recent advances in dimensionality reduction have achieved more accurate lower-dimensional embeddings of high-dimensional data. In addition to visualisation purposes, these embeddings can be used for downstream processing, including batch effect normalisation, clustering, community detection or trajectory inference. We use the notion of structure preservation at both local and global levels to create a deep learning model, based on a variational autoencoder (VAE) and the stochastic quartet loss from the SQuadMDS algorithm. Our encoder model, called GroupEnc, uses a 'group loss' function to create embeddings with less global structure distortion than VAEs do, while keeping the model parametric and the architecture flexible. We validate our approach using publicly available biological single-cell transcriptomic datasets, employing RNX curves for evaluation.
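To make the idea of a group loss concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the loss follows the quartet cost of SQuadMDS generalised to groups of arbitrary size: points are randomly partitioned into groups, the pairwise Euclidean distances within each group are normalised by their sum in both the high-dimensional input and the low-dimensional embedding, and the squared differences between the two sets of relative distances are averaged over groups. The names `pairwise_distances`, `group_loss`, `group_size` and `eps` are hypothetical, and the exact formulation used in GroupEnc may differ.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distances between all unordered pairs of rows of x (flat vector)."""
    i, j = np.triu_indices(x.shape[0], k=1)
    return np.linalg.norm(x[i] - x[j], axis=1)


def group_loss(hd, ld, group_size=4, rng=None, eps=1e-12):
    """Assumed group loss: mismatch of relative within-group distances.

    hd: (n, D) high-dimensional points; ld: (n, d) their embeddings.
    group_size=4 corresponds to the quartet case.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = hd.shape[0]
    n_groups = n // group_size
    if n_groups == 0:
        raise ValueError("need at least group_size points")

    # Random partition into disjoint groups; leftover points are dropped.
    groups = rng.permutation(n)[: n_groups * group_size]
    groups = groups.reshape(n_groups, group_size)

    total = 0.0
    for idx in groups:
        d_hd = pairwise_distances(hd[idx])
        d_ld = pairwise_distances(ld[idx])
        # Normalising by the sum makes the cost invariant to global scale.
        d_hd = d_hd / (d_hd.sum() + eps)
        d_ld = d_ld / (d_ld.sum() + eps)
        total += np.sum((d_hd - d_ld) ** 2)
    return total / n_groups
```

Because groups are drawn at random across the whole batch, most of the compared distances are between points that are far apart, which is why a loss of this kind targets global rather than local structure. In a trained model, `ld` would be the encoder output and the same computation would be written in an automatic differentiation framework so that gradients reach the encoder weights.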