Re-envisioning Euclid Galaxy Morphology: Identifying and Interpreting Features with Sparse Autoencoders
For astronomers, this work addresses the challenge of discovering astrophysical phenomena beyond human-defined classifications, though it is incremental in that it applies existing SAE methods to new data.
The paper tackles the problem of identifying interpretable features of galaxy morphology in Euclid Q1 images using Sparse Autoencoders (SAEs). It reports superhuman image reconstruction performance for a publicly released MAE model, and SAE features that align more strongly with Galaxy Zoo labels than PCA components do.
Sparse Autoencoders (SAEs) can efficiently identify candidate monosemantic features from pretrained neural networks for galaxy morphology. We demonstrate this on Euclid Q1 images using both supervised (Zoobot) and new self-supervised (MAE) models. Our publicly released MAE achieves superhuman image reconstruction performance. While a Principal Component Analysis (PCA) on the supervised model primarily identifies features already aligned with the Galaxy Zoo decision tree, SAEs can identify interpretable features outside of this framework. SAE features also show stronger alignment than PCA with Galaxy Zoo labels. Although challenges in interpretability remain, SAEs provide a powerful engine for discovering astrophysical phenomena beyond the confines of human-defined classification.
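To make the SAE idea concrete, here is a minimal sketch of a single forward pass and loss for a generic ReLU sparse autoencoder applied to one embedding vector. It is an illustration under stated assumptions, not the paper's implementation: the summary above does not give the architecture, dictionary size, or sparsity penalty, so the L1 penalty, the `l1_coeff` value, the decoder-bias centering, and all names (`sae_forward`, `sae_loss`, `W_enc`, `W_dec`) are hypothetical choices. The toy weights are hand-picked so that a 2-dimensional input is reconstructed exactly from only two of four active features.

```python
def relu(v):
    return v if v > 0.0 else 0.0


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode an embedding x into non-negative activations f, then reconstruct it.

    Assumed shapes: W_enc is m x d, W_dec is d x m, with m > d (overcomplete).
    """
    # Centering by the decoder bias is one common convention, assumed here.
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    # Encoder: f = ReLU(W_enc @ centered + b_enc)
    f = [relu(sum(w * c for w, c in zip(row, centered)) + b)
         for row, b in zip(W_enc, b_enc)]
    # Decoder: x_hat = W_dec @ f + b_dec
    x_hat = [sum(w * fj for w, fj in zip(row, f)) + b
             for row, b in zip(W_dec, b_dec)]
    return f, x_hat


def sae_loss(x, f, x_hat, l1_coeff=1e-3):
    """Mean squared reconstruction error plus an L1 sparsity penalty on f."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    l1 = sum(abs(v) for v in f)
    return mse + l1_coeff * l1


# Toy example: d = 2 input dimensions, m = 4 dictionary features.
W_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b_enc = [0.0, 0.0, 0.0, 0.0]
W_dec = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
b_dec = [0.0, 0.0]

x = [1.5, -2.0]
f, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, f, x_hat)

print(f)      # [1.5, 0.0, 0.0, 2.0]
print(x_hat)  # [1.5, -2.0]
```

In practice the weights would be learned by minimizing this loss over many embeddings from the pretrained model, and each dictionary feature would then be inspected, for example by viewing the galaxies that activate it most strongly, to judge whether it is interpretable.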