The Classification of Optical Galaxy Morphology Using Unsupervised Learning Techniques
This work addresses the challenge of automating galaxy classification for astronomers overwhelmed by large-scale survey data. It is incremental, building on existing unsupervised methods without a major breakthrough.
The paper tackles the problem of classifying galaxy morphology from astronomical survey images using unsupervised learning. A convolutional autoencoder extracts features, which are then clustered with k-means, fuzzy c-means, and agglomerative clustering. The resulting clusters are compared to volunteer classifications from the Galaxy Zoo DECaLS dataset. Agglomerative clustering generally performed best, though its gain over k-means was not significant given the extra clustering time.
In recent years, large-scale, data-intensive astronomical surveys have produced more detailed images than scientists can manually classify. Even attempts to crowd-source this work will soon be outpaced by the volume of data generated by modern surveys. This has called into question the viability of human-based methods for classifying galaxy morphology. While supervised learning methods require datasets with existing labels, unsupervised learning techniques do not. Therefore, this paper implements unsupervised learning techniques to classify the Galaxy Zoo DECaLS dataset. A convolutional autoencoder was trained and used as a feature extractor. The resulting features were then clustered via k-means, fuzzy c-means, and agglomerative clustering. These clusters were compared against the true volunteer classifications provided by the Galaxy Zoo DECaLS project. The best results, in general, were produced by the agglomerative clustering method. However, the increase in performance compared to k-means clustering was not significant considering the increase in clustering time. With appropriate optimization of the clustering algorithms, this approach could prove useful for classifying the better-performing Galaxy Zoo questions and could serve as the basis for a novel approach to generating more "human-like" galaxy morphology classifications from unsupervised techniques.
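The cluster-then-compare step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the feature vectors are synthetic stand-ins for autoencoder output, the labels "smooth" and "featured" are hypothetical placeholders for volunteer classifications, and cluster purity is an assumed comparison metric chosen for simplicity. Only k-means is shown; fuzzy c-means and agglomerative clustering would slot into the same place.

```python
import random
from collections import Counter


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's k-means; returns one cluster index per input point."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct points of the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def purity(clusters, labels):
    """Fraction of points whose label matches the majority label of their cluster."""
    matched = 0
    for c in set(clusters):
        member_labels = [lab for lab, a in zip(labels, clusters) if a == c]
        matched += Counter(member_labels).most_common(1)[0][1]
    return matched / len(labels)


rng = random.Random(42)
# Stand-in for autoencoder features: two synthetic 4-D blobs (NOT real survey data).
features = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
features += [[rng.gauss(10.0, 1.0) for _ in range(4)] for _ in range(50)]
# Hypothetical placeholder labels standing in for volunteer classifications.
labels = ["smooth"] * 50 + ["featured"] * 50

clusters = kmeans(features, k=2)
score = purity(clusters, labels)
print(f"cluster purity: {score:.2f}")
```

Purity rewards clusters dominated by a single label but ignores how many clusters are used, so it is only meaningful when the cluster count is fixed in advance, as it is here.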