IM · AI · Dec 24, 2020

Self-Supervised Representation Learning for Astronomical Images

arXiv:2012.13083v2 · 56 citations
AI Analysis

This work addresses the challenge of extracting scientific information from large astronomical sky surveys, particularly when labels are scarce.

This paper demonstrates that self-supervised learning can extract semantically useful representations from astronomical images without labels. Learned from multi-band galaxy photometry, these representations outperform supervised state-of-the-art methods in galaxy morphology classification and photometric redshift estimation, and they match the accuracy of supervised models while using 2-4 times fewer labels.

Sky surveys are the largest data generators in astronomy, making automated tools for extracting meaningful scientific information an absolute necessity. We show that, without the need for labels, self-supervised learning recovers representations of sky survey images that are semantically useful for a variety of scientific tasks. These representations can be directly used as features, or fine-tuned, to outperform supervised methods trained only on labeled data. We apply a contrastive learning framework on multi-band galaxy photometry from the Sloan Digital Sky Survey (SDSS) to learn image representations. We then use them for galaxy morphology classification, and fine-tune them for photometric redshift estimation, using labels from the Galaxy Zoo 2 dataset and SDSS spectroscopy. In both downstream tasks, using the same learned representations, we outperform the supervised state-of-the-art results, and we show that our approach can achieve the accuracy of supervised models while using 2-4 times fewer labels for training.
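The abstract names a contrastive learning framework but does not give the objective in this section. As a rough illustration of the general idea only, the sketch below implements an InfoNCE-style loss in NumPy. The function name `info_nce_loss`, the `temperature` value, and the random stand-in embeddings are all assumptions made for this example, not the paper's implementation. Each image contributes two augmented views; the loss is low when a view's embedding is closer to its partner than to the views of other images.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss for a batch of paired embeddings (illustrative).

    z_a[i] and z_b[i] are embeddings of two views of the same image (the
    positive pair); every z_b[j] with j != i serves as a negative for z_a[i].
    """
    # L2-normalise so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarities scaled by the temperature, shape (N, N).
    logits = z_a @ z_b.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


# Stand-in embeddings; a real pipeline would produce these with an encoder
# applied to two augmentations of each multi-band image.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
z_view = z + 0.01 * rng.normal(size=z.shape)

aligned = info_nce_loss(z, z_view)                          # partners match
mismatched = info_nce_loss(z, np.roll(z_view, 1, axis=0))   # partners shifted
collapsed = info_nce_loss(np.ones((4, 3)), np.ones((4, 3))) # all embeddings equal
```

When every embedding is identical, the loss equals log N (here log 4), because the positive is indistinguishable from the negatives. This is why a contrastive objective pushes the encoder away from collapsed representations.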

Code Implementations: 1 repo
