GA · CV · May 17, 2019

Galaxy Zoo: Probabilistic Morphology through Bayesian CNNs and Active Learning

arXiv:1905.07424v2 · 100 citations
Originality: Incremental advance
AI Analysis

This work enables scalable classification of galaxy surveys for astronomy research, though the advance is incremental because it builds on existing Bayesian and active learning methods.

The authors address the problem of classifying galaxy morphology from images with uncertain labels. They use Bayesian CNNs and a generative model of volunteer responses to infer probabilistic labels, obtaining well-calibrated posteriors (e.g., an 11.8% coverage error for bars within a vote fraction deviation of 0.2). Active learning then reduces the number of labeled galaxies required by up to 35-60%, depending on the morphological feature being classified.

We use Bayesian convolutional neural networks and a novel generative model of Galaxy Zoo volunteer responses to infer posteriors for the visual morphology of galaxies. Bayesian CNNs can learn from galaxy images with uncertain labels and then, for previously unlabelled galaxies, predict the probability of each possible label. Our posteriors are well-calibrated (e.g. for predicting bars, we achieve coverage errors of 11.8% within a vote fraction deviation of 0.2) and hence are reliable for practical use. Further, using our posteriors, we apply the active learning strategy BALD to request volunteer responses for the subset of galaxies which, if labelled, would be most informative for training our network. We show that training our Bayesian CNNs using active learning requires up to 35-60% fewer labelled galaxies, depending on the morphological feature being classified. By combining human and machine intelligence, Galaxy Zoo will be able to classify surveys of any conceivable scale on a timescale of weeks, providing massive and detailed morphology catalogues to support research into galaxy evolution.
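As a rough illustration of the BALD strategy mentioned in the abstract, the sketch below scores unlabelled galaxies by the mutual information between the predicted label and the model parameters, estimated from several stochastic forward passes. It is a minimal sketch under simplifying assumptions, not the authors' implementation: it treats each galaxy as a single binary (Bernoulli) label rather than modelling volunteer vote counts, and the names `bald_scores`, `select_for_labelling`, and `mc_probs` are hypothetical.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def bald_scores(mc_probs):
    """BALD acquisition score per galaxy.

    mc_probs: array of shape (n_forward_passes, n_galaxies), where each row
    holds the predicted probabilities from one stochastic forward pass
    (assumed here to come from something like Monte Carlo dropout).
    """
    mc_probs = np.asarray(mc_probs, dtype=float)
    # Entropy of the averaged prediction: total predictive uncertainty.
    predictive_entropy = binary_entropy(mc_probs.mean(axis=0))
    # Average entropy of the individual predictions: uncertainty that
    # remains even when the model parameters are fixed.
    expected_entropy = binary_entropy(mc_probs).mean(axis=0)
    # The difference is large where the forward passes disagree confidently.
    return predictive_entropy - expected_entropy


def select_for_labelling(mc_probs, n_requests):
    """Indices of the n_requests galaxies with the highest BALD score."""
    scores = bald_scores(mc_probs)
    return np.argsort(scores)[::-1][:n_requests]


# Toy data: 4 forward passes over 3 galaxies (made-up numbers).
mc_probs = np.array([
    [0.50, 0.05, 0.95],
    [0.50, 0.95, 0.94],
    [0.50, 0.05, 0.96],
    [0.50, 0.95, 0.95],
])
scores = bald_scores(mc_probs)
chosen = select_for_labelling(mc_probs, n_requests=1)
```

In this toy input the first galaxy is uncertain but every forward pass agrees on 0.5, so its score is zero. The second galaxy gets the highest score because the passes disagree confidently, which is the case BALD prioritises when requesting new labels.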

Code Implementations: 1 repo
