IM CO LG, Feb 19, 2016

Stacking for machine learning redshifts applied to SDSS galaxies

arXiv:1602.06294v2, 19 citations
Originality Synthesis-oriented
AI Analysis

This work addresses redshift estimation for astronomers, but it is incremental as it applies an existing stacking method to known machine learning algorithms.

The paper tackles photometric redshift estimation for SDSS galaxies by applying stacking, a technique that feeds model outputs back as inputs in subsequent rounds, and finds improvements of 1.9% to 21% for weak learners and 0.4% to 2.5% for strong learners like AdaBoost.

We present an analysis of a general machine learning technique called 'stacking' for the estimation of photometric redshifts. Stacking techniques can feed the photometric redshift estimate, as output by a base algorithm, back into the same algorithm as an additional input feature in a subsequent learning round. We show how all tested base algorithms benefit from at least one additional stacking round (or layer). To demonstrate the benefit of stacking, we apply the method to both unsupervised machine learning techniques based on self-organising maps (SOMs), and supervised machine learning methods based on decision trees. We explore a range of stacking architectures, such as the number of layers and the number of base learners per layer. Finally, we explore the effectiveness of stacking even when using a successful algorithm such as AdaBoost. We observe a significant improvement of between 1.9% and 21% on all computed metrics when stacking is applied to weak learners (such as SOMs and decision trees). When applied to strong learning algorithms (such as AdaBoost) the ratio of improvement shrinks, but still remains positive and is between 0.4% and 2.5% for the explored metrics and comes at almost no additional computational cost.
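
The feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small hand-written regression tree as the weak learner, synthetic data in place of SDSS photometry, and a single base learner per layer. The names `fit_tree`, `fit_stack` and `predict_stack` are hypothetical, and feeding in-sample predictions forward is a simplifying assumption rather than something the abstract specifies.

```python
import math
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_tree(X, y, depth):
    """Greedy regression tree: a leaf value or (feature, threshold, left, right)."""
    if depth == 0 or len(y) < 4:
        return sum(y) / len(y)
    best = None
    for j in range(len(X[0])):
        # Skipping the smallest value keeps both sides of every split non-empty.
        for t in sorted({row[j] for row in X})[1:]:
            left = [yi for row, yi in zip(X, y) if row[j] < t]
            right = [yi for row, yi in zip(X, y) if row[j] >= t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:  # every feature is constant, so no split is possible
        return sum(y) / len(y)
    _, j, t = best
    lo = [(row, yi) for row, yi in zip(X, y) if row[j] < t]
    hi = [(row, yi) for row, yi in zip(X, y) if row[j] >= t]
    return (j, t,
            fit_tree([r for r, _ in lo], [v for _, v in lo], depth - 1),
            fit_tree([r for r, _ in hi], [v for _, v in hi], depth - 1))


def predict_tree(tree, row):
    """Walk down the tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] < t else right
    return tree


def fit_stack(X, y, n_layers, depth=2):
    """Layer 1 is the base learner; each later layer also sees the previous estimate."""
    layers, feats = [], [list(row) for row in X]
    for _ in range(n_layers):
        tree = fit_tree(feats, y, depth)
        layers.append(tree)
        # Assumption: the in-sample estimate is appended as one extra feature.
        feats = [row + [predict_tree(tree, row)] for row in feats]
    return layers


def predict_stack(layers, row):
    """Replay the layers in order, growing the feature vector as in training."""
    row, pred = list(row), None
    for tree in layers:
        pred = predict_tree(tree, row)
        row = row + [pred]
    return pred


def rmse(pred, truth):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


if __name__ == "__main__":
    random.seed(0)
    # Toy stand-in for photometry: three random inputs and a made-up target relation.
    X = [[random.random() for _ in range(3)] for _ in range(400)]
    y = [0.5 * a + 0.3 * b * c + random.gauss(0, 0.02) for a, b, c in X]
    X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]
    for n_layers in (1, 2, 3):
        layers = fit_stack(X_train, y_train, n_layers)
        preds = [predict_stack(layers, row) for row in X_test]
        print(n_layers, "layer(s): test RMSE =", round(rmse(preds, y_test), 4))
```

Running the script prints a held-out RMSE for one, two and three layers on the toy data. Nothing in the sketch guarantees that the error falls with each layer, and the numbers it prints say nothing about the improvements reported for SDSS galaxies.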

