Weakly-Supervised Anomaly Detection in the Milky Way
This work addresses the challenge of identifying localized anomalies in large-scale astrophysics datasets. Its contribution is incremental, since it applies an existing method to a new domain.
The authors apply a weakly-supervised anomaly detection method to the search for cold stellar streams in the Milky Way, and detect both simulated streams and the known stream GD-1 in Gaia satellite data covering more than one billion stars.
Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satellite. CWoLa operates without the use of labeled streams or knowledge of astrophysical principles. Instead, we train a classifier to distinguish between mixed samples for which the proportions of signal and background samples are unknown. This computationally lightweight strategy is able to detect both simulated streams and the known stream GD-1 in data. Originally designed for high-energy collider physics, this technique may have broad applicability within astrophysics as well as other domains interested in identifying localized anomalies.
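The core idea, training a classifier on two mixed samples whose signal fractions are never supplied, can be illustrated with a toy sketch. This is not the pipeline used on the Gaia data. It assumes a single hypothetical feature (for example, one proper-motion component), Gaussian toy distributions, a "signal region" and "sideband" as the two mixed samples, and a simple histogram ratio standing in for a trained classifier. All names and numbers below are illustrative assumptions.

```python
import random

random.seed(0)

# Hypothetical toy setup: none of these numbers come from the Gaia analysis.
N_STARS = 5000           # stars per mixed sample
SIGNAL_FRACTION = 0.1    # used only to generate the toy data, never by the classifier
N_BINS, LO, HI = 20, -4.0, 4.0


def draw_background(n):
    """Broad feature distribution standing in for ordinary Milky Way stars."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]


def draw_signal(n):
    """Narrow feature distribution standing in for a cold stream."""
    return [random.gauss(2.0, 0.3) for _ in range(n)]


# Mixed sample 1 ("signal region"): mostly background plus a small stream.
# The True/False flag is kept only to evaluate the result afterwards.
n_sig = int(N_STARS * SIGNAL_FRACTION)
signal_region = [(x, True) for x in draw_signal(n_sig)]
signal_region += [(x, False) for x in draw_background(N_STARS - n_sig)]

# Mixed sample 2 ("sideband"): assumed here to contain background only.
sideband = draw_background(N_STARS)


def bin_index(x):
    """Map a feature value to a histogram bin, clamping to the edge bins."""
    i = int((x - LO) / (HI - LO) * N_BINS)
    return min(max(i, 0), N_BINS - 1)


# "Training": count stars per bin using only the sample-membership labels.
counts_sr = [0] * N_BINS
counts_sb = [0] * N_BINS
for x, _ in signal_region:
    counts_sr[bin_index(x)] += 1
for x in sideband:
    counts_sb[bin_index(x)] += 1


def score(x):
    """Estimated probability that a star with feature x is from the signal region."""
    i = bin_index(x)
    total = counts_sr[i] + counts_sb[i]
    return counts_sr[i] / total if total else 0.5


# Evaluation only: compare scores of true stream stars and background stars.
sig_scores = [score(x) for x, is_sig in signal_region if is_sig]
bkg_scores = [score(x) for x, is_sig in signal_region if not is_sig]
mean_sig = sum(sig_scores) / len(sig_scores)
mean_bkg = sum(bkg_scores) / len(bkg_scores)
print(f"mean score, stream stars:     {mean_sig:.2f}")
print(f"mean score, background stars: {mean_bkg:.2f}")
```

In this toy, stars from the narrow overdensity receive higher scores than background stars, even though the classifier only ever sees which mixed sample each star belongs to. In a realistic setting one would presumably replace the histogram with a more flexible classifier over several features and select the highest-scoring stars as stream candidates.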