IM LGFeb 3, 2020

Scalable End-to-end Recurrent Neural Network for Variable star classification

Ignacio Becker, Karim Pichara, Márcio Catelan, Pavlos Protopapas, Carlos Aguirre, Fatemeh Nikzat

arXiv:2002.00994v18.033 citationsh-index: 51Has Code

Originality Incremental advance

AI Analysis

This work addresses the need for scalable and efficient classification methods for variable stars in large astronomical datasets like LSST, offering a faster and more computationally efficient alternative to traditional feature-based approaches.

The authors tackled the problem of automatic variable star classification by proposing an end-to-end deep learning algorithm based on Recurrent Neural Networks that learns representations directly from light curves, achieving accuracies of about 95% in main classes and 75% in most subclasses across three surveys.

During the last decade, considerable effort has been made to perform automatic classification of variable stars using machine learning techniques. Traditionally, light curves are represented as a vector of descriptors or features used as input for many algorithms. Some features are computationally expensive, cannot be updated quickly and hence for large datasets such as the LSST cannot be applied. Previous work has been done to develop alternative unsupervised feature extraction algorithms for light curves, but the cost of doing so still remains high. In this work, we propose an end-to-end algorithm that automatically learns the representation of light curves that allows an accurate automatic classification. We study a series of deep learning architectures based on Recurrent Neural Networks and test them in automated classification scenarios. Our method uses minimal data preprocessing, can be updated with a low computational cost for new observations and light curves, and can scale up to massive datasets. We transform each light curve into an input matrix representation whose elements are the differences in time and magnitude, and the outputs are classification probabilities. We test our method in three surveys: OGLE-III, Gaia and WISE. We obtain accuracies of about $95\%$ in the main classes and $75\%$ in the majority of subclasses. We compare our results with the Random Forest classifier and obtain competitive accuracies while being faster and scalable. The analysis shows that the computational complexity of our approach grows up linearly with the light curve size, while the traditional approach cost grows as $N\log{(N)}$.

View on arXiv PDF Code

Similar