C. Donoso-Oliva

2 Papers

IM, May 2, 2022

ASTROMER: A transformer-based embedding for the representation of light curves

C. Donoso-Oliva, I. Becker, P. Protopapas et al.

Taking inspiration from natural language embeddings, we present ASTROMER, a transformer-based model to create representations of light curves. ASTROMER was pre-trained in a self-supervised manner, requiring no human-labeled data. We used millions of R-band light sequences to adjust the ASTROMER weights. The learned representation can be easily adapted to other surveys by re-training ASTROMER on new sources. The power of ASTROMER consists of using the representation to extract light curve embeddings that can enhance the training of other models, such as classifiers or regressors. As an example, we used ASTROMER embeddings to train two neural-based classifiers that use labeled variable stars from MACHO, OGLE-III, and ATLAS. In all experiments, ASTROMER-based classifiers outperformed a baseline recurrent neural network trained on light curves directly when limited labeled data was available. Furthermore, using ASTROMER embeddings decreases computational resources needed while achieving state-of-the-art results. Finally, we provide a Python library that includes all the functionalities employed in this work. The library, main code, and pre-trained weights are available at https://github.com/astromer-science

IM, Jun 7, 2021

The effect of phased recurrent units in the classification of multiple catalogs of astronomical lightcurves

C. Donoso-Oliva, G. Cabrera-Vives, P. Protopapas et al.

In the new era of very large telescopes, where data is crucial to expand scientific knowledge, we have witnessed many deep learning applications for the automatic classification of lightcurves. Recurrent neural networks (RNNs) are one of the models used for these applications, and the LSTM unit stands out for being an excellent choice for the representation of long time series. In general, RNNs assume observations at discrete times, which may not suit the irregular sampling of lightcurves. A traditional technique to address irregular sequences consists of adding the sampling time to the network's input, but this is not guaranteed to capture sampling irregularities during training. Alternatively, the Phased LSTM unit has been created to address this problem by updating its state using the sampling times explicitly. In this work, we study the effectiveness of the LSTM and Phased LSTM based architectures for the classification of astronomical lightcurves. We use seven catalogs containing periodic and nonperiodic astronomical objects. Our findings show that LSTM outperformed PLSTM on 6/7 datasets. However, the combination of both units enhances the results in all datasets.
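The two ways of handling irregular sampling mentioned in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names (`append_time_deltas`, `time_gate`) and the parameter names (`tau`, `shift`, `r_on`, `alpha`) are assumptions chosen for illustration, and the gate follows the piecewise-linear form in which the Phased LSTM time gate is usually written: a period `tau`, a phase shift, an open fraction `r_on`, and a small leak `alpha` while the gate is closed.

```python
import numpy as np


def append_time_deltas(times, mags):
    """Traditional approach: feed the sampling interval as an extra input feature.

    Returns an array of shape (n_obs, 2) holding [magnitude, delta_t] per step.
    The first delta_t is set to 0 (an assumption; other conventions exist).
    """
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)
    dt = np.diff(times, prepend=times[0])
    return np.stack([mags, dt], axis=-1)


def time_gate(t, tau, shift, r_on, alpha=1e-3):
    """Openness k_t in [0, 1] of a Phased-LSTM-style time gate at timestamps t.

    phi is the phase of t within a cycle of period tau. The gate opens
    linearly during the first half of the open window, closes linearly
    during the second half, and leaks at rate alpha otherwise.
    """
    t = np.asarray(t, dtype=float)
    phi = np.mod(t - shift, tau) / tau
    rising = 2.0 * phi / r_on
    falling = 2.0 - 2.0 * phi / r_on
    leak = alpha * phi
    return np.where(phi < 0.5 * r_on, rising,
                    np.where(phi < r_on, falling, leak))


def gated_update(previous_state, proposed_state, k):
    """Blend a proposed state with the previous one according to openness k."""
    return k * proposed_state + (1.0 - k) * previous_state
```

In this sketch the state only moves toward the proposed update when the observation time falls inside the open window of the gate, which is how the sampling times enter the recurrence explicitly rather than as just another input column.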