ASTROMER: A transformer-based embedding for the representation of light curves
This work addresses the challenge astronomers face in analyzing variable star light curves by providing an efficient embedding method. The contribution is incremental, since it adapts existing transformer techniques to a specific domain.
The paper tackles the problem of representing astronomical light curves by introducing ASTROMER, a transformer-based model pre-trained on millions of unlabeled R-band light sequences. Its embeddings improve classifier performance when labeled data are limited, achieving state-of-the-art results while reducing the computational resources required.
Taking inspiration from natural language embeddings, we present ASTROMER, a transformer-based model to create representations of light curves. ASTROMER was pre-trained in a self-supervised manner, requiring no human-labeled data. We used millions of R-band light sequences to adjust the ASTROMER weights. The learned representation can be easily adapted to other surveys by re-training ASTROMER on new sources. The power of ASTROMER lies in using the representation to extract light curve embeddings that can enhance the training of other models, such as classifiers or regressors. As an example, we used ASTROMER embeddings to train two neural-based classifiers that use labeled variable stars from MACHO, OGLE-III, and ATLAS. In all experiments, ASTROMER-based classifiers outperformed a baseline recurrent neural network trained on light curves directly when limited labeled data was available. Furthermore, using ASTROMER embeddings decreases the computational resources needed while achieving state-of-the-art results. Finally, we provide a Python library that includes all the functionalities employed in this work. The library, main code, and pre-trained weights are available at https://github.com/astromer-science
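To make the idea of label-free pre-training concrete, the following is a minimal sketch of one possible self-supervised objective. The abstract does not specify the pretext task, so this assumes a masked-reconstruction setup analogous to masked language modeling: some magnitudes are hidden and a model is scored on how well it recovers them. The function names `mask_light_curve` and `masked_rmse`, the zero placeholder, and the masking fraction are all hypothetical choices for illustration and are not part of the ASTROMER library.

```python
import numpy as np

def mask_light_curve(times, mags, mask_frac=0.5, seed=0):
    """Hide a random fraction of magnitudes (hypothetical helper).

    Returns an (n, 2) array of [time, magnitude] inputs, with hidden
    magnitudes replaced by a 0.0 placeholder, and the boolean mask.
    """
    rng = np.random.default_rng(seed)
    n = len(mags)
    n_masked = int(round(mask_frac * n))
    masked_idx = rng.choice(n, size=n_masked, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[masked_idx] = True
    inputs = np.where(mask, 0.0, mags)  # observation times stay visible
    return np.stack([times, inputs], axis=-1), mask

def masked_rmse(pred, target, mask):
    """Reconstruction error computed on the hidden positions only."""
    diff = (pred - target)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy light curve: a sinusoid sampled at 20 epochs (synthetic, not survey data).
times = np.linspace(0.0, 10.0, 20)
mags = np.sin(times)

x, mask = mask_light_curve(times, mags, mask_frac=0.5)

# Stand-in "model": predict the mean of the visible magnitudes everywhere.
baseline_pred = np.full_like(mags, mags[~mask].mean())
loss = masked_rmse(baseline_pred, mags, mask)
```

In a real pre-training loop, the stand-in predictor would be replaced by the transformer, and the loss would be minimized over many unlabeled light curves. No labels enter at any point, which is what makes the objective self-supervised.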