Exoplanet Detection by Machine Learning with Data Augmentation
This work addresses the challenge that small datasets pose for astronomers applying machine learning to exoplanet detection, but it is incremental, as it applies existing augmentation methods to a specific domain.
The paper tackles the problem of limited datasets hindering deep learning performance in exoplanet detection from light curve data, and finds that data augmentation, using both simple and learning-based methods, can improve model performance.
It has recently been demonstrated that deep learning has significant potential to automate parts of the exoplanet detection pipeline using light curve data from satellites such as Kepler \cite{borucki2010kepler} \cite{koch2010kepler} and NASA's Transiting Exoplanet Survey Satellite (TESS) \cite{ricker2010transiting}. Unfortunately, the small size of the available datasets makes it difficult to realize the level of performance one expects from powerful network architectures. In this paper, we investigate the use of data augmentation techniques on light curve data to train neural networks to identify exoplanets. The augmentation techniques used fall into two classes: simple (e.g., additive noise augmentation) and learning-based (e.g., first training a GAN \cite{goodfellow2020generative} to generate new examples). We demonstrate that data augmentation has the potential to improve model performance for the exoplanet detection problem, and recommend the use of augmentation based on generative models as more data becomes available.
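To make the "simple" class of augmentation concrete, the following is a minimal sketch of additive noise augmentation applied to a single light curve. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `augment_with_noise`, the choice of Gaussian noise, the `noise_scale` parameter, and the synthetic box-shaped transit are all hypothetical.

```python
import numpy as np


def augment_with_noise(flux, n_copies=4, noise_scale=0.1, seed=0):
    """Return `n_copies` noisy variants of a 1-D flux array.

    Assumption: zero-mean Gaussian noise whose standard deviation is
    `noise_scale` times the standard deviation of the input flux.
    """
    flux = np.asarray(flux, dtype=float)
    rng = np.random.default_rng(seed)
    sigma = noise_scale * np.std(flux)
    # One independent noise realization per augmented copy.
    noise = rng.normal(0.0, sigma, size=(n_copies, flux.size))
    return flux[None, :] + noise


# Synthetic light curve (hypothetical): flat baseline with a 1% transit dip.
time = np.linspace(0.0, 1.0, 200)
flux = np.ones_like(time)
flux[90:110] -= 0.01

augmented = augment_with_noise(flux, n_copies=4)
print(augmented.shape)  # (4, 200)
```

Each row of `augmented` keeps the label of the original light curve, so one labeled example yields several training examples. Scaling the noise to the spread of the input is one possible design choice; it keeps the perturbation small relative to the transit depth in this toy case.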