EP IM LG | Nov 23, 2020

Peeking inside the Black Box: Interpreting Deep Learning Models for Exoplanet Atmospheric Retrievals

Kai Hou Yip, Quentin Changeat, Nikolaos Nikolaou, Mario Morvan, Billy Edwards, Ingo P. Waldmann, Giovanna Tinetti

arXiv:2011.11284v2 | 8.0 | 22 citations | Has Code

Originality: Incremental advance

AI Analysis

This work provides methodologies to interpret deep learning models, which is crucial for increasing the astrophysics community's trust and adoption of these models for exoplanet atmospheric retrievals.

This paper addresses the 'black box' nature of deep learning models used for exoplanet atmospheric retrievals by presenting general evaluation methodologies. The authors train three DNN architectures to retrieve atmospheric parameters from exoplanet spectra, show that all three achieve good predictive performance, and then analyze the predictions to determine credibility limits and to identify the spectral features to which the retrieval outcome is most sensitive.

Deep learning algorithms are growing in popularity in the field of exoplanetary science due to their ability to model highly non-linear relations and solve interesting problems in a data-driven manner. Several works have attempted to perform fast retrievals of atmospheric parameters with the use of machine learning algorithms like deep neural networks (DNNs). Yet, despite their high predictive power, DNNs are also infamous for being 'black boxes'. It is their apparent lack of explainability that makes the astrophysics community reluctant to adopt them. What are their predictions based on? How confident should we be in them? When are they wrong and how wrong can they be? In this work, we present a number of general evaluation methodologies that can be applied to any trained model and answer questions like these. In particular, we train three different popular DNN architectures to retrieve atmospheric parameters from exoplanet spectra and show that all three achieve good predictive performance. We then present an extensive analysis of the predictions of DNNs, which can inform us - among other things - of the credibility limits for atmospheric parameters for a given instrument and model. Finally, we perform a perturbation-based sensitivity analysis to identify the features of the spectrum to which the outcome of the retrieval is most sensitive. We conclude that, for different molecules, the wavelength ranges to which the DNN's predictions are most sensitive do indeed coincide with their characteristic absorption regions. The methodologies presented in this work help to improve the evaluation of DNNs and to grant interpretability to their predictions.
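
The perturbation-based sensitivity analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `perturbation_sensitivity`, the Gaussian perturbation, the window size, and the toy `toy_predict` function standing in for a trained DNN are all assumptions made for illustration. The idea is to perturb one wavelength window at a time and record how much the prediction moves on average.

```python
import numpy as np

def perturbation_sensitivity(predict, spectra, window=10, scale=1.0,
                             n_repeats=20, seed=0):
    """Mean absolute change in the prediction when one wavelength window
    at a time is perturbed with Gaussian noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_spectra, n_bins = spectra.shape
    baseline = predict(spectra)                 # unperturbed predictions
    starts = range(0, n_bins, window)
    sensitivity = np.zeros(len(starts))
    for i, start in enumerate(starts):
        stop = min(start + window, n_bins)
        changes = []
        for _ in range(n_repeats):
            perturbed = spectra.copy()
            # Add noise only inside the current wavelength window.
            perturbed[:, start:stop] += rng.normal(
                0.0, scale, size=(n_spectra, stop - start))
            changes.append(np.abs(predict(perturbed) - baseline).mean())
        sensitivity[i] = np.mean(changes)
    return sensitivity

def toy_predict(spectra):
    """Stand-in for a trained model (assumption): the 'retrieved' value
    depends only on bins 20-29, mimicking a single absorption band."""
    return spectra[:, 20:30].mean(axis=1)

# Synthetic spectra: 100 examples, 50 wavelength bins (illustrative only).
spectra = np.random.default_rng(1).normal(size=(100, 50))
sens = perturbation_sensitivity(toy_predict, spectra)
print(sens)  # only the third window (bins 20-29) should be non-zero
```

In this toy setup the sensitivity is non-zero only for the window the stand-in model actually reads, which mirrors the abstract's conclusion that the most sensitive wavelength ranges coincide with characteristic absorption regions. For a real retrieval network, `predict` would wrap the trained model's forward pass.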
