Operational range bounding of spectroscopy models with anomaly detection
The work addresses the need for reliable model deployment in critical applications such as exoplanetary spectroscopy for the Ariel mission, but its contribution is incremental: it applies existing anomaly detection methods to a specific domain.
The paper tackles the problem of ensuring safe operation of machine learning models by using anomaly detection to bound their operational ranges. It shows that Isolation Forests fitted on projections of SHAP values effectively identify contexts where predictions are likely to fail, and it evaluates coverage/error trade-offs under data and concept drift.
Safe operation of machine learning models requires architectures that explicitly delimit their operational ranges. We evaluate the ability of anomaly detection algorithms to provide indicators correlated with degraded model performance. Placing acceptance thresholds over such indicators forms hard boundaries that define the model's coverage. As a use case, we consider the extraction of exoplanetary spectra from transit light curves, specifically within the context of ESA's upcoming Ariel mission. Isolation Forests are shown to effectively identify contexts where prediction models are likely to fail. Coverage/error trade-offs are evaluated under conditions of data and concept drift. The best performance is seen when Isolation Forests model projections of the prediction model's SHAP explainability values.
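The thresholding idea described above can be illustrated with a minimal sketch. It assumes each sample already carries an anomaly score (higher meaning more anomalous, for instance from an Isolation Forest) and a measured prediction error. The function name `coverage_error_curve` and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def coverage_error_curve(
    anomaly_scores: Sequence[float],
    errors: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, coverage, mean accepted error) for each threshold.

    A sample is accepted when its anomaly score is at or below the threshold.
    Coverage is the fraction of accepted samples. The mean error is computed
    over accepted samples only and is NaN when nothing is accepted.
    """
    if len(anomaly_scores) != len(errors):
        raise ValueError("anomaly_scores and errors must have the same length")
    n = len(anomaly_scores)
    if n == 0:
        raise ValueError("at least one sample is required")

    curve = []
    for t in thresholds:
        accepted = [e for s, e in zip(anomaly_scores, errors) if s <= t]
        coverage = len(accepted) / n
        mean_error = sum(accepted) / len(accepted) if accepted else float("nan")
        curve.append((t, coverage, mean_error))
    return curve


# Toy data (assumed, for illustration only): error grows with anomaly score.
scores = [0.1, 0.2, 0.3, 0.8, 0.9]
errors = [0.01, 0.02, 0.03, 0.30, 0.50]

for threshold, coverage, mean_error in coverage_error_curve(
    scores, errors, thresholds=[0.25, 0.5, 1.0]
):
    print(f"threshold={threshold:.2f} coverage={coverage:.2f} error={mean_error:.3f}")
```

Tightening the threshold shrinks coverage and, when the indicator really is correlated with degraded performance, lowers the error on the accepted samples. That is the trade-off the abstract refers to.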