40.5LGApr 23
Conditional anomaly detection with soft harmonic functionsMichal Valko, Branislav Kveton, Hamed Valizadegan et al.
In this paper, we consider the problem of conditional anomaly detection that aims to identify data instances with an unusual response or a class label. We develop a new non-parametric approach for conditional anomaly detection based on the soft harmonic solution, with which we estimate the confidence of the label to detect anomalous mislabeling. We further regularize the solution to avoid the detection of isolated examples and examples on the boundary of the distribution support. We demonstrate the efficacy of the proposed method on several synthetic and UCI ML datasets in detecting unusual labels when compared to several baseline approaches. We also evaluate the performance of our method on a real-world electronic health record dataset where we seek to identify unusual patient-management decisions.
66.8LGApr 23
Conditional anomaly detection using soft harmonic functions: An application to clinical alertingMichal Valko, Hamed Valizadegan, Branislav Kveton et al.
Timely detection of concerning events is an important problem in clinical practice. In this paper, we consider the problem of conditional anomaly detection that aims to identify data instances with an unusual response, such as the omission of an important lab test. We develop a new non-parametric approach for conditional anomaly detection based on the soft harmonic solution, with which we estimate the confidence of the label to detect anomalous mislabeling. We further regularize the solution to avoid the detection of isolated examples and examples on the boundary of the distribution support. We demonstrate the efficacy of the proposed method in detecting unusual labels on a real-world electronic health record dataset and compare it to several baseline approaches.
EPJan 21
ExoMiner++ 2.0: Vetting TESS Full-Frame Image Transit SignalsMiguel J. S. Martinho, Hamed Valizadegan, Jon M. Jenkins et al.
The Transiting Exoplanet Survey Satellite (TESS) Full-Frame Images (FFIs) provide photometric time series for millions of stars, enabling transit searches beyond the limited set of pre-selected 2-minute targets. However, FFIs present additional challenges for transit identification and vetting. In this work, we apply ExoMiner++ 2.0, an adaptation of the ExoMiner++ framework originally developed for TESS 2-minute data, to FFI light curves. The model is used to perform large-scale planet versus non-planet classification of Threshold Crossing Events across the sectors analyzed in this study. We construct a uniform vetting catalog of all evaluated signals and assess model performance under different observing conditions. We find that ExoMiner++ 2.0 generalizes effectively to the FFI domain, providing robust discrimination between planetary signals, astrophysical false positives, and instrumental artifacts despite the limitations inherent to longer cadence data. This work extends the applicability of ExoMiner++ to the full TESS dataset and supports future population studies and follow-up prioritization.
IMJul 24, 2024
TelescopeML -- I. An End-to-End Python Package for Interpreting Telescope Datasets through Training Machine Learning Models, Generating Statistical Reports, and Visualizing ResultsEhsan, Gharib-Nezhad, Natasha E. Batalha et al.
We are on the verge of a revolutionary era in space exploration, thanks to advancements in telescopes such as the James Webb Space Telescope (\textit{JWST}). High-resolution, high signal-to-noise spectra from exoplanet and brown dwarf atmospheres have been collected over the past few decades, requiring the development of accurate and reliable pipelines and tools for their analysis. Accurately and swiftly determining the spectroscopic parameters from the observational spectra of these objects is crucial for understanding their atmospheric composition and guiding future follow-up observations. \texttt{TelescopeML} is a Python package developed to perform three main tasks: 1. Process the synthetic astronomical datasets for training a CNN model and prepare the observational dataset for later use for prediction; 2. Train a CNN model by implementing the optimal hyperparameters; and 3. Deploy the trained CNN models on the actual observational data to derive the output spectroscopic parameters.
EPFeb 13, 2025
ExoMiner++: Enhanced Transit Classification and a New Vetting Catalog for 2-Minute TESS DataHamed Valizadegan, Miguel J. S. Martinho, Jon M. Jenkins et al.
We present ExoMiner++, an enhanced deep learning model that builds on the success of ExoMiner to improve transit signal classification in 2-minute TESS data. ExoMiner++ incorporates additional diagnostic inputs, including periodogram, flux trend, difference image, unfolded flux, and spacecraft attitude control data, all of which are crucial for effectively distinguishing transit signals from more challenging sources of false positives. To further enhance performance, we leverage multi-source training by combining high-quality labeled data from the Kepler space telescope with TESS data. This approach mitigates the impact of TESS's noisier and more ambiguous labels. ExoMiner++ achieves high accuracy across various classification and ranking metrics, significantly narrowing the search space for follow-up investigations to confirm new planets. To serve the exoplanet community, we introduce new TESS catalog containing ExoMiner++ classifications and confidence scores for each transit signal. Among the 147,568 unlabeled TCEs, ExoMiner++ identifies 7,330 as planet candidates, with the remainder classified as false positives. These 7,330 planet candidates correspond to 1,868 existing TESS Objects of Interest (TOIs), 69 Community TESS Objects of Interest (CTOIs), and 50 newly introduced CTOIs. 1,797 out of the 2,506 TOIs previously labeled as planet candidates in ExoFOP are classified as planet candidates by ExoMiner++. This reduction in plausible candidates combined with the excellent ranking quality of ExoMiner++ allows the follow-up efforts to be focused on the most likely candidates, increasing the overall planet yield.
EPMay 4, 2023
Multiplicity Boost Of Transit Signal Classifiers: Validation of 69 New Exoplanets Using The Multiplicity Boost of ExoMinerHamed Valizadegan, Miguel J. S. Martinho, Jon M. Jenkins et al.
Most existing exoplanets are discovered using validation techniques rather than being confirmed by complementary observations. These techniques generate a score that is typically the probability of the transit signal being an exoplanet (y(x)=exoplanet) given some information related to that signal (represented by x). Except for the validation technique in Rowe et al. (2014) that uses multiplicity information to generate these probability scores, the existing validation techniques ignore the multiplicity boost information. In this work, we introduce a framework with the following premise: given an existing transit signal vetter (classifier), improve its performance using multiplicity information. We apply this framework to several existing classifiers, which include vespa (Morton et al. 2016), Robovetter (Coughlin et al. 2017), AstroNet (Shallue & Vanderburg 2018), ExoNet (Ansdel et al. 2018), GPC and RFC (Armstrong et al. 2020), and ExoMiner (Valizadegan et al. 2022), to support our claim that this framework is able to improve the performance of a given classifier. We then use the proposed multiplicity boost framework for ExoMiner V1.2, which addresses some of the shortcomings of the original ExoMiner classifier (Valizadegan et al. 2022), and validate 69 new exoplanets for systems with multiple KOIs from the Kepler catalog.
EPNov 19, 2021
ExoMiner: A Highly Accurate and Explainable Deep Learning Classifier that Validates 301 New ExoplanetsHamed Valizadegan, Miguel Martinho, Laurent S. Wilkens et al.
The kepler and TESS missions have generated over 100,000 potential transit signals that must be processed in order to create a catalog of planet candidates. During the last few years, there has been a growing interest in using machine learning to analyze these data in search of new exoplanets. Different from the existing machine learning works, ExoMiner, the proposed deep learning classifier in this work, mimics how domain experts examine diagnostic tests to vet a transit signal. ExoMiner is a highly accurate, explainable, and robust classifier that 1) allows us to validate 301 new exoplanets from the MAST Kepler Archive and 2) is general enough to be applied across missions such as the on-going TESS mission. We perform an extensive experimental study to verify that ExoMiner is more reliable and accurate than the existing transit signal classifiers in terms of different classification and ranking metrics. For example, for a fixed precision value of 99%, ExoMiner retrieves 93.6% of all exoplanets in the test set (i.e., recall=0.936) while this rate is 76.3% for the best existing classifier. Furthermore, the modular design of ExoMiner favors its explainability. We introduce a simple explainability framework that provides experts with feedback on why ExoMiner classifies a transit signal into a specific class label (e.g., planet candidate or not planet candidate).
LGSep 2, 2013
Relative Comparison Kernel Learning with Auxiliary KernelsEric Heim, Hamed Valizadegan, Milos Hauskrecht
In this work we consider the problem of learning a positive semidefinite kernel matrix from relative comparisons of the form: "object A is more similar to object B than it is to C", where comparisons are given by humans. Existing solutions to this problem assume many comparisons are provided to learn a high quality kernel. However, this can be considered unrealistic for many real-world tasks since relative assessments require human input, which is often costly or difficult to obtain. Because of this, only a limited number of these comparisons may be provided. In this work, we explore methods for aiding the process of learning a kernel with the help of auxiliary kernels built from more easily extractable information regarding the relationships among objects. We propose a new kernel learning approach in which the target kernel is defined as a conic combination of auxiliary kernels and a kernel whose elements are learned directly. We formulate a convex optimization to solve for this target kernel that adds only minor overhead to methods that use no auxiliary information. Empirical results show that in the presence of few training relative comparisons, our method can learn kernels that generalize to more out-of-sample comparisons than methods that do not utilize auxiliary information, as well as similar methods that learn metrics over objects.