Konstantin Malanchev

IM
h-index32
9papers
32citations
Novelty32%
AI Score43

9 Papers

IMSep 15, 2022
Understanding of the properties of neural network approaches for transient light curve approximations

Mariia Demianenko, Konstantin Malanchev, Ekaterina Samorodova et al.

Modern-day time-domain photometric surveys collect a lot of observations of various astronomical objects and the coming era of large-scale surveys will provide even more information on their properties. Spectroscopic follow-ups are especially crucial for transients such as supernovae and most of these objects have not been subject to such studies. }{Flux time series are actively used as an affordable alternative for photometric classification and characterization, for instance, peak identifications and luminosity decline estimations. However, the collected time series are multidimensional and irregularly sampled, while also containing outliers and without any well-defined systematic uncertainties. This paper presents a search for the best-performing methods to approximate the observed light curves over time and wavelength for the purpose of generating time series with regular time steps in each passband.}{We examined several light curve approximation methods based on neural networks such as multilayer perceptrons, Bayesian neural networks, and normalizing flows to approximate observations of a single light curve. Test datasets include simulated PLAsTiCC and real Zwicky Transient Facility Bright Transient Survey light curves of transients.}{The tests demonstrate that even just a few observations are enough to fit the networks and improve the quality of approximation, compared to state-of-the-art models. The methods described in this work have a low computational complexity and are significantly faster than Gaussian processes. Additionally, we analyzed the performance of the approximation techniques from the perspective of further peak identification and transients classification. The study results have been released in an open and user-friendly Fulu Python library available on GitHub for the scientific community.

IMJun 27, 2022
Supernova Light Curves Approximation based on Neural Network Models

Mariia Demianenko, Ekaterina Samorodova, Mikhail Sysak et al.

Photometric data-driven classification of supernovae becomes a challenge due to the appearance of real-time processing of big data in astronomy. Recent studies have demonstrated the superior quality of solutions based on various machine learning models. These models learn to classify supernova types using their light curves as inputs. Preprocessing these curves is a crucial step that significantly affects the final quality. In this talk, we study the application of multilayer perceptron (MLP), bayesian neural network (BNN), and normalizing flows (NF) to approximate observations for a single light curve. We use these approximations as inputs for supernovae classification models and demonstrate that the proposed methods outperform the state-of-the-art based on Gaussian processes applying to the Zwicky Transient Facility Bright Transient Survey light curves. MLP demonstrates similar quality as Gaussian processes and speed increase. Normalizing Flows exceeds Gaussian processes in terms of approximation quality as well.

IMMay 18Code
Hyrax: An Extensible Framework for Rapid ML Experimentation and Unsupervised Discovery in the Era of Rubin, Roman, and Euclid

Aritra Ghosh, Drew Oldag, Michael Tauraso et al.

The NSF-DOE Vera C. Rubin Observatory, Roman Space Telescope, Euclid, and other next-generation surveys will deliver imaging, spectroscopic, and time-domain data at scales that increasingly shift the bottleneck in astronomical machine learning (ML) projects from model design to infrastructure. We present Hyrax, an open-source, modular, GPU-enabled Python framework that supports the full ML lifecycle in astronomy: from data acquisition and training to inference and experiment comparison, with capabilities including multimodal dataset support, integrated vector databases for similarity search, and interactive two- and three-dimensional latent-space exploration for unsupervised discovery. We demonstrate Hyrax's versatility through five representative applications on real survey data: (i) unsupervised representation learning on $\sim 4\times10^5$ Rubin Legacy Survey of Space and Time (LSST) Data Preview 1 (DP1) galaxies, surfacing new merger and low-surface-brightness candidates missing from reference Euclid and Dark Energy Survey catalogs, while also isolating imaging artifacts -- all without labeled training data; (ii) hybrid density-based clustering for identifying cluster-scale gravitational lens candidates in DP1 data; (iii) multimodal early-time transient classification in the Zwicky Transient Facility leveraging light curves, spectra, images, and metadata; (iv) supervised false-positive filtering in shift-and-stack searches for distant solar system objects in the Dark Energy Camera Ecliptic Exploration Project survey; and (v) supervised detection of semi-resolved dwarf galaxies in Hyper Suprime-Cam and LSST-like imaging using synthetic source injection. Together, these results demonstrate that Hyrax provides astronomy-specific ML infrastructure that enables systematic discovery and rapid methodological iteration across next-generation astronomical surveys.

IMJan 2, 2025Code
ORACLE: A Real-Time, Hierarchical, Deep-Learning Photometric Classifier for the LSST

Ved G. Shah, Alex Gagliano, Konstantin Malanchev et al.

We present ORACLE, the first hierarchical deep-learning model for real-time, context-aware classification of transient and variable astrophysical phenomena. ORACLE is a recurrent neural network with Gated Recurrent Units (GRUs), and has been trained using a custom hierarchical cross-entropy loss function to provide high-confidence classifications along an observationally-driven taxonomy with as little as a single photometric observation. Contextual information for each object, including host galaxy photometric redshift, offset, ellipticity and brightness, is concatenated to the light curve embedding and used to make a final prediction. Training on $\sim$0.5M events from the Extended LSST Astronomical Time-Series Classification Challenge, we achieve a top-level (Transient vs Variable) macro-averaged precision of 0.96 using only 1 day of photometric observations after the first detection in addition to contextual information, for each event; this increases to $>$0.99 once 64 days of the light curve has been obtained, and 0.83 at 1024 days after first detection for 19-way classification (including supernova sub-types, active galactic nuclei, variable stars, microlensing events, and kilonovae). We also compare ORACLE with other state-of-the-art classifiers and report comparable performance for the 19-way classification task, in addition to delivering accurate top-level classifications much earlier. The code and model weights used in this work are publicly available at our associated GitHub repository (https://github.com/uiucsn/ELAsTiCC-Classification).

LGFeb 6, 2024
Multi-View Symbolic Regression

Etienne Russeil, Fabrício Olivetti de França, Konstantin Malanchev et al.

Symbolic regression (SR) searches for analytical expressions representing the relationship between a set of explanatory and response variables. Current SR methods assume a single dataset extracted from a single experiment. Nevertheless, frequently, the researcher is confronted with multiple sets of results obtained from experiments conducted with different setups. Traditional SR methods may fail to find the underlying expression since the parameters of each experiment can be different. In this work we present Multi-View Symbolic Regression (MvSR), which takes into account multiple datasets simultaneously, mimicking experimental environments, and outputs a general parametric solution. This approach fits the evaluated expression to each independent dataset and returns a parametric family of functions f(x; theta) simultaneously capable of accurately fitting all datasets. We demonstrate the effectiveness of MvSR using data generated from known expressions, as well as real-world data from astronomy, chemistry and economy, for which an a priori analytical expression is not available. Results show that MvSR obtains the correct expression more frequently and is robust to hyperparameters change. In real-world data, it is able to grasp the group behavior, recovering known expressions from the literature as well as promising alternatives, thus enabling the use of SR to a large range of experimental scenarios.

IMOct 24, 2024
Exploring the Universe with SNAD: Anomaly Detection in Astronomy

Alina A. Volnova, Patrick D. Aleo, Anastasia Lavrukhina et al.

SNAD is an international project with a primary focus on detecting astronomical anomalies within large-scale surveys, using active learning and other machine learning algorithms. The work carried out by SNAD not only contributes to the discovery and classification of various astronomical phenomena but also enhances our understanding and implementation of machine learning techniques within the field of astrophysics. This paper provides a review of the SNAD project and summarizes the advancements and achievements made by the team over several years.

LGSep 1, 2025
Exploring Multi-view Symbolic Regression methods in physical sciences

Etienne Russeil, Fabrício Olivetti de França, Konstantin Malanchev et al.

Describing the world behavior through mathematical functions help scientists to achieve a better understanding of the inner mechanisms of different phenomena. Traditionally, this is done by deriving new equations from first principles and careful observations. A modern alternative is to automate part of this process with symbolic regression (SR). The SR algorithms search for a function that adequately fits the observed data while trying to enforce sparsity, in the hopes of generating an interpretable equation. A particularly interesting extension to these algorithms is the Multi-view Symbolic Regression (MvSR). It searches for a parametric function capable of describing multiple datasets generated by the same phenomena, which helps to mitigate the common problems of overfitting and data scarcity. Recently, multiple implementations added support to MvSR with small differences between them. In this paper, we test and compare MvSR as supported in Operon, PySR, phy-SO, and eggp, in different real-world datasets. We show that they all often achieve good accuracy while proposing solutions with only few free parameters. However, we find that certain features enable a more frequent generation of better models. We conclude by providing guidelines for future MvSR developments.

LGJun 19, 2025
Signatures to help interpretability of anomalies

Emmanuel Gangler, Emille E. O. Ishida, Matwey V. Kornilov et al.

Machine learning is often viewed as a black box when it comes to understanding its output, be it a decision or a score. Automatic anomaly detection is no exception to this rule, and quite often the astronomer is left to independently analyze the data in order to understand why a given event is tagged as an anomaly. We introduce here idea of anomaly signature, whose aim is to help the interpretability of anomalies by highlighting which features contributed to the decision.

IMJun 18, 2020
Photometric Data-driven Classification of Type Ia Supernovae in the Open Supernova Catalog

Stanislav Dobryakov, Konstantin Malanchev, Denis Derkach et al.

We propose a novel approach for a machine-learning-based detection of the type Ia supernovae using photometric information. Unlike other approaches, only real observation data is used during training. Despite being trained on a relatively small sample, the method shows good results on real data from the Open Supernovae Catalog. We also investigate model transfer from the PLAsTiCC simulations train dataset to real data application, and the reverse, and find the performance significantly decreases in both cases, highlighting the existing differences between simulated and real data.