Tatiana Konstantinova

LG
h-index21
8papers
37citations
Novelty40%
AI Score50

8 Papers

LGMay 28
A Fully Convolutional Approach to Denoising Structural Dynamics Data from X-Ray Photon Correlation Spectroscopy

Nisar Nellikunnummel, Andi Barbour, Lutz Wiegart et al.

We present a fully convolutional denoising autoencoder (FC-DAE) for denoising two-time intensity-intensity correlation functions ($C_2$) in X-ray photon correlation spectroscopy (XPCS). Unlike conventional denoising autoencoders that are typically restricted to fixed input sizes, the FC-DAE accepts inputs of arbitrary dimensions while preserving correlation structures across diverse dynamical regimes. The model is trained using experimentally derived $C_2$ data collected at NSLS-II beamlines, with data augmentation applied to expand the diversity of the dataset and reduce overfitting. The FC-DAE successfully recovers intricate dynamical features in low signal-to-noise conditions while maintaining structural fidelity. To assess reconstruction reliability, we employ quantitative metrics to evaluate structural fidelity and identify potential model-induced bias. Our results demonstrate that the FC-DAE provides robust denoising performance with high computational efficiency, enabling recovery of XPCS dynamics under photon-limited and low-dose measurement conditions.

LGMar 16
Time-Aware Prior Fitted Networks for Zero-Shot Forecasting with Exogenous Variables

Andres Potapczynski, Ravi Kiran Selvam, Tatiana Konstantinova et al.

In many time series forecasting settings, the target time series is accompanied by exogenous covariates, such as promotions and prices in retail demand; temperature in energy load; calendar and holiday indicators for traffic or sales; and grid load or fuel costs in electricity pricing. Ignoring these exogenous signals can substantially degrade forecasting accuracy, particularly when they drive spikes, discontinuities, or regime and phase changes in the target series. Most current time series foundation models (e.g., Chronos, Sundial, TimesFM, TimeMoE, TimeLLM, and LagLlama) ignore exogenous covariates and make forecasts solely from the numerical time series history, thereby limiting their performance. In this paper, we develop ApolloPFN, a prior-data fitted network (PFN) that is time-aware (unlike prior PFNs) and that natively incorporates exogenous covariates (unlike prior univariate forecasters). Our design introduces two major advances: (i) a synthetic data generation procedure tailored to resolve the failure modes that arise when tabular (non-temporal) PFNs are applied to time series; and (ii) time-aware architectural modifications that embed inductive biases needed to exploit the time series context. We demonstrate that ApolloPFN achieves state-of-the-art results across benchmarks, such as M5 and electric price forecasting, that contain exogenous information.

LGJan 2
Zero-shot Forecasting by Simulation Alone

Boris N. Oreshkin, Mayank Jauhari, Ravi Kiran Selvam et al.

Zero-shot time-series forecasting holds great promise, but is still in its infancy, hindered by limited and biased data corpora, leakage-prone evaluation, and privacy and licensing constraints. Motivated by these challenges, we propose the first practical univariate time series simulation pipeline which is simultaneously fast enough for on-the-fly data generation and enables notable zero-shot forecasting performance on M-Series and GiftEval benchmarks that capture trend/seasonality/intermittency patterns, typical of industrial forecasting applications across a variety of domains. Our simulator, which we call SarSim0 (SARIMA Simulator for Zero-Shot Forecasting), is based off of a seasonal autoregressive integrated moving average (SARIMA) model as its core data source. Due to instability in the autoregressive component, naive SARIMA simulation often leads to unusable paths. Instead, we follow a three-step procedure: (1) we sample well-behaved trajectories from its characteristic polynomial stability region; (2) we introduce a superposition scheme that combines multiple paths into rich multi-seasonality traces; and (3) we add rate-based heavy-tailed noise models to capture burstiness and intermittency alongside seasonalities and trends. SarSim0 is orders of magnitude faster than kernel-based generators, and it enables training on circa 1B unique purely simulated series, generated on the fly; after which well-established neural network backbones exhibit strong zero-shot generalization, surpassing strong statistical forecasters and recent foundation baselines, while operating under strict zero-shot protocol. Notably, on GiftEval we observe a "student-beats-teacher" effect: models trained on our simulations exceed the forecasting accuracy of the AutoARIMA generating processes.

LGSep 23, 2025
A More Realistic Evaluation of Cross-Frequency Transfer Learning and Foundation Forecasting Models

Kin G. Olivares, Malcolm Wolff, Tatiana Konstantinova et al.

Cross-frequency transfer learning (CFTL) has emerged as a popular framework for curating large-scale time series datasets to pre-train foundation forecasting models (FFMs). Although CFTL has shown promise, current benchmarking practices fall short of accurately assessing its performance. This shortcoming stems from many factors: an over-reliance on small-scale evaluation datasets; inadequate treatment of sample size when computing summary statistics; reporting of suboptimal statistical models; and failing to account for non-negligible risks of overlap between pre-training and test datasets. To address these limitations, we introduce a unified reimplementation of widely-adopted neural forecasting networks, adapting them for the CFTL setup; we pre-train only on proprietary and synthetic data, being careful to prevent test leakage; and we evaluate on 15 large, diverse public forecast competition datasets. Our empirical analysis reveals that statistical models' accuracy is frequently underreported. Notably, we confirm that statistical models and their ensembles consistently outperform existing FFMs by more than 8.2% in sCRPS, and by more than 20% MASE, across datasets. However, we also find that synthetic dataset pre-training does improve the accuracy of a FFM by 7% percent.

LGOct 2, 2025
Efficiently Generating Correlated Sample Paths from Multi-step Time Series Foundation Models

Ethan Baron, Boris Oreshkin, Ruijun Ma et al.

Many time series applications require access to multi-step forecast trajectories in the form of sample paths. Recently, time series foundation models have leveraged multi-step lookahead predictions to improve the quality and efficiency of multi-step forecasts. However, these models only predict independent marginal distributions for each time step, rather than a full joint predictive distribution. To generate forecast sample paths with realistic correlation structures, one typically resorts to autoregressive sampling, which can be extremely expensive. In this paper, we present a copula-based approach to efficiently generate accurate, correlated sample paths from existing multi-step time series foundation models in one forward pass. Our copula-based approach generates correlated sample paths orders of magnitude faster than autoregressive sampling, and it yields improved sample path quality by mitigating the snowballing error phenomenon.

SPJan 17, 2022
Machine Learning Enhances Algorithms for Quantifying Non-Equilibrium Dynamics in Correlation Spectroscopy Experiments to Reach Frame-Rate-Limited Time Resolution

Tatiana Konstantinova, Lutz Wiegart, Maksim Rakitin et al.

Analysis of X-ray Photon Correlation Spectroscopy (XPCS) data for non-equilibrium dynamics often requires manual binning of age regions of an intensity-intensity correlation function. This leads to a loss of temporal resolution and accumulation of systematic error for the parameters quantifying the dynamics, especially in cases with considerable noise. Moreover, the experiments with high data collection rates create the need for automated online analysis, where manual binning is not possible. Here, we integrate a denoising autoencoder model into algorithms for analysis of non-equilibrium two-time intensity-intensity correlation functions. The model can be applied to an input of an arbitrary size. Noise reduction allows to extract the parameters that characterize the sample dynamics with temporal resolution limited only by frame rates. Not only does it improve the quantitative usage of the data, but it also creates the potential for automating the analytical workflow. Various approaches for uncertainty quantification and extension of the model for anomalies detection are discussed.

LGJan 9, 2022
Machine learning enabling high-throughput and remote operations at large-scale user facilities

Tatiana Konstantinova, Phillip M. Maffettone, Bruce Ravel et al.

Imaging, scattering, and spectroscopy are fundamental in understanding and discovering new functional materials. Contemporary innovations in automation and experimental techniques have led to these measurements being performed much faster and with higher resolution, thus producing vast amounts of data for analysis. These innovations are particularly pronounced at user facilities and synchrotron light sources. Machine learning (ML) methods are regularly developed to process and interpret large datasets in real-time with measurements. However, there remain conceptual barriers to entry for the facility general user community, whom often lack expertise in ML, and technical barriers for deploying ML models. Herein, we demonstrate a variety of archetypal ML models for on-the-fly analysis at multiple beamlines at the National Synchrotron Light Source II (NSLS-II). We describe these examples instructively, with a focus on integrating the models into existing experimental workflows, such that the reader can easily include their own ML techniques into experiments at NSLS-II or facilities with a common infrastructure. The framework presented here shows how with little effort, diverse ML models operate in conjunction with feedback loops via integration into the existing Bluesky Suite for experimental orchestration and data management.

MTRL-SCIFeb 7, 2021
Noise Reduction in X-ray Photon Correlation Spectroscopy with Convolutional Neural Networks Encoder-Decoder Models

Tatiana Konstantinova, Lutz Wiegart, Maksim Rakitin et al.

Like other experimental techniques, X-ray Photon Correlation Spectroscopy is subject to various kinds of noise. Random and correlated fluctuations and heterogeneities can be present in a two-time correlation function and obscure the information about the intrinsic dynamics of a sample. Simultaneously addressing the disparate origins of noise in the experimental data is challenging. We propose a computational approach for improving the signal-to-noise ratio in two-time correlation functions that is based on Convolutional Neural Network Encoder-Decoder (CNN-ED) models. Such models extract features from an image via convolutional layers, project them to a low dimensional space and then reconstruct a clean image from this reduced representation via transposed convolutional layers. Not only are ED models a general tool for random noise removal, but their application to low signal-to-noise data can enhance the data quantitative usage since they are able to learn the functional form of the signal. We demonstrate that the CNN-ED models trained on real-world experimental data help to effectively extract equilibrium dynamics parameters from two-time correlation functions, containing statistical noise and dynamic heterogeneities. Strategies for optimizing the models performance and their applicability limits are discussed.