Katsiaryna Haitsiukevich

LG
h-index6
8papers
89citations
Novelty50%
AI Score47

8 Papers

LGApr 11, 2022
Improved Training of Physics-Informed Neural Networks with Model Ensembles

Katsiaryna Haitsiukevich, Alexander Ilin

Learning the solution of partial differential equations (PDEs) with a neural network is an attractive alternative to traditional solvers due to its elegance, greater flexibility and the ease of incorporating observed data. However, training such physics-informed neural networks (PINNs) is notoriously difficult in practice since PINNs often converge to wrong solutions. In this paper, we address this problem by training an ensemble of PINNs. Our approach is motivated by the observation that individual PINN models find similar solutions in the vicinity of points with targets (e.g., observed data or initial conditions) while their solutions may substantially differ farther away from such points. Therefore, we propose to use the ensemble agreement as the criterion for gradual expansion of the solution interval, that is including new points for computing the loss derived from differential equations. Due to the flexibility of the domain expansion, our algorithm can easily incorporate measurements in arbitrary locations. In contrast to the existing PINN algorithms with time-adaptive strategies, the proposed algorithm does not need a pre-defined schedule of interval expansion and it treats time and space equally. We experimentally show that the proposed algorithm can stabilize PINN training and yield performance competitive to the recent variants of PINNs trained with time adaptation.

AIApr 11, 2022
Learning Trajectories of Hamiltonian Systems with Neural Networks

Katsiaryna Haitsiukevich, Alexander Ilin

Modeling of conservative systems with neural networks is an area of active research. A popular approach is to use Hamiltonian neural networks (HNNs) which rely on the assumptions that a conservative system is described with Hamilton's equations of motion. Many recent works focus on improving the integration schemes used when training HNNs. In this work, we propose to enhance HNNs with an estimation of a continuous-time trajectory of the modeled system using an additional neural network, called a deep hidden physics model in the literature. We demonstrate that the proposed integration scheme works well for HNNs, especially with low sampling rates, noisy and irregular observations.

LGAug 26, 2025Code
GRADSTOP: Early Stopping of Gradient Descent via Posterior Sampling

Arash Jamshidi, Lauri Seppäläinen, Katsiaryna Haitsiukevich et al.

Machine learning models are often learned by minimising a loss function on the training data using a gradient descent algorithm. These models often suffer from overfitting, leading to a decline in predictive performance on unseen data. A standard solution is early stopping using a hold-out validation set, which halts the minimisation when the validation loss stops decreasing. However, this hold-out set reduces the data available for training. This paper presents GRADSTOP, a novel stochastic early stopping method that only uses information in the gradients, which are produced by the gradient descent algorithm ``for free.'' Our main contributions are that we estimate the Bayesian posterior by the gradient information, define the early stopping problem as drawing sample from this posterior, and use the approximated posterior to obtain a stopping criterion. Our empirical evaluation shows that GRADSTOP achieves a small loss on test data and compares favourably to a validation-set-based stopping criterion. By leveraging the entire dataset for training, our method is particularly advantageous in data-limited settings, such as transfer learning. It can be incorporated as an optional feature in gradient descent libraries with only a small computational overhead. The source code is available at https://github.com/edahelsinki/gradstop.

55.8HCMar 12
PhiPlot: A Web-Based Interactive EDA Environment for Atmospherically Relevant Molecules

Matias Loukojärvi, Ananth Mahadevan, Katsiaryna Haitsiukevich et al.

Advances in computational chemistry have produced high-dimensional datasets on atmospherically relevant molecules. To aid exploration of such datasets, particularly for the study of atmospheric aerosol formation, we introduce PhiPlot: a web-based environment for interactive exploration and knowledge-based dimensionality reduction. The integration of visualisation, clustering, and domain knowledge-guided embedding refinement enables the discovery of patterns in the data and supports hypothesis generation. The application connects to an existing, evolving collection of molecular databases, offering an accessible interface for data-driven research in atmospheric chemistry.

LGJan 26
Information Hidden in Gradients of Regression with Target Noise

Arash Jamshidi, Katsiaryna Haitsiukevich, Kai Puolamäki

Second-order information -- such as curvature or data covariance -- is critical for optimisation, diagnostics, and robustness. However, in many modern settings, only the gradients are observable. We show that the gradients alone can reveal the Hessian, equalling the data covariance $Σ$ for the linear regression. Our key insight is a simple variance calibration: injecting Gaussian noise so that the total target noise variance equals the batch size ensures that the empirical gradient covariance closely approximates the Hessian, even when evaluated far from the optimum. We provide non-asymptotic operator-norm guarantees under sub-Gaussian inputs. We also show that without such calibration, recovery can fail by an $Ω(1)$ factor. The proposed method is practical (a "set target-noise variance to $n$" rule) and robust (variance $\mathcal{O}(n)$ suffices to recover $Σ$ up to scale). Applications include preconditioning for faster optimisation, adversarial risk estimation, and gradient-only training, for example, in distributed systems. We support our theoretical results with experiments on synthetic and real data.

CLApr 29, 2024
In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery

Matteo Merler, Katsiaryna Haitsiukevich, Nicola Dainese et al.

State of the art Symbolic Regression (SR) methods currently build specialized models, while the application of Large Language Models (LLMs) remains largely unexplored. In this work, we introduce the first comprehensive framework that utilizes LLMs for the task of SR. We propose In-Context Symbolic Regression (ICSR), an SR method which iteratively refines a functional form with an LLM and determines its coefficients with an external optimizer. ICSR leverages LLMs' strong mathematical prior both to propose an initial set of possible functions given the observations and to refine them based on their errors. Our findings reveal that LLMs are able to successfully find symbolic equations that fit the given data, matching or outperforming the overall performance of the best SR baselines on four popular benchmarks, while yielding simpler equations with better out of distribution generalization.

LGMay 11, 2024
Diffusion models as probabilistic neural operators for recovering unobserved states of dynamical systems

Katsiaryna Haitsiukevich, Onur Poyraz, Pekka Marttinen et al.

This paper explores the efficacy of diffusion-based generative models as neural operators for partial differential equations (PDEs). Neural operators are neural networks that learn a mapping from the parameter space to the solution space of PDEs from data, and they can also solve the inverse problem of estimating the parameter from the solution. Diffusion models excel in many domains, but their potential as neural operators has not been thoroughly explored. In this work, we show that diffusion-based generative models exhibit many properties favourable for neural operators, and they can effectively generate the solution of a PDE conditionally on the parameter or recover the unobserved parts of the system. We propose to train a single model adaptable to multiple tasks, by alternating between the tasks during training. In our experiments with multiple realistic dynamical systems, diffusion models outperform other neural operators. Furthermore, we demonstrate how the probabilistic diffusion model can elegantly deal with systems which are only partially identifiable, by producing samples corresponding to the different possible solutions.

LGDec 13, 2021
A Grid-Structured Model of Tubular Reactors

Katsiaryna Haitsiukevich, Samuli Bergman, Cesar de Araujo Filho et al.

We propose a grid-like computational model of tubular reactors. The architecture is inspired by the computations performed by solvers of partial differential equations which describe the dynamics of the chemical process inside a tubular reactor. The proposed model may be entirely based on the known form of the partial differential equations or it may contain generic machine learning components such as multi-layer perceptrons. We show that the proposed model can be trained using limited amounts of data to describe the state of a fixed-bed catalytic reactor. The trained model can reconstruct unmeasured states such as the catalyst activity using the measurements of inlet concentrations and temperatures along the reactor.