Ekaterina Muravleva

LG
h-index8
10papers
106citations
Novelty45%
AI Score37

10 Papers

LGMay 24, 2024
Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers

Vladislav Trifonov, Alexander Rudikov, Oleg Iliev et al.

Large linear systems are ubiquitous in modern computational science and engineering. The main recipe for solving them is the use of Krylov subspace iterative methods with well-designed preconditioners. Recently, GNNs have been shown to be a promising tool for designing preconditioners to reduce the overall computational cost of iterative methods by constructing them more efficiently than with classical linear algebra techniques. Preconditioners designed with these approaches cannot outperform those designed with classical methods in terms of the number of iterations in CG. In our work, we recall well-established preconditioners from linear algebra and use them as a starting point for training the GNN to obtain preconditioners that reduce the condition number of the system more significantly than classical preconditioners. Numerical experiments show that our approach outperforms both classical and neural network-based methods for an important class of parametric partial differential equations. We also provide a heuristic justification for the loss function used and show that preconditioners obtained by learning with this loss function reduce the condition number in a more desirable way for CG.

LGApr 15, 2024
Quantization of Large Language Models with an Overdetermined Basis

Daniil Merkulov, Daria Cherniuk, Alexander Rudikov et al.

In this paper, we introduce an algorithm for data quantization based on the principles of Kashin representation. This approach hinges on decomposing any given vector, matrix, or tensor into two factors. The first factor maintains a small infinity norm, while the second exhibits a similarly constrained norm when multiplied by an orthogonal matrix. Surprisingly, the entries of factors after decomposition are well-concentrated around several peaks, which allows us to efficiently replace them with corresponding centroids for quantization purposes. We study the theoretical properties of the proposed approach and rigorously evaluate our compression algorithm in the context of next-word prediction tasks and on a set of downstream tasks for text classification. Our findings demonstrate that Kashin Quantization achieves competitive or superior quality in model performance while ensuring data compression, marking a significant advancement in the field of data quantization.

LGMar 3, 2025
Bayesian Inverse Problems Meet Flow Matching: Efficient and Flexible Inference via Transformers

Daniil Sherki, Ivan Oseledets, Ekaterina Muravleva

The efficient resolution of Bayesian inverse problems remains challenging due to the high computational cost of traditional sampling methods. In this paper, we propose a novel framework that integrates Conditional Flow Matching (CFM) with a transformer-based architecture to enable fast and flexible sampling from complex posterior distributions. The proposed methodology involves the direct learning of conditional probability trajectories from the data, leveraging CFM's ability to bypass iterative simulation and transformers' capacity to process arbitrary numbers of observations. The efficacy of the proposed framework is demonstrated through its application to three problems: a simple nonlinear model, a disease dynamics framework, and a two-dimensional Darcy flow Partial Differential Equation. The primary outcomes demonstrate that the relative errors in parameters recovery are as low as 1.5%, and that the inference time is reduced by up to 2000 times on CPU in comparison with the Monte Carlo Markov Chain. This framework facilitates the expeditious resolution of Bayesian problems through the utilisation of sampling from the learned conditional distribution.

LGFeb 3, 2025
Message-Passing GNNs Fail to Approximate Sparse Triangular Factorizations

Vladislav Trifonov, Ekaterina Muravleva, Ivan Oseledets

Graph Neural Networks (GNNs) have been proposed as a tool for learning sparse matrix preconditioners, which are key components in accelerating linear solvers. This position paper argues that message-passing GNNs are fundamentally incapable of approximating sparse triangular factorizations. We demonstrate that message-passing GNNs fundamentally fail to approximate sparse triangular factorizations for classes of matrices for which high-quality preconditioners exist but require non-local dependencies. To illustrate this, we construct a set of baselines using both synthetic matrices and real-world examples from the SuiteSparse collection. Across a range of GNN architectures, including Graph Attention Networks and Graph Transformers, we observe severe performance degradation compared to exact or K-optimal factorizations, with cosine similarity dropping below $0.6$ in key cases. Our theoretical and empirical results suggest that architectural innovations beyond message-passing are necessary for applying GNNs to scientific computing tasks such as matrix factorization. Experiments demonstrate that overcoming non-locality alone is insufficient. Tailored architectures are necessary to capture the required dependencies since even a completely non-local Graph Transformer fails to match the proposed baselines.

LGSep 27, 2025
Deep Learning for Subspace Regression

Vladimir Fanaskov, Vladislav Trifonov, Alexander Rudikov et al.

It is often possible to perform reduced order modelling by specifying linear subspace which accurately captures the dynamics of the system. This approach becomes especially appealing when linear subspace explicitly depends on parameters of the problem. A practical way to apply such a scheme is to compute subspaces for a selected set of parameters in the computationally demanding offline stage and in the online stage approximate subspace for unknown parameters by interpolation. For realistic problems the space of parameters is high dimensional, which renders classical interpolation strategies infeasible or unreliable. We propose to relax the interpolation problem to regression, introduce several loss functions suitable for subspace data, and use a neural network as an approximation to high-dimensional target function. To further simplify a learning problem we introduce redundancy: in place of predicting subspace of a given dimension we predict larger subspace. We show theoretically that this strategy decreases the complexity of the mapping for elliptic eigenproblems with constant coefficients and makes the mapping smoother for general smooth function on the Grassmann manifold. Empirical results also show that accuracy significantly improves when larger-than-needed subspaces are predicted. With the set of numerical illustrations we demonstrate that subspace regression can be useful for a range of tasks including parametric eigenproblems, deflation techniques, relaxation methods, optimal control and solution of parametric partial differential equations.

NASep 20, 2025
Spectral Analysis of the Weighted Frobenius Objective

Vladislav Trifonov, Ivan Oseledets, Ekaterina Muravleva

We analyze a weighted Frobenius loss for approximating symmetric positive definite matrices in the context of preconditioning iterative solvers. Unlike the standard Frobenius norm, the weighted loss penalizes error components associated with small eigenvalues of the system matrix more strongly. Our analysis reveals that each eigenmode is scaled by the corresponding square of its eigenvalue, and that, under a fixed error budget, the loss is minimized only when the error is confined to the direction of the largest eigenvalue. This provides a rigorous explanation of why minimizing the weighted loss naturally suppresses low-frequency components, which can be a desirable strategy for the conjugate gradient method. The analysis is independent of the specific approximation scheme or sparsity pattern, and applies equally to incomplete factorizations, algebraic updates, and learning-based constructions. Numerical experiments confirm the predictions of the theory, including an illustration where sparse factors are trained by a direct gradient updates to IC(0) factor entries, i.e., no trained neural network model is used.

CVJun 5, 2025
Geological Field Restoration through the Lens of Image Inpainting

Vladislav Trifonov, Ivan Oseledets, Ekaterina Muravleva

We present a new viewpoint on a reconstructing multidimensional geological fields from sparse observations. Drawing inspiration from deterministic image inpainting techniques, we model a partially observed spatial field as a multidimensional tensor and recover missing values by enforcing a global low-rank structure. Our approach combines ideas from tensor completion and geostatistics, providing a robust optimization framework. Experiments on synthetic geological fields demonstrate that used tensor completion method significant improvements in reconstruction accuracy over ordinary kriging for various percent of observed data.

LGJun 7, 2024
ConDiff: A Challenging Dataset for Neural Solvers of Partial Differential Equations

Vladislav Trifonov, Alexander Rudikov, Oleg Iliev et al.

We present ConDiff, a novel dataset for scientific machine learning. ConDiff focuses on the parametric diffusion equation with space dependent coefficients, a fundamental problem in many applications of partial differential equations (PDEs). The main novelty of the proposed dataset is that we consider discontinuous coefficients with high contrast. These coefficient functions are sampled from a selected set of distributions. This class of problems is not only of great academic interest, but is also the basis for describing various environmental and industrial problems. In this way, ConDiff shortens the gap with real-world problems while remaining fully synthetic and easy to use. ConDiff consists of a diverse set of diffusion equations with coefficients covering a wide range of contrast levels and heterogeneity with a measurable complexity metric for clearer comparison between different coefficient functions. We baseline ConDiff on standard deep learning models in the field of scientific machine learning. By providing a large number of problem instances, each with its own coefficient function and right-hand side, we hope to encourage the development of novel physics-based deep learning approaches, such as neural operators, ultimately driving progress towards more accurate and efficient solutions of complex PDE problems.

CVJan 29, 2019
Reconstruction of 3D Porous Media From 2D Slices

Denis Volkhonskiy, Ekaterina Muravleva, Oleg Sudakov et al.

In many branches of earth sciences, the problem of rock study on the micro-level arises. However, a significant number of representative samples is not always feasible. Thus the problem of the generation of samples with similar properties becomes actual. In this paper, we propose a novel deep learning architecture for three-dimensional porous media reconstruction from two-dimensional slices. We fit a distribution on all possible three-dimensional structures of a specific type based on the given dataset of samples. Then, given partial information (central slices), we recover the three-dimensional structure around such slices as the most probable one according to that constructed distribution. Technically, we implement this in the form of a deep neural network with encoder, generator and discriminator modules. Numerical experiments show that this method provides a good reconstruction in terms of Minkowski functionals.

LGJun 8, 2018
Data-driven model for the identification of the rock type at a drilling bit

Nikita Klyuchnikov, Alexey Zaytsev, Arseniy Gruzdev et al.

Directional oil well drilling requires high precision of the wellbore positioning inside the productive area. However, due to specifics of engineering design, sensors that explicitly determine the type of the drilled rock are located farther than 15m from the drilling bit. As a result, the target area runaways can be detected only after this distance, which in turn, leads to a loss in well productivity and the risk of the need for an expensive re-boring operation. We present a novel approach for identifying rock type at the drilling bit based on machine learning classification methods and data mining on sensors readings. We compare various machine-learning algorithms, examine extra features coming from mathematical modeling of drilling mechanics, and show that the real-time rock type classification error can be reduced from 13.5 % to 9 %. The approach is applicable for precise directional drilling in relatively thin target intervals of complex shapes and generalizes appropriately to new wells that are different from the ones used for training the machine learning model.