Maksim Zhdanov

LG
h-index14
20papers
91citations
Novelty53%
AI Score57

20 Papers

LGOct 4, 2022
Amortized Bayesian Inference of GISAXS Data with Normalizing Flows

Maksim Zhdanov, Lisa Randolph, Thomas Kluge et al.

Grazing-Incidence Small-Angle X-ray Scattering (GISAXS) is a modern imaging technique used in material research to study nanoscale materials. Reconstruction of the parameters of an imaged object imposes an ill-posed inverse problem that is further complicated when only an in-plane GISAXS signal is available. Traditionally used inference algorithms such as Approximate Bayesian Computation (ABC) rely on computationally expensive scattering simulation software, rendering analysis highly time-consuming. We propose a simulation-based framework that combines variational auto-encoders and normalizing flows to estimate the posterior distribution of object parameters given its GISAXS data. We apply the inference pipeline to experimental data and demonstrate that our method reduces the inference cost by orders of magnitude while producing consistent results with ABC.

NCJun 4, 2022
Investigating Brain Connectivity with Graph Neural Networks and GNNExplainer

Maksim Zhdanov, Saskia Steinmann, Nico Hoffmann

Functional connectivity plays an essential role in modern neuroscience. The modality sheds light on the brain's functional and structural aspects, including mechanisms behind multiple pathologies. One such pathology is schizophrenia which is often followed by auditory verbal hallucinations. The latter is commonly studied by observing functional connectivity during speech processing. In this work, we have made a step toward an in-depth examination of functional connectivity during a dichotic listening task via deep learning for three groups of people: schizophrenia patients with and without auditory verbal hallucinations and healthy controls. We propose a graph neural network-based framework within which we represent EEG data as signals in the graph domain. The framework allows one to 1) predict a brain mental disorder based on EEG recording, 2) differentiate the listening state from the resting state for each group and 3) recognize characteristic task-depending connectivity. Experimental results show that the proposed model can differentiate between the above groups with state-of-the-art performance. Besides, it provides a researcher with meaningful information regarding each group's functional connectivity, which we validated on the current domain knowledge.

MTRL-SCIApr 28, 2023Code
Machine learning-assisted close-set X-ray diffraction phase identification of transition metals

Maksim Zhdanov, Andrey Zhdanov

Machine learning has been applied to the problem of X-ray diffraction phase prediction with promising results. In this paper, we describe a method for using machine learning to predict crystal structure phases from X-ray diffraction data of transition metals and their oxides. We evaluate the performance of our method and compare the variety of its settings. Our results demonstrate that the proposed machine learning framework achieves competitive performance. This demonstrates the potential for machine learning to significantly impact the field of X-ray diffraction and crystal structure determination. Open-source implementation: https://github.com/maxnygma/NeuralXRD.

CEMay 19
Uncertainty-aware Machine Learning Interatomic Potentials via Learned Functional Perturbations

Olga Zaghen, Maksim Zhdanov, Dario Coscia et al.

Machine Learning Interatomic Potentials (MLIPs) achieve near ab initio accuracy at a fraction of the cost of quantum-mechanical simulations, yet they remain prone to silent failures on out-of-distribution configurations, making principled uncertainty quantification (UQ) essential for error-aware simulations and active learning. Existing non-ensemble UQ methods for MLIPs rely either on variational inference or on parametric distributional assumptions, both of which add architectural complexity and hyper-parameters that must be tuned per task. Inspired by recent advances in probabilistic weather forecasting, we propose a simpler alternative: turn a deterministic MLIP into a probabilistic one through learned functional perturbations and finetune it end-to-end with the Continuous Ranked Probability Score (CRPS), a proper scoring rule. We validate the approach with an equivariant GNN (P-EGNN) trained from scratch and by finetuning the foundation model the Orb-v3 for silica. On the N-body charged particle benchmark, P-EGNN improves CRPS over the state-of-the-art Bayesian MLIP method BLIP by 19-32% across all training sizes; on silica, P-Orb raises the Spearman correlation between predicted uncertainty and actual error from 0.75 (BLIP-Orb) to 0.84.

LGApr 6
(Sparse) Attention to the Details: Preserving Spectral Fidelity in ML-based Weather Forecasting Models

Maksim Zhdanov, Ana Lucic, Max Welling et al.

We introduce Mosaic, a probabilistic weather forecasting model that addresses two principal sources of spectral degradation in ML-based weather prediction: (1) deterministic training against ensemble means and (2) compressive encoding creating an information bottleneck. Mosaic generates ensemble members through learned functional perturbations and operates on native-resolution grids via block-sparse attention, a hardware-aligned mechanism that captures long-range dependencies at linear cost by sharing keys and values across spatially adjacent queries. At 1.5$°$ resolution with 214M parameters, Mosaic matches or outperforms models trained on 6 times finer data on headline upper-air variables and achieves state-of-the-art results among 1.5$°$ models, producing well-calibrated ensembles whose individual members exhibit near-perfect spectral alignment across all resolved frequencies. A 24-member, 10-day forecast takes under 12 seconds on a single H100 GPU.

LGJun 4, 2022
Learning Generative Factors of EEG Data with Variational auto-encoders

Maksim Zhdanov, Saskia Steinmann, Nico Hoffmann

Electroencephalography produces high-dimensional, stochastic data from which it might be challenging to extract high-level knowledge about the phenomena of interest. We address this challenge by applying the framework of variational auto-encoders to 1) classify multiple pathologies and 2) recover the neurological mechanisms of those pathologies in a data-driven manner. Our framework learns generative factors of data related to pathologies. We provide an algorithm to decode those factors further and discover how different pathologies affect observed data. We illustrate the applicability of the proposed approach to identifying schizophrenia, either followed or not by auditory verbal hallucinations. We further demonstrate the ability of the framework to learn disease-related mechanisms consistent with current domain knowledge. We also compare the proposed framework with several benchmark approaches and indicate its classification performance and interpretability advantages.

LGApr 6
Vintix II: Decision Pre-Trained Transformer is a Scalable In-Context Reinforcement Learner

Andrei Polubarov, Lyubaykin Nikita, Alexander Derevyagin et al.

Recent progress in in-context reinforcement learning (ICRL) has demonstrated its potential for training generalist agents that can acquire new tasks directly at inference. Algorithm Distillation (AD) pioneered this paradigm and was subsequently scaled to multi-domain settings, although its ability to generalize to unseen tasks remained limited. The Decision Pre-Trained Transformer (DPT) was introduced as an alternative, showing stronger in-context reinforcement learning abilities in simplified domains, but its scalability had not been established. In this work, we extend DPT to diverse multi-domain environments, applying Flow Matching as a natural training choice that preserves its interpretation as Bayesian posterior sampling. As a result, we obtain an agent trained across hundreds of diverse tasks that achieves clear gains in generalization to the held-out test set. This agent improves upon prior AD scaling and demonstrates stronger performance in both online and offline inference, reinforcing ICRL as a viable alternative to expert distillation for training generalist agents.

LGDec 12, 2022
Implicit Convolutional Kernels for Steerable CNNs

Maksim Zhdanov, Nico Hoffmann, Gabriele Cesa

Steerable convolutional neural networks (CNNs) provide a general framework for building neural networks equivariant to translations and transformations of an origin-preserving group $G$, such as reflections and rotations. They rely on standard convolutions with $G$-steerable kernels obtained by analytically solving the group-specific equivariance constraint imposed onto the kernel space. As the solution is tailored to a particular group $G$, implementing a kernel basis does not generalize to other symmetry transformations, complicating the development of general group equivariant models. We propose using implicit neural representation via multi-layer perceptrons (MLPs) to parameterize $G$-steerable kernels. The resulting framework offers a simple and flexible way to implement Steerable CNNs and generalizes to any group $G$ for which a $G$-equivariant MLP can be built. We prove the effectiveness of our method on multiple tasks, including N-body simulations, point cloud classification and molecular property prediction.

LGDec 1, 2025
MSPT: Efficient Large-Scale Physical Modeling via Parallelized Multi-Scale Attention

Pedro M. P. Curvo, Jan-Willem van de Meent, Maksim Zhdanov

A key scalability challenge in neural solvers for industrial-scale physics simulations is efficiently capturing both fine-grained local interactions and long-range global dependencies across millions of spatial elements. We introduce the Multi-Scale Patch Transformer (MSPT), an architecture that combines local point attention within patches with global attention to coarse patch-level representations. To partition the input domain into spatially-coherent patches, we employ ball trees, which handle irregular geometries efficiently. This dual-scale design enables MSPT to scale to millions of points on a single GPU. We validate our method on standard PDE benchmarks (elasticity, plasticity, fluid dynamics, porous flow) and large-scale aerodynamic datasets (ShapeNet-Car, Ahmed-ML), achieving state-of-the-art accuracy with substantially lower memory footprint and computational cost.

LGMay 11
Follow the Mean: Reference-Guided Flow Matching

Pedro M. P. Curvo, Maksim Zhdanov, Floor Eijkelboom et al.

Existing approaches to controllable generation typically rely on fine-tuning, auxiliary networks, or test-time search. We show that flow matching admits a different control interface: adaptation through examples. For deterministic interpolants, the velocity field is solely governed by a conditional endpoint mean; shifting this mean shifts the flow itself. This yields a simple principle for controllable generation: steer a pretrained model by changing the reference set it follows. We instantiate this idea in two forms. Reference-Mean Guidance is training-free: it computes a closed-form endpoint-mean correction from a reference bank and applies it to a frozen FLUX.2-klein (4B) model, enabling control of color, identity, style, and structure while keeping the prompt, seed, and weights fixed. Semi-Parametric Guidance amortizes the same idea through an explicit mean anchor and learned residual refiner, matching unconditional DiT-B/4 quality on AFHQv2 while allowing the reference set to be swapped at inference time. These results point to a broader direction: generative models that adapt through data, not parameter updates.

LGJun 14, 2025Code
BSA: Ball Sparse Attention for Large-scale Geometries

Catalin E. Brita, Hieu Nguyen, Lohithsai Yadala Chanchu et al.

Self-attention scales quadratically with input size, limiting its use for large-scale physical systems. Although sparse attention mechanisms provide a viable alternative, they are primarily designed for regular structures such as text or images, making them inapplicable for irregular geometries. In this work, we present Ball Sparse Attention (BSA), which adapts Native Sparse Attention (NSA) (Yuan et al., 2025) to unordered point sets by imposing regularity using the Ball Tree structure from the Erwin Transformer (Zhdanov et al., 2025). We modify NSA's components to work with ball-based neighborhoods, yielding a global receptive field at sub-quadratic cost. On an airflow pressure prediction task, we achieve accuracy comparable to Full Attention while significantly reducing the theoretical computational complexity. Our implementation is available at https://github.com/britacatalin/bsa.

LGMay 20, 2025Code
Electrostatics from Laplacian Eigenbasis for Neural Network Interatomic Potentials

Maksim Zhdanov, Vladislav Kurenkov

In this work, we introduce Phi-Module, a universal plugin module that enforces Poisson's equation within the message-passing framework to learn electrostatic interactions in a self-supervised manner. Specifically, each atom-wise representation is encouraged to satisfy a discretized Poisson's equation, making it possible to acquire a potential φ and corresponding charges \r{ho} linked to the learnable Laplacian eigenbasis coefficients of a given molecular graph. We then derive an electrostatic energy term, crucial for improved total energy predictions. This approach integrates seamlessly into any existing neural potential with insignificant computational overhead. Our results underscore how embedding a first-principles constraint in neural interatomic potentials can significantly improve performance while remaining hyperparameter-friendly, memory-efficient, and lightweight in training. Code will be available at https://github.com/dunnolab/phi-module.

LGFeb 22, 2024
Clifford-Steerable Convolutional Neural Networks

Maksim Zhdanov, David Ruhe, Maurice Weiler et al.

We present Clifford-Steerable Convolutional Neural Networks (CS-CNNs), a novel class of $\mathrm{E}(p, q)$-equivariant CNNs. CS-CNNs process multivector fields on pseudo-Euclidean spaces $\mathbb{R}^{p,q}$. They cover, for instance, $\mathrm{E}(3)$-equivariance on $\mathbb{R}^3$ and Poincaré-equivariance on Minkowski spacetime $\mathbb{R}^{1,3}$. Our approach is based on an implicit parametrization of $\mathrm{O}(p,q)$-steerable kernels via Clifford group equivariant neural networks. We significantly and consistently outperform baseline methods on fluid dynamics as well as relativistic electrodynamics forecasting tasks.

LGJun 23, 2023
Catching Image Retrieval Generalization

Maksim Zhdanov, Ivan Karpukhin

The concepts of overfitting and generalization are vital for evaluating machine learning models. In this work, we show that the popular Recall@K metric depends on the number of classes in the dataset, which limits its ability to estimate generalization. To fix this issue, we propose a new metric, which measures retrieval performance, and, unlike Recall@K, estimates generalization. We apply the proposed metric to popular image retrieval methods and provide new insights about deep metric learning generalization.

LGFeb 24, 2025
Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

Maksim Zhdanov, Max Welling, Jan-Willem van de Meent

Large-scale physical systems defined on irregular grids pose significant scalability challenges for deep learning methods, especially in the presence of long-range interactions and multi-scale coupling. Traditional approaches that compute all pairwise interactions, such as attention, become computationally prohibitive as they scale quadratically with the number of nodes. We present Erwin, a hierarchical transformer inspired by methods from computational many-body physics, which combines the efficiency of tree-based algorithms with the expressivity of attention mechanisms. Erwin employs ball tree partitioning to organize computation, which enables linear-time attention by processing nodes in parallel within local neighborhoods of fixed size. Through progressive coarsening and refinement of the ball tree structure, complemented by a novel cross-ball interaction mechanism, it captures both fine-grained local details and global features. We demonstrate Erwin's effectiveness across multiple domains, including cosmology, molecular dynamics, PDE solving, and particle fluid dynamics, where it consistently outperforms baseline methods both in accuracy and computational efficiency.

LGMay 19, 2025
AdS-GNN -- a Conformally Equivariant Graph Neural Network

Maksim Zhdanov, Nabil Iqbal, Erik Bekkers et al.

Conformal symmetries, i.e.\ coordinate transformations that preserve angles, play a key role in many fields, including physics, mathematics, computer vision and (geometric) machine learning. Here we build a neural network that is equivariant under general conformal transformations. To achieve this, we lift data from flat Euclidean space to Anti de Sitter (AdS) space. This allows us to exploit a known correspondence between conformal transformations of flat space and isometric transformations on the AdS space. We then build upon the fact that such isometric transformations have been extensively studied on general geometries in the geometric deep learning literature. We employ message-passing layers conditioned on the proper distance, yielding a computationally efficient framework. We validate our model on tasks from computer vision and statistical physics, demonstrating strong performance, improved generalization capacities, and the ability to extract conformal data such as scaling dimensions from the trained network.

LGNov 23, 2025
Adaptive Mesh-Quantization for Neural PDE Solvers

Winfried van den Dool, Maksim Zhdanov, Yuki M. Asano et al.

Physical systems commonly exhibit spatially varying complexity, presenting a significant challenge for neural PDE solvers. While Graph Neural Networks can handle the irregular meshes required for complex geometries and boundary conditions, they still apply uniform computational effort across all nodes regardless of the underlying physics complexity. This leads to inefficient resource allocation where computationally simple regions receive the same treatment as complex phenomena. We address this challenge by introducing Adaptive Mesh Quantization: spatially adaptive quantization across mesh node, edge, and cluster features, dynamically adjusting the bit-width used by a quantized model. We propose an adaptive bit-width allocation strategy driven by a lightweight auxiliary model that identifies high-loss regions in the input mesh. This enables dynamic resource distribution in the main model, where regions of higher difficulty are allocated increased bit-width, optimizing computational resource utilization. We demonstrate our framework's effectiveness by integrating it with two state-of-the-art models, MP-PDE and GraphViT, to evaluate performance across multiple tasks: 2D Darcy flow, large-scale unsteady fluid dynamics in 2D, steady-state Navier-Stokes simulations in 3D, and a 2D hyper-elasticity problem. Our framework demonstrates consistent Pareto improvements over uniformly quantized baselines, yielding up to 50% improvements in performance at the same cost.

LGOct 15, 2025
Conditional Clifford-Steerable CNNs with Complete Kernel Basis for PDE Modeling

Bálint László Szarvas, Maksim Zhdanov

Clifford-Steerable CNNs (CSCNNs) provide a unified framework that allows incorporating equivariance to arbitrary pseudo-Euclidean groups, including isometries of Euclidean space and Minkowski spacetime. In this work, we demonstrate that the kernel basis of CSCNNs is not complete, thus limiting the model expressivity. To address this issue, we propose Conditional Clifford-Steerable Kernels, which augment the kernels with equivariant representations computed from the input feature field. We derive the equivariance constraint for these input-dependent kernels and show how it can be solved efficiently via implicit parameterization. We empirically demonstrate an improved expressivity of the resulting framework on multiple PDE forecasting tasks, including fluid dynamics and relativistic electrodynamics, where our method consistently outperforms baseline methods.

LGDec 16, 2023
Identity Curvature Laplace Approximation for Improved Out-of-Distribution Detection

Maksim Zhdanov, Stanislav Dereka, Sergey Kolesnikov

Uncertainty estimation is crucial in safety-critical applications, where robust out-of-distribution (OOD) detection is essential. Traditional Bayesian methods, though effective, are often hindered by high computational demands. As an alternative, Laplace approximation offers a more practical and efficient approach to uncertainty estimation. In this paper, we introduce the Identity Curvature Laplace Approximation (ICLA), a novel method that challenges the conventional posterior covariance formulation by using identity curvature and optimizing prior precision. This innovative design significantly enhances OOD detection performance on well-known datasets such as CIFAR-10, CIFAR-100, and ImageNet, while maintaining calibration scores. We attribute this improvement to the alignment issues between typical feature embeddings and curvature as measured by the Fisher information matrix. Our findings are further supported by demonstrating that incorporating Fisher penalty or sharpness-aware minimization techniques can greatly enhance the uncertainty estimation capabilities of standard Laplace approximation.

CVMay 19, 2023
Diversifying Deep Ensembles: A Saliency Map Approach for Enhanced OOD Detection, Calibration, and Accuracy

Stanislav Dereka, Ivan Karpukhin, Maksim Zhdanov et al.

Deep ensembles are capable of achieving state-of-the-art results in classification and out-of-distribution (OOD) detection. However, their effectiveness is limited due to the homogeneity of learned patterns within ensembles. To overcome this issue, our study introduces Saliency Diversified Deep Ensemble (SDDE), a novel approach that promotes diversity among ensemble members by leveraging saliency maps. Through incorporating saliency map diversification, our method outperforms conventional ensemble techniques and improves calibration in multiple classification and OOD detection tasks. In particular, the proposed method achieves state-of-the-art OOD detection quality, calibration, and accuracy on multiple benchmarks, including CIFAR10/100 and large-scale ImageNet datasets.