Evgenii Egorov

LG
h-index1
11papers
121citations
Novelty53%
AI Score37

11 Papers

QUANT-PHApr 14, 2023
The END: An Equivariant Neural Decoder for Quantum Error Correction

Evgenii Egorov, Roberto Bondesan, Max Welling

Quantum error correction is a critical component for scaling up quantum computing. Given a quantum code, an optimal decoder maps the measured code violations to the most likely error that occurred, but its cost scales exponentially with the system size. Neural network decoders are an appealing solution since they can learn from data an efficient approximation to such a mapping and can automatically adapt to the noise distribution. In this work, we introduce a data efficient neural decoder that exploits the symmetries of the problem. We characterize the symmetries of the optimal decoder for the toric code and propose a novel equivariant architecture that achieves state of the art accuracy compared to previous neural decoders.

QUANT-PHSep 6, 2024
Equivariant Machine Learning Decoder for 3D Toric Codes

Oliver Weissl, Evgenii Egorov

Mitigating errors in computing and communication systems has seen a great deal of research since the beginning of the widespread use of these technologies. However, as we develop new methods to do computation or communication, we also need to reiterate the method used to deal with errors. Within the field of quantum computing, error correction is getting a lot of attention since errors can propagate fast and invalidate results, which makes the theoretical exponential speed increase in computation time, compared to traditional systems, obsolete. To correct errors in quantum systems, error-correcting codes are used. A subgroup of codes, topological codes, is currently the focus of many research papers. Topological codes represent parity check matrices corresponding to graphs embedded on a $d$-dimensional surface. For our research, the focus lies on the toric code with a 3D square lattice. The goal of any decoder is robustness to noise, which can increase with code size. However, a reasonable decoder performance scales polynomially with lattice size. As error correction is a time-sensitive operation, we propose a neural network using an inductive bias: equivariance. This allows the network to learn from a rather small subset of the exponentially growing training space of possible inputs. In addition, we investigate how transformer networks can help in correction. These methods will be compared with various configurations and previously published methods of decoding errors in the 3D toric code.

LGSep 24, 2025
Myosotis: structured computation for attention like layer

Evgenii Egorov, Hanno Ackermann, Markus Nagel et al.

Attention layers apply a sequence-to-sequence mapping whose parameters depend on the pairwise interactions of the input elements. However, without any structural assumptions, memory and compute scale quadratically with the sequence length. The two main ways to mitigate this are to introduce sparsity by ignoring a sufficient amount of pairwise interactions or to introduce recurrent dependence along them, as SSM does. Although both approaches are reasonable, they both have disadvantages. We propose a novel algorithm that combines the advantages of both concepts. Our idea is based on the efficient inversion of tree-structured matrices.

LGJun 4, 2024
Ai-Sampler: Adversarial Learning of Markov kernels with involutive maps

Evgenii Egorov, Ricardo Valperga, Efstratios Gavves

Markov chain Monte Carlo methods have become popular in statistics as versatile techniques to sample from complicated probability distributions. In this work, we propose a method to parameterize and train transition kernels of Markov chains to achieve efficient sampling and good mixing. This training procedure minimizes the total variation distance between the stationary distribution of the chain and the empirical distribution of the data. Our approach leverages involutive Metropolis-Hastings kernels constructed from reversible neural networks that ensure detailed balance by construction. We find that reversibility also implies $C_2$-equivariance of the discriminator function which can be used to restrict its function space.

LGJun 30, 2020
Involutive MCMC: a Unifying Framework

Kirill Neklyudov, Max Welling, Evgenii Egorov et al.

Markov Chain Monte Carlo (MCMC) is a computational approach to fundamental problems such as inference, integration, optimization, and simulation. The field has developed a broad spectrum of algorithms, varying in the way they are motivated, the way they are applied and how efficiently they sample. Despite all the differences, many of them share the same core principle, which we unify as the Involutive MCMC (iMCMC) framework. Building upon this, we describe a wide range of MCMC algorithms in terms of iMCMC, and formulate a number of "tricks" which one can use as design principles for developing new MCMC algorithms. Thus, iMCMC provides a unified view of many known MCMC algorithms, which facilitates the derivation of powerful extensions. We demonstrate the latter with two examples where we transform known reversible MCMC algorithms into more efficient irreversible ones.

IVMay 26, 2020
Bayesian Generative Models for Knowledge Transfer in MRI Semantic Segmentation Problems

Anna Kuzina, Evgenii Egorov, Evgeny Burnaev

Automatic segmentation methods based on deep learning have recently demonstrated state-of-the-art performance, outperforming the ordinary methods. Nevertheless, these methods are inapplicable for small datasets, which are very common in medical problems. To this end, we propose a knowledge transfer method between diseases via the Generative Bayesian Prior network. Our approach is compared to a pre-train approach and random initialization and obtains the best results in terms of Dice Similarity Coefficient metric for the small subsets of the Brain Tumor Segmentation 2018 database (BRATS2018).

LGAug 30, 2019
BooVAE: Boosting Approach for Continual Learning of VAE

Anna Kuzina, Evgenii Egorov, Evgeny Burnaev

Variational autoencoder (VAE) is a deep generative model for unsupervised learning, allowing to encode observations into the meaningful latent space. VAE is prone to catastrophic forgetting when tasks arrive sequentially, and only the data for the current one is available. We address this problem of continual learning for VAEs. It is known that the choice of the prior distribution over the latent space is crucial for VAE in the non-continual setting. We argue that it can also be helpful to avoid catastrophic forgetting. We learn the approximation of the aggregated posterior as a prior for each task. This approximation is parametrised as an additive mixture of distributions induced by encoder evaluated at trainable pseudo-inputs. We use a greedy boosting-like approach with entropy regularisation to learn the components. This method encourages components diversity, which is essential as we aim at memorising the current task with the fewest components possible. Based on the learnable prior, we introduce an end-to-end approach for continual learning of VAEs and provide empirical studies on commonly used benchmarks (MNIST, Fashion MNIST, NotMNIST) and CelebA datasets. For each dataset, the proposed method avoids catastrophic forgetting in a fully automatic way.

IVAug 15, 2019
Bayesian Generative Models for Knowledge Transfer in MRI Semantic Segmentation Problems

Anna Kuzina, Evgenii Egorov, Evgeny Burnaev

Automatic segmentation methods based on deep learning have recently demonstrated state-of-the-art performance, outperforming the ordinary methods. Nevertheless, these methods are inapplicable for small datasets, which are very common in medical problems. To this end, we propose a knowledge transfer method between diseases via the Generative Bayesian Prior network. Our approach is compared to a pre-train approach and random initialization and obtains the best results in terms of Dice Similarity Coefficient metric for the small subsets of the Brain Tumor Segmentation 2018 database (BRATS2018).

MLJun 9, 2019
The Implicit Metropolis-Hastings Algorithm

Kirill Neklyudov, Evgenii Egorov, Dmitry Vetrov

Recent works propose using the discriminator of a GAN to filter out unrealistic samples of the generator. We generalize these ideas by introducing the implicit Metropolis-Hastings algorithm. For any implicit probabilistic model and a target distribution represented by a set of samples, implicit Metropolis-Hastings operates by learning a discriminator to estimate the density-ratio and then generating a chain of samples. Since the approximation of density ratio introduces an error on every step of the chain, it is crucial to analyze the stationary distribution of such chain. For that purpose, we present a theoretical result stating that the discriminator loss upper bounds the total variation distance between the target distribution and the stationary distribution. Finally, we validate the proposed algorithm both for independent and Markov proposals on CIFAR-10 and CelebA datasets.

LGMay 20, 2019
MaxEntropy Pursuit Variational Inference

Evgenii Egorov, Kirill Neklydov, Ruslan Kostoev et al.

One of the core problems in variational inference is a choice of approximate posterior distribution. It is crucial to trade-off between efficient inference with simple families as mean-field models and accuracy of inference. We propose a variant of a greedy approximation of the posterior distribution with tractable base learners. Using Max-Entropy approach, we obtain a well-defined optimization problem. We demonstrate the ability of the method to capture complex multimodal posterior via continual learning setting for neural networks.

MLOct 16, 2018
Metropolis-Hastings view on variational inference and adversarial training

Kirill Neklyudov, Evgenii Egorov, Pavel Shvechikov et al.

A significant part of MCMC methods can be considered as the Metropolis-Hastings (MH) algorithm with different proposal distributions. From this point of view, the problem of constructing a sampler can be reduced to the question - how to choose a proposal for the MH algorithm? To address this question, we propose to learn an independent sampler that maximizes the acceptance rate of the MH algorithm, which, as we demonstrate, is highly related to the conventional variational inference. For Bayesian inference, the proposed method compares favorably against alternatives to sample from the posterior distribution. Under the same approach, we step beyond the scope of classical MCMC methods and deduce the Generative Adversarial Networks (GANs) framework from scratch, treating the generator as the proposal and the discriminator as the acceptance test. On real-world datasets, we improve Frechet Inception Distance and Inception Score, using different GANs as a proposal distribution for the MH algorithm. In particular, we demonstrate improvements of recently proposed BigGAN model on ImageNet.