ML · LG · ST · Dec 28, 2022

Distribution Estimation of Contaminated Data via DNN-based MoM-GANs

arXiv:2212.13741v1 · h-index: 25
Originality: Incremental advance
AI Analysis

This paper addresses robust distribution estimation for data containing outliers, an incremental advance in statistical machine learning.

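The paper tackles distribution estimation for contaminated data by proposing MoM-GAN, which combines GANs with median-of-mean (MoM) estimation. It derives a non-asymptotic error bound that decreases essentially as $n^{-b/p}\vee n^{-1/2}$ (the larger of the two terms), where $n$ is the sample size, $p$ is the input dimension, and $b$ is the Hölder smoothness index. Numerical results on two real applications show MoM-GAN outperforming competing methods on contaminated data.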
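To make the rate concrete, the following minimal sketch evaluates $n^{-b/p}\vee n^{-1/2}$ for a few settings. The function name `mom_gan_rate` is hypothetical, and the sketch ignores the constants and any lower-order factors hidden by the word "essentially"; it only illustrates which of the two terms dominates.

```python
def mom_gan_rate(n: int, b: float, p: int) -> float:
    """Return max(n^(-b/p), n^(-1/2)), the leading-order rate only.

    n: sample size, b: Hölder smoothness index, p: input dimension.
    Constants and lower-order factors are ignored (an assumption of this sketch).
    """
    if n < 1 or b <= 0 or p < 1:
        raise ValueError("need n >= 1, b > 0, p >= 1")
    return max(n ** (-b / p), n ** (-0.5))


if __name__ == "__main__":
    n = 10_000
    # p > 2b: the dimension-dependent term n^(-b/p) is the slower, dominant one.
    print(mom_gan_rate(n, b=1.0, p=4))
    # p <= 2b: the parametric term n^(-1/2) dominates.
    print(mom_gan_rate(n, b=2.0, p=2))
```

The comparison shows that the dimension-dependent term governs the bound whenever $p > 2b$, so higher smoothness $b$ is needed to retain the $n^{-1/2}$ rate as the input dimension grows.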

This paper studies the distribution estimation of contaminated data by the MoM-GAN method, which combines generative adversarial net (GAN) and median-of-mean (MoM) estimation. We use a deep neural network (DNN) with a ReLU activation function to model the generator and discriminator of the GAN. Theoretically, we derive a non-asymptotic error bound for the DNN-based MoM-GAN estimator measured by integral probability metrics with the $b$-smoothness Hölder class. The error bound decreases essentially as $n^{-b/p}\vee n^{-1/2}$, where $n$ and $p$ are the sample size and the dimension of input data. We give an algorithm for the MoM-GAN method and implement it through two real applications. The numerical results show that the MoM-GAN outperforms other competitive methods when dealing with contaminated data.
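The robustness ingredient named in the abstract can be illustrated with the classical scalar median-of-means estimator. The sketch below is not the authors' MoM-GAN algorithm: it only shows the underlying idea of splitting data into blocks, averaging within each block, and taking the median of the block means. The function name `median_of_means`, the block count `k`, and the toy data are assumptions made for illustration.

```python
import random
import statistics


def median_of_means(xs, k, seed=0):
    """Median-of-means estimate of the mean of xs using k blocks.

    The data are shuffled and dealt into k blocks of near-equal size;
    the estimate is the median of the k block means.
    """
    if not 1 <= k <= len(xs):
        raise ValueError("k must satisfy 1 <= k <= len(xs)")
    rng = random.Random(seed)
    shuffled = list(xs)
    rng.shuffle(shuffled)
    # Block i takes every k-th element starting at position i.
    blocks = [shuffled[i::k] for i in range(k)]
    return statistics.median(statistics.fmean(block) for block in blocks)


if __name__ == "__main__":
    # Toy contaminated sample: 95 clean points at 0 and 5 outliers at 1000.
    data = [0.0] * 95 + [1000.0] * 5
    print("plain mean:     ", statistics.fmean(data))
    # With 11 blocks, at most 5 can contain an outlier, so the median
    # of the block means comes from an uncontaminated block.
    print("median of means:", median_of_means(data, k=11))
```

The design choice to note is the block count: the estimate stays unaffected as long as fewer than half of the blocks contain outliers, which is why `k` is chosen larger than twice the number of contaminated points in this toy example.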

