Distribution Estimation of Contaminated Data via DNN-based MoM-GANs
The work addresses robust distribution estimation for data with outliers and is an incremental improvement in statistical machine learning.
The paper tackles distribution estimation for contaminated data by proposing MoM-GAN, a method that combines GANs with median-of-mean (MoM) estimation. It derives a non-asymptotic error bound that decreases essentially as $n^{-b/p}\vee n^{-1/2}$, where $n$ is the sample size and $p$ is the input dimension, and its numerical results on two real applications show MoM-GAN outperforming other competitive methods.
This paper studies distribution estimation for contaminated data by the MoM-GAN method, which combines a generative adversarial net (GAN) with median-of-mean (MoM) estimation. We use deep neural networks (DNNs) with the ReLU activation function to model the generator and the discriminator of the GAN. Theoretically, we derive a non-asymptotic error bound for the DNN-based MoM-GAN estimator, measured by integral probability metrics indexed by the Hölder class with smoothness $b$. The error bound decreases essentially as $n^{-b/p}\vee n^{-1/2}$, where $n$ and $p$ are the sample size and the dimension of the input data, respectively, and $\vee$ denotes the maximum. We give an algorithm for the MoM-GAN method and apply it to two real applications. The numerical results show that MoM-GAN outperforms other competitive methods when dealing with contaminated data.
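To illustrate the robustness idea behind the MoM component, here is a minimal sketch of the basic median-of-means estimator for a scalar mean. It is not the paper's MoM-GAN algorithm. The function name `median_of_means`, the parameter `num_blocks`, and the toy data are assumptions made for illustration only. The sample is split into blocks, each block is averaged, and the median of the block means is returned, so a few gross outliers can spoil only a minority of blocks.

```python
import numpy as np


def median_of_means(x, num_blocks, rng=None):
    """Median-of-means estimate of the mean of a 1-D sample (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    if not 1 <= num_blocks <= x.size:
        raise ValueError("num_blocks must be between 1 and len(x)")
    # Shuffle so that block membership does not depend on the data order.
    rng = np.random.default_rng(rng)
    shuffled = rng.permutation(x)
    # Split into num_blocks nearly equal blocks and average each one.
    blocks = np.array_split(shuffled, num_blocks)
    block_means = np.array([block.mean() for block in blocks])
    # The median ignores the minority of blocks that contain outliers.
    return float(np.median(block_means))


# Toy contaminated sample (assumed for illustration):
# 97 inliers drawn around 0 plus 3 gross outliers at 1000.
data_rng = np.random.default_rng(0)
clean = data_rng.normal(loc=0.0, scale=1.0, size=97)
sample = np.concatenate([clean, np.full(3, 1000.0)])

plain_mean = float(sample.mean())                          # pulled towards the outliers
mom_mean = median_of_means(sample, num_blocks=11, rng=1)   # stays near the inlier mean
```

With 11 blocks and only 3 outliers, at most 3 block means can be contaminated, so the median is always taken from a clean block. In general, the number of blocks has to exceed twice the number of outliers for this guarantee to hold.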