Min Yang

h-index: 12

Papers: 5

Citations: 1,096

Novelty: 52%

AI Score: 33

Ranked #115,842 of 194,257 authors (top 60%); #25,464 in LG (top 63%)

5 Papers

3.1 | LG | Jul 31, 2021 | Code

Missingness Augmentation: A General Approach for Improving Generative Imputation Models

Yufeng Wang, Dan Li, Cong Xu et al.

Missing data imputation is a fundamental problem in data analysis, and many studies have been conducted to improve its performance by exploring model structures and learning procedures. However, data augmentation, as a simple yet effective method, has not received enough attention in this area. In this paper, we propose a novel data augmentation method called Missingness Augmentation (MisA) for generative imputation models. Our approach dynamically produces incomplete samples at each epoch by utilizing the generator's output, constraining the augmented samples using a simple reconstruction loss, and combining this loss with the original loss to form the final optimization objective. As a general augmentation technique, MisA can be easily integrated into generative imputation frameworks, providing a simple yet effective way to enhance their performance. Experimental results demonstrate that MisA significantly improves the performance of many recently proposed generative imputation models on a variety of tabular and image datasets. The code is available at https://github.com/WYu-Feng/Missingness-Augmentation.
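The objective described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_generator` (a column-mean imputer), `misa_objective`, `weight`, and `drop_rate` are hypothetical names chosen for illustration, and the way the augmented mask is drawn is an assumption.

```python
import random

def toy_generator(rows, mask):
    """Hypothetical stand-in for a generative imputer: fill hidden cells with column means."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        seen = [r[j] for r, m in zip(rows, mask) if m[j] == 1]
        means.append(sum(seen) / len(seen) if seen else 0.0)
    return [[r[j] if m[j] == 1 else means[j] for j in range(n_cols)]
            for r, m in zip(rows, mask)]

def misa_objective(rows, mask, original_loss, weight=1.0, drop_rate=0.2, seed=0):
    """Combine an existing loss with a reconstruction loss on augmented samples (assumed form)."""
    rng = random.Random(seed)
    # Step 1: complete the data with the generator's current output.
    completed = toy_generator(rows, mask)
    # Step 2: build an augmented incomplete sample by hiding random cells of that output.
    aug_mask = [[0 if rng.random() < drop_rate else 1 for _ in row] for row in completed]
    # Step 3: impute the augmented sample and compare it with the completed data.
    reimputed = toy_generator(completed, aug_mask)
    n_cells = len(completed) * len(completed[0])
    recon = sum((a - b) ** 2
                for ra, rb in zip(reimputed, completed)
                for a, b in zip(ra, rb)) / n_cells
    # Step 4: add the weighted reconstruction term to the original loss.
    return original_loss + weight * recon
```

In a real training loop the mask would be redrawn at each epoch and the generator would be a trained network; the sketch only shows how the two loss terms could be combined.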

9.0 | LG | Nov 16, 2020 | Code

PC-GAIN: Pseudo-label Conditional Generative Adversarial Imputation Networks for Incomplete Data

Yufeng Wang, Dan Li, Xiang Li et al.

Datasets with missing values are very common in real-world applications. GAIN, a recently proposed deep generative model for missing data imputation, has been shown to outperform many state-of-the-art methods. However, GAIN only uses a reconstruction loss in the generator to minimize the imputation error of the non-missing part, ignoring the potential category information that can reflect the relationship between samples. In this paper, we propose a novel unsupervised missing data imputation method named PC-GAIN, which utilizes potential category information to further enhance the imputation power. Specifically, we first propose a pre-training procedure to learn the potential category information contained in a subset of low-missing-rate data. An auxiliary classifier is then trained using the synthetic pseudo-labels. This classifier is incorporated into the generative adversarial framework to help the generator yield higher-quality imputation results. The proposed method significantly improves the imputation quality of GAIN. Experimental results on various benchmark datasets show that our method is also superior to other baseline approaches. Our code is available at https://github.com/WYu-Feng/pc-gain.
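The first step mentioned above, picking a subset of low-missing-rate data, might look like the following sketch. The function name `low_missing_subset` and the `fraction` parameter are hypothetical; the abstract does not state how the subset is chosen, so the simple ranking by per-row missing rate is an assumption.

```python
def low_missing_subset(mask, fraction=0.5):
    """Return indices of the rows with the lowest missing rate.

    mask[i][j] is 1 if cell (i, j) is observed and 0 if it is missing.
    """
    # Missing rate per row: share of cells that are not observed.
    rates = [1.0 - sum(row) / len(row) for row in mask]
    # Rank rows from most complete to least complete.
    order = sorted(range(len(mask)), key=lambda i: rates[i])
    # Keep the requested fraction, but at least one row.
    keep = max(1, int(len(mask) * fraction))
    return sorted(order[:keep])
```

The selected rows would then feed the pre-training and pseudo-labeling stages, which are not sketched here.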

31.3 | CL | Oct 13, 2020 | Code

BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance

Jianquan Li, Xiaokang Liu, Honghong Zhao et al.

Pre-trained language models (e.g., BERT) have achieved significant success in various natural language processing (NLP) tasks. However, high storage and computational costs prevent pre-trained language models from being deployed effectively on resource-constrained devices. In this paper, we propose a novel BERT distillation method based on many-to-many layer mapping, which allows each intermediate student layer to learn from any intermediate teacher layer. In this way, our model can learn from different teacher layers adaptively for various NLP tasks, motivated by the intuition that different NLP tasks require different levels of linguistic knowledge contained in the intermediate layers of BERT. In addition, we leverage Earth Mover's Distance (EMD) to compute the minimum cumulative cost that must be paid to transform knowledge from the teacher network to the student network. EMD enables effective matching for many-to-many layer mapping: it can be applied to network layers with different sizes and effectively measures the semantic distance between the teacher network and the student network. Furthermore, we propose a cost attention mechanism to learn the layer weights used in EMD automatically, which is intended to further improve the model's performance and accelerate convergence. Extensive experiments on the GLUE benchmark demonstrate that our model achieves competitive performance compared to strong competitors in terms of both accuracy and model compression.
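For reference, the standard Earth Mover's Distance can be written as a transportation problem. The notation below is an assumption for illustration and may differ from the paper's: $M$ teacher layers with weights $w^{T}_{i}$, $N$ student layers with weights $w^{S}_{j}$, a ground cost $d_{ij}$ between teacher layer $i$ and student layer $j$, and a flow $f_{ij}$.

```latex
\begin{aligned}
\min_{f_{ij} \ge 0} \quad & \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}\, d_{ij} \\
\text{s.t.} \quad & \sum_{j=1}^{N} f_{ij} \le w^{T}_{i}, \qquad i = 1, \dots, M, \\
& \sum_{i=1}^{M} f_{ij} \le w^{S}_{j}, \qquad j = 1, \dots, N, \\
& \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij} = \min\Big( \sum_{i=1}^{M} w^{T}_{i},\ \sum_{j=1}^{N} w^{S}_{j} \Big),
\end{aligned}
\qquad
\mathrm{EMD} = \frac{\sum_{i,j} f^{*}_{ij}\, d_{ij}}{\sum_{i,j} f^{*}_{ij}}
```

Here $f^{*}$ denotes the optimal flow. Because every teacher layer can send flow to every student layer, the optimal flow gives a many-to-many mapping rather than a fixed one-to-one assignment.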

2.3 | LG | Oct 8, 2020

Improve Adversarial Robustness via Weight Penalization on Classification Layer

Cong Xu, Dan Li, Min Yang

It is well known that deep neural networks are vulnerable to adversarial attacks. Recent studies show that well-designed classification parts can lead to better robustness. However, there is still much room for improvement along this line. In this paper, we first prove that, from a geometric point of view, the robustness of a neural network is equivalent to an angular margin condition on the classifier weights. We then explain why ReLU-type functions are not a good choice of activation under this framework. These findings reveal the limitations of existing approaches and lead us to develop a novel light-weight-penalized defensive method, which is simple and scales well. Empirical results on multiple benchmark datasets demonstrate that our method effectively improves the robustness of the network without requiring much additional computation, while maintaining high classification precision on clean data.
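The angular view of classifier weights can be made concrete with a small sketch that measures the smallest angle between any two class weight vectors. This is only a proxy for illustration: the exact margin condition and the penalty used in the paper are not given in the abstract, and `angle` and `min_pairwise_angle` are hypothetical helper names.

```python
import math
from itertools import combinations

def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos)

def min_pairwise_angle(weights):
    """Smallest angle between any two class weight vectors (one row per class)."""
    return min(angle(u, v) for u, v in combinations(weights, 2))
```

A larger minimum angle means the class directions are more spread out, which is the intuition behind encouraging angular separation in the classification layer.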

1.2 | ML | Dec 3, 2019

A Fast deflation Method for Sparse Principal Component Analysis via Subspace Projections

Cong Xu, Min Yang, Jin Zhang

Applying conventional sparse principal component analysis (SPCA) to high-dimensional data sets has become time-consuming. In this paper, a series of subspace projections is constructed efficiently using Householder QR factorization. With the aid of these subspace projections, a fast deflation method, called SPCA-SP, is developed for SPCA. This method keeps a good tradeoff between various criteria, including sparsity, orthogonality, explained variance, balance of sparsity, and computational cost. Comparative experiments on benchmark data sets confirm the effectiveness of the proposed method.
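The building block of Householder QR is a single reflection that maps a vector onto a multiple of the first coordinate axis. The sketch below shows only that step; how SPCA-SP assembles its subspace projections from such reflections is not described in the abstract, and `householder_vector` and `reflect` are hypothetical helper names.

```python
import math

def householder_vector(x):
    """Return (v, alpha) such that reflecting x across the hyperplane normal to v gives alpha * e1."""
    norm = math.sqrt(sum(a * a for a in x))
    # Choose the sign opposite to x[0] to avoid cancellation.
    alpha = -math.copysign(norm, x[0])
    v = list(x)
    v[0] -= alpha
    return v, alpha

def reflect(v, y):
    """Apply the reflection H = I - 2 v v^T / (v^T v) to the vector y."""
    vv = sum(a * a for a in v)
    if vv == 0.0:
        return list(y)  # Zero vector: nothing to reflect.
    scale = 2.0 * sum(a * b for a, b in zip(v, y)) / vv
    return [b - scale * a for a, b in zip(v, y)]
```

For example, with `x = [3.0, 4.0]` the reflection sends `x` to `[-5.0, 0.0]`, zeroing every entry below the first. Repeating this column by column yields the QR factorization.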