CV, LG | Sep 17, 2020

Learning a Deep Part-based Representation by Preserving Data Distribution

arXiv:2009.08246v1
Originality: Incremental advance
AI Analysis

This work addresses high-dimensional data recognition problems, offering an incremental improvement over existing methods for preserving intrinsic data structures.

The paper tackles unsupervised dimensionality reduction by proposing a deep autoencoder network that preserves the data distribution to learn a part-based representation, achieving improved cluster accuracy and AMI on real-world datasets.

Unsupervised dimensionality reduction is one of the most commonly used techniques in high-dimensional data recognition. A deep autoencoder network whose weights are constrained to be non-negative can learn a low-dimensional part-based representation of the data. In addition, the inherent structure of each data cluster can be described by the distribution of its intraclass samples. One therefore hopes to learn a new low-dimensional representation that preserves the intrinsic structure embedded in the original high-dimensional data space. In this paper, a deep part-based representation is learned by preserving the data distribution; the resulting algorithm is called Distribution Preserving Network Embedding (DPNE). DPNE first estimates the distribution of the original high-dimensional data using $k$-nearest neighbor kernel density estimation, and then seeks a part-based representation that respects this distribution. Experimental results on real-world data sets show that the proposed algorithm performs well in terms of cluster accuracy and adjusted mutual information (AMI), and that the manifold structure of the raw data is well preserved in the low-dimensional feature space.
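The density-estimation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so this example assumes one common form of $k$-nearest neighbor kernel density estimation: a Gaussian kernel whose bandwidth at each point is the distance to that point's $k$-th nearest neighbor. The function name `knn_kernel_density` and the default `k=5` are hypothetical choices made for illustration, not taken from the paper.

```python
import math


def knn_kernel_density(points, k=5):
    """Estimate a density value at each sample (illustrative sketch).

    Assumption (not from the paper): Gaussian kernel, with the bandwidth
    at each query point set to the distance to its k-th nearest neighbor.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    d = len(points[0])  # dimensionality of the data

    densities = []
    for x in points:
        # Squared Euclidean distances from x to every sample (including itself).
        sq_dists = [sum((a - b) ** 2 for a, b in zip(x, y)) for y in points]
        # After sorting, index 0 is x itself (distance 0), so index k is the
        # k-th nearest neighbor. Guard against a zero bandwidth.
        h = max(math.sqrt(sorted(sq_dists)[k]), 1e-12)
        # Normalizing constant of a d-dimensional Gaussian with scale h,
        # averaged over the n samples.
        norm = n * (2.0 * math.pi) ** (d / 2.0) * h ** d
        kernel_sum = sum(math.exp(-s / (2.0 * h * h)) for s in sq_dists)
        densities.append(kernel_sum / norm)
    return densities


# Usage: a tight 5 x 4 grid of points plus one far-away outlier.
cluster = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(4)]
samples = cluster + [(10.0, 10.0)]
density = knn_kernel_density(samples, k=3)
```

Under these assumptions, points inside the tight cluster receive a much higher density estimate than the isolated outlier. A distribution-preserving embedding in the sense described above would aim to keep such estimates consistent between the original space and the learned low-dimensional space.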

