ML | LG | Jan 30, 2020

NCVis: Noise Contrastive Approach for Scalable Visualization

arXiv:2001.11411v1 | 22 citations
AI Analysis

This work addresses scalability in data visualization for researchers and practitioners handling large datasets. It is incremental in that it builds on existing noise contrastive estimation techniques.

The authors address the performance limitations of existing dimensionality reduction methods such as t-SNE on large, high-dimensional datasets. They propose NCVis, a method based on noise contrastive estimation that processes more than 1 million news headlines in several minutes while keeping representation quality comparable to classical methods.

Modern methods for data visualization via dimensionality reduction, such as t-SNE, usually have performance issues that prohibit their application to large amounts of high-dimensional data. In this work, we propose NCVis, a high-performance dimensionality reduction method built on the sound statistical basis of noise contrastive estimation. We show that NCVis outperforms state-of-the-art techniques in terms of speed while preserving the representation quality of other methods. In particular, the proposed approach successfully processes a large dataset of more than 1 million news headlines in several minutes and presents the underlying structure in a human-readable way. Moreover, it provides results consistent with classical methods like t-SNE on more straightforward datasets like images of hand-written digits. We believe that broader usage of such software can significantly simplify large-scale data analysis and lower the entry barrier to this area.
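
The abstract names noise contrastive estimation but does not spell out the objective. As a rough illustration only, the sketch below scores a 2-D embedding with a generic NCE-style logistic loss: pairs of true neighbours should be classified as "data" and randomly drawn pairs as "noise". The function name `nce_loss`, the Student-t style similarity kernel, and the parameters `log_z` and `log_noise` are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -np.logaddexp(0.0, -x)


def nce_loss(Y, pos_pairs, neg_pairs, log_z=0.0, log_noise=0.0):
    """Generic NCE-style loss for an embedding Y (hypothetical helper).

    Y          : (n, d) array of embedded points
    pos_pairs  : (m, 2) index pairs that are true neighbours ("data")
    neg_pairs  : (k, 2) index pairs drawn at random ("noise")
    log_z      : log of the normalizer, treated as a free parameter in NCE
    log_noise  : log of (noise ratio * noise density), assumed constant here
    """
    def log_q(pairs):
        # Assumed heavy-tailed kernel: q = 1 / (1 + squared distance).
        d2 = np.sum((Y[pairs[:, 0]] - Y[pairs[:, 1]]) ** 2, axis=1)
        return -np.log1p(d2)

    logit_pos = log_q(pos_pairs) - log_z - log_noise
    logit_neg = log_q(neg_pairs) - log_z - log_noise
    # Neighbours should be labelled "data"; noise pairs should be labelled "noise".
    total = log_sigmoid(logit_pos).sum() + log_sigmoid(-logit_neg).sum()
    return -total / len(pos_pairs)


# Toy data: two tight clusters, neighbours inside clusters, noise across them.
Y_good = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
Y_bad = Y_good[[0, 2, 1, 3]]  # same points, but neighbours are now far apart
pos = np.array([[0, 1], [2, 3]])
neg = np.array([[0, 2], [1, 3], [0, 3], [1, 2]])

good_loss = nce_loss(Y_good, pos, neg)
bad_loss = nce_loss(Y_bad, pos, neg)
print(good_loss < bad_loss)
```

An embedding that keeps neighbours close and noise pairs apart gets the lower loss. Because each update only needs a handful of sampled noise pairs rather than a sum over all pairs, objectives of this kind tend to scale well to large datasets.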

Code Implementations: 1 repo