CL, LG, ML | Jan 11, 2020

Embedding Compression with Isotropic Iterative Quantization

arXiv:2001.05314v2 | 17 citations
AI Analysis

This paper addresses memory constraints for NLP models on resource-limited platforms. It is an incremental improvement that adapts iterative quantization, a technique from image retrieval, to word embeddings.

The paper tackles the memory cost of word embeddings in NLP by proposing isotropic iterative quantization (IIQ), which compresses them into binary vectors. On pre-trained GloVe and HDC embeddings it achieves more than thirty-fold compression with performance comparable to, and sometimes better than, the original real-valued vectors.

Continuous representation of words is a standard component in deep-learning-based NLP models. However, representing a large vocabulary requires significant memory, which can cause problems, particularly on resource-constrained platforms. Therefore, in this paper we propose an isotropic iterative quantization (IIQ) approach for compressing embedding vectors into binary ones, leveraging the iterative quantization technique well established for image retrieval, while satisfying the desired isotropic property of PMI-based models. Experiments with pre-trained embeddings (i.e., GloVe and HDC) demonstrate a more than thirty-fold compression ratio with comparable and sometimes even improved performance over the original real-valued embedding vectors.
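To make the binarization idea concrete, here is a minimal sketch of plain iterative quantization (ITQ), the image-retrieval technique the abstract says IIQ builds on. It is not the paper's IIQ method: the isotropy step is not described on this page and is omitted. The function name `itq_binarize`, its parameters, the mean-centering, and the PCA projection are all assumptions chosen for illustration.

```python
import numpy as np


def itq_binarize(X, n_bits, n_iter=50, seed=0):
    """Sketch of standard ITQ (hypothetical helper, not the paper's IIQ).

    X      : (n_words, dim) real-valued embedding matrix
    n_bits : code length, assumed <= min(n_words, dim)
    Returns bit-packed codes of shape (n_words, ceil(n_bits / 8))
    and the learned orthogonal rotation R of shape (n_bits, n_bits).
    """
    rng = np.random.default_rng(seed)

    # Assumption: center the data, then project onto the top principal axes.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Xc @ Vt[:n_bits].T

    # Start from a random orthogonal matrix (QR of a Gaussian matrix).
    R, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))

    for _ in range(n_iter):
        # Step 1: fix R, assign each coordinate to the nearer of -1 / +1.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Step 2: fix B, solve the orthogonal Procrustes problem
        # min_R ||B - V R||_F, whose solution is U @ Wt for V^T B = U S Wt.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt

    codes = (V @ R) >= 0
    return np.packbits(codes, axis=1), R
```

As a rough sanity check on the compression figure: storing one bit per dimension instead of a 32-bit float is a 32-fold reduction when the code length equals the original dimension, which is in line with the "more than thirty-fold" ratio quoted above. The exact configuration used in the paper is not stated here.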
