LG · CL · Apr 22, 2021

On Geodesic Distances and Contextual Embedding Compression for Text Classification

arXiv:2104.11295v1726 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for smaller embeddings on IoT devices and in data pipelines. It offers a novel geometric approach to compression, though the advance is incremental because it builds on existing methods.

The paper tackles the problem of compressing contextual embeddings for memory-constrained settings by projecting BERT embeddings onto a manifold using a combination of Isomap and PCA. On one dataset, the compressed embeddings performed within 0.1% of the original BERT embeddings on a downstream classification task despite a 12-fold dimensionality reduction.

In some memory-constrained settings like IoT devices and over-the-network data pipelines, it can be advantageous to have smaller contextual embeddings. We investigate the efficacy of projecting contextual embedding data (BERT) onto a manifold, and using nonlinear dimensionality reduction techniques to compress these embeddings. In particular, we propose a novel post-processing approach, applying a combination of Isomap and PCA. We find that the geodesic distance estimations, estimates of the shortest path on a Riemannian manifold, from Isomap's k-Nearest Neighbors graph bolstered the performance of the compressed embeddings to be comparable to the original BERT embeddings. On one dataset, we find that despite a 12-fold dimensionality reduction, the compressed embeddings performed within 0.1% of the original BERT embeddings on a downstream classification task. In addition, we find that this approach works particularly well on tasks reliant on syntactic data, when compared with linear dimensionality reduction. These results show promise for a novel geometric approach to achieve lower dimensional text embeddings from existing transformers and pave the way for data-specific and application-specific embedding compressions.
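The geodesic distance estimation mentioned in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the toy data (points on a semicircle), the helper names `knn_graph` and `geodesic_distances`, and the choice of `k=2` are all assumptions made for illustration. It shows only the idea that shortest paths through a k-Nearest Neighbors graph approximate distance along a curved manifold, whereas straight-line Euclidean distance cuts across it.

```python
import heapq
import math


def knn_graph(points, k):
    """Build an undirected k-nearest-neighbour graph weighted by Euclidean distance."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrise so every edge can be walked both ways
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` along the graph edges."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best


# Toy manifold (an assumption for illustration): 50 points on a unit semicircle.
points = [(math.cos(math.pi * t / 49), math.sin(math.pi * t / 49)) for t in range(50)]
graph = knn_graph(points, k=2)
geo = geodesic_distances(graph, source=0)

euclidean_end_to_end = math.dist(points[0], points[-1])  # straight chord, 2.0
geodesic_end_to_end = geo[49]  # path along the arc, close to pi

print(f"Euclidean: {euclidean_end_to_end:.3f}")
print(f"Geodesic estimate: {geodesic_end_to_end:.3f}")
```

In this toy case the two endpoints are 2.0 apart in a straight line, but about 3.14 apart along the curve, and the graph-based estimate recovers the latter. Isomap computes such graph distances for all pairs of points and then finds a low-dimensional embedding that preserves them. How the paper combines that step with PCA is not specified in the abstract.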

Code Implementations: 1 repo
