LG, May 18, 2023

Information-Ordered Bottlenecks for Adaptive Semantic Compression

arXiv:2305.11213v1, 6 citations
Originality: Incremental advance
AI Analysis

This paper provides a method for adaptive compression and exploratory analysis in machine learning, though it appears incremental because it unifies previous approaches.

The paper tackles adaptive semantic compression by introducing the information-ordered bottleneck (IOB), a neural layer that compresses data into ordered latent variables and can be truncated to any bottleneck width without retraining. The authors report near-optimal compression for a given encoding architecture and a semantically meaningful ordering of latents for image and text data.

We present the information-ordered bottleneck (IOB), a neural layer designed to adaptively compress data into latent variables ordered by likelihood maximization. Without retraining, IOB nodes can be truncated at any bottleneck width, capturing the most crucial information in the first latent variables. Unifying several previous approaches, we show that IOBs achieve near-optimal compression for a given encoding architecture and can assign ordering to latent signals in a manner that is semantically meaningful. IOBs demonstrate a remarkable ability to compress embeddings of image and text data, leveraging the performance of SOTA architectures such as CNNs, transformers, and diffusion models. Moreover, we introduce a novel theory for estimating global intrinsic dimensionality with IOBs and show that they recover SOTA dimensionality estimates for complex synthetic data. Furthermore, we showcase the utility of these models for exploratory analysis through applications on heterogeneous datasets, enabling computer-aided discovery of dataset complexity.
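
The truncation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it uses PCA as a linear stand-in for a trained encoder and decoder, because PCA also yields latents ordered by importance. The helper `truncate_latents`, the synthetic data, and the `encode`/`decode` functions are all assumptions introduced for illustration.

```python
import numpy as np

def truncate_latents(z, width):
    """Zero every latent variable at index >= width (hypothetical helper)."""
    mask = np.arange(z.shape[-1]) < width
    return z * mask

# Synthetic data (assumed): 3 dominant directions embedded in 8 dimensions, plus small noise.
rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 8))
x = rng.normal(size=(500, 3)) @ basis + 0.01 * rng.normal(size=(500, 8))
x = x - x.mean(axis=0)

# PCA stand-in: rows of vt are principal directions, ordered by explained variance.
_, _, vt = np.linalg.svd(x, full_matrices=False)

def encode(x):
    return x @ vt.T

def decode(z):
    return z @ vt

# Encode once, then truncate at every width without refitting anything.
z = encode(x)
errors = [
    float(np.mean((x - decode(truncate_latents(z, k))) ** 2))
    for k in range(z.shape[1] + 1)
]
```

In this linear stand-in, the reconstruction error falls as the width grows and flattens once the width reaches the three dominant directions. A trained IOB would replace `encode` and `decode` with neural networks; the ordering would then come from its training objective rather than from an SVD.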
