COIN++: Neural Compression Across Modalities
This work addresses the need for a unified compression method across multiple data modalities, offering a novel approach that could benefit applications in multimedia, medical imaging, and climate science.
The paper addresses the reliance of neural compression on specialized architectures for each data modality by proposing COIN++, a framework that uses implicit neural representations and meta-learned modulations to handle diverse data types. It achieves large compression gains and reduces encoding time by two orders of magnitude compared to baselines.
Neural compression algorithms are typically based on autoencoders that require specialized encoder and decoder architectures for different data modalities. In this paper, we propose COIN++, a neural compression framework that seamlessly handles a wide range of data modalities. Our approach is based on converting data to implicit neural representations, i.e. neural functions that map coordinates (such as pixel locations) to features (such as RGB values). Then, instead of storing the weights of the implicit neural representation directly, we store modulations applied to a meta-learned base network as a compressed code for the data. We further quantize and entropy code these modulations, leading to large compression gains while reducing encoding time by two orders of magnitude compared to baselines. We empirically demonstrate the feasibility of our method by compressing various data modalities, from images and audio to medical and climate data.
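The pipeline described above (coordinates in, features out, with per-datum modulations stored instead of network weights) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption for illustration and is not taken from the abstract: the layer sizes, the sine activation, applying modulations as additive shifts to hidden units, the uniform 5-bit quantizer, and the names `base`, `decode`, `quantize`, and `dequantize`. The base weights are random stand-ins for a meta-learned network, the modulations are random placeholders for values that would be fitted to a datum, and entropy coding is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared base network: 2-D coordinates -> 8 -> 8 -> 3 features (RGB).
# In the method these weights would be meta-learned; here they are random stand-ins.
sizes = [2, 8, 8, 3]
base = [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])]

def decode(coords, modulations):
    """Map coordinates to features; modulations shift each hidden layer (assumed form)."""
    h = coords
    for i, (W, b) in enumerate(base[:-1]):
        h = np.sin(h @ W + b + modulations[i])  # one shift per hidden unit
    W, b = base[-1]
    return h @ W + b

def quantize(x, bits=5, lo=-1.0, hi=1.0):
    """Uniformly quantize values in [lo, hi] to integer codes in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    x = np.clip(x, lo, hi)
    return np.round((x - lo) / (hi - lo) * levels).astype(np.int64)

def dequantize(q, bits=5, lo=-1.0, hi=1.0):
    """Map integer codes back to approximate real values."""
    levels = 2 ** bits - 1
    return q / levels * (hi - lo) + lo

# Coordinates of a 4x4 pixel grid, normalized to [-1, 1].
ys, xs = np.meshgrid(np.linspace(-1, 1, 4), np.linspace(-1, 1, 4), indexing="ij")
coords = np.stack([ys.ravel(), xs.ravel()], axis=-1)  # shape (16, 2)

# Placeholder modulations; in practice these would be fitted to a specific datum.
modulations = rng.uniform(-1.0, 1.0, (2, 8))

codes = quantize(modulations)             # integer codes: the stored representation
recon = decode(coords, dequantize(codes)) # features reconstructed from the codes
```

In this sketch only `codes` (16 small integers) would need to be stored per datum, while `base` is shared across all data of that modality; this is the sense in which modulations act as the compressed code.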