Modality-Agnostic Variational Compression of Implicit Neural Representations
The work provides a single compression method for multiple data types, addressing efficiency and versatility in storage and transmission, though it builds incrementally on existing INR techniques.
The paper tackles the problem of compressing diverse data modalities with a single algorithm. It introduces VC-INR, a modality-agnostic neural compression method based on implicit neural representations, which outperforms established codecs such as JPEG 2000, MP3 and AVC/HEVC on images, audio and video respectively.
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR). Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism. This allows the specialisation of a shared INR network to each data item through subnetwork selection. After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression. Variational Compression of Implicit Neural Representations (VC-INR) shows improved performance given the same representational capacity pre-quantisation while also outperforming previous quantisation schemes used for other INR techniques. Our experiments demonstrate strong results over a large set of diverse modalities using the same algorithm without any modality-specific inductive biases. We show results on images, climate data, 3D shapes and scenes as well as audio and video, introducing VC-INR as the first INR-based method to outperform codecs as well known and diverse as JPEG 2000, MP3 and AVC/HEVC on their respective modalities.
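The soft gating idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the layer sizes, the sine activations, the two-layer latent-to-gate map and the names `init_params`, `gates_from_latent` and `inr_forward` are all assumptions made for illustration. It shows only how a compact per-item latent `z` could be mapped non-linearly to gates in (0, 1) that scale the hidden units of a shared INR, so that each data item softly selects its own subnetwork.

```python
import numpy as np

def init_params(rng, latent_dim=8, hidden=32, in_dim=2, out_dim=3):
    """Random weights for a shared INR and a latent-to-gate map (sizes are illustrative)."""
    def dense(m, n):
        return rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
    return {
        # Shared INR: coordinates -> signal values (e.g. (x, y) -> RGB).
        "W1": dense(in_dim, hidden), "b1": np.zeros(hidden),
        "W2": dense(hidden, hidden), "b2": np.zeros(hidden),
        "W3": dense(hidden, out_dim), "b3": np.zeros(out_dim),
        # Hypothetical non-linear map: latent -> gate logits for both hidden layers.
        "G1": dense(latent_dim, hidden), "g1": np.zeros(hidden),
        "G2": dense(hidden, 2 * hidden), "g2": np.zeros(2 * hidden),
    }

def gates_from_latent(params, z):
    """Map a latent vector to one soft gate in (0, 1) per hidden unit."""
    h = np.tanh(z @ params["G1"] + params["g1"])
    logits = h @ params["G2"] + params["g2"]
    gates = 1.0 / (1.0 + np.exp(-logits))  # sigmoid keeps the gates soft
    return np.split(gates, 2)              # one gate vector per hidden layer

def inr_forward(params, coords, z):
    """Evaluate the shared INR at `coords`, specialised to one data item by `z`."""
    g1, g2 = gates_from_latent(params, z)
    h = np.sin(coords @ params["W1"] + params["b1"]) * g1  # gate layer 1 units
    h = np.sin(h @ params["W2"] + params["b2"]) * g2       # gate layer 2 units
    return h @ params["W3"] + params["b3"]

rng = np.random.default_rng(0)
params = init_params(rng)
coords = rng.uniform(-1.0, 1.0, size=(5, 2))  # five query coordinates
z = rng.normal(size=8)                        # latent for one data item
values = inr_forward(params, coords, z)       # array of shape (5, 3)
```

In this sketch only `z` would need to be stored per data item, while the INR weights are shared across the dataset. The quantisation and entropy coding of the latents under a rate/distortion objective, which the abstract describes as the neural compression stage, is not shown.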