A Geometric Lens on Physics-Aligned Data Compression
For researchers in AI for Science, this paper provides a theoretical framework and a diagnostic tool for understanding and predicting tradeoffs in physics-aligned compression, though the theory is domain-specific and incremental.
The paper develops a geometric theory explaining the tradeoff between preserving a target physical observable and standard reconstruction fidelity in physics-informed learned compression, showing that misaligned latent-space directions impose a fundamental limit on simultaneous preservation. Experiments across scientific domains validate the theory and a proposed alignment diagnostic.
In AI for Science, physics-informed losses are increasingly used to train learned compressors for scientific data, but their rate-distortion implications remain poorly understood. At fixed bitrate, these objectives often improve preservation of a target physical observable while degrading standard reconstruction fidelity. We develop a local geometric theory showing that this tradeoff is governed by the interaction of latent-space sensitivities induced by the entropy model, the physical observable, and the distortion metric. At each operating point, these sensitivities induce preferred directions along which compression noise should be suppressed, yielding an anisotropic error-allocation mechanism. When these directions are misaligned, improving the observable at fixed rate necessarily worsens standard distortion, establishing a fundamental limit on simultaneous preservation. We formalise this through a local tangent-space rate-distortion law and introduce a practical alignment diagnostic based on dominant eigenspace overlap. Experiments across scientific domains test the theory and show that the alignment diagnostic correlates with the observed tradeoffs between data-space and physics-space fidelity.
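The abstract does not spell out how the eigenspace-overlap diagnostic is computed, so the following is only a minimal sketch of one common way to measure overlap between dominant eigenspaces, not the paper's definition. It assumes each latent-space sensitivity is available as a symmetric positive semi-definite matrix, and scores alignment as the mean squared cosine of the principal angles between the two top-k eigenspaces (1 for identical subspaces, 0 for orthogonal ones). The function names, the matrices `G_obs`, `G_dist_aligned`, and `G_dist_misaligned`, and the choice of `k` are all hypothetical.

```python
import numpy as np


def dominant_eigenspace(metric, k):
    """Orthonormal basis (d x k) of the top-k eigenspace of a symmetric matrix."""
    # eigh returns eigenvalues in ascending order, so the last k columns
    # are the eigenvectors with the largest eigenvalues.
    _, eigvecs = np.linalg.eigh(metric)
    return eigvecs[:, -k:]


def eigenspace_overlap(metric_a, metric_b, k):
    """Overlap in [0, 1] between the top-k eigenspaces of two symmetric matrices.

    Assumed definition (not taken from the paper): ||U_a^T U_b||_F^2 / k,
    i.e. the mean squared cosine of the principal angles.
    """
    basis_a = dominant_eigenspace(metric_a, k)
    basis_b = dominant_eigenspace(metric_b, k)
    return float(np.linalg.norm(basis_a.T @ basis_b, "fro") ** 2 / k)


# Toy sensitivities in a 4-dimensional latent space (hypothetical values).
G_obs = np.diag([10.0, 5.0, 0.1, 0.1])              # observable sensitivity
G_dist_aligned = np.diag([8.0, 6.0, 0.2, 0.1])      # same dominant directions
G_dist_misaligned = np.diag([0.1, 0.2, 6.0, 8.0])   # orthogonal dominant directions

aligned = eigenspace_overlap(G_obs, G_dist_aligned, k=2)
misaligned = eigenspace_overlap(G_obs, G_dist_misaligned, k=2)
print(f"aligned overlap:    {aligned:.3f}")
print(f"misaligned overlap: {misaligned:.3f}")
```

Under this assumed definition, a score near 1 would correspond to the regime where suppressing noise for the observable also helps standard distortion, and a score near 0 to the misaligned regime where the abstract predicts a forced tradeoff at fixed rate.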