FLaTEC: Frequency-Disentangled Latent Triplanes for Efficient Compression of LiDAR Point Clouds
This work addresses efficient compression for LiDAR point clouds, which is crucial for applications like autonomous driving, and represents an incremental advancement with a novel method for a known bottleneck.
The paper tackled the problem of balancing compression ratio and reconstruction quality in LiDAR point cloud compression by proposing FLaTEC, a frequency-aware model that decouples low-frequency structures and high-frequency textures, achieving state-of-the-art rate-distortion performance with 78% and 94% BD-rate improvements on SemanticKITTI and Ford datasets.
Point cloud compression methods jointly optimize bitrates and reconstruction distortion. However, balancing compression ratio and reconstruction quality is difficult because low-frequency and high-frequency components contribute differently at the same resolution. To address this, we propose FLaTEC, a frequency-aware compression model that enables the compression of a full scan with high compression ratios. Our approach introduces a frequency-aware mechanism that decouples low-frequency structures and high-frequency textures, while hybridizing latent triplanes as a compact proxy for point cloud. Specifically, we convert voxelized embeddings into triplane representations to reduce sparsity, computational cost, and storage requirements. We then devise a frequency-disentangling technique that extracts compact low-frequency content while collecting high-frequency details across scales. The decoupled low-frequency and high-frequency components are stored in binary format. During decoding, full-spectrum signals are progressively recovered via a modulation block. Additionally, to compensate for the loss of 3D correlation, we introduce an efficient frequency-based attention mechanism that fosters local connectivity and outputs arbitrary resolution points. Our method achieves state-of-the-art rate-distortion performance and outperforms the standard codecs by 78\% and 94\% in BD-rate on both SemanticKITTI and Ford datasets.