Deep Unrolling of Sparsity-Induced RDO for 3D Point Cloud Attribute Coding
This work addresses efficient compression of 3D point cloud attributes for applications such as virtual reality and autonomous driving, and represents an incremental advance in multi-resolution coding methods.
The paper tackles lossy attribute compression for 3D point clouds by projecting attributes onto nested B-spline subspaces and optimizing the coefficients with a differentiable, sparsity-induced rate-distortion approach; it aims to improve compression efficiency through these optimized coefficients and a data-driven coarse-to-fine prediction adjustment.
Given encoded 3D point cloud geometry available at the decoder, we study the problem of lossy attribute compression in a multi-resolution B-spline projection framework. A target continuous 3D attribute function is first projected onto a sequence of nested subspaces $\mathcal{F}^{(p)}_{l_0} \subseteq \cdots \subseteq \mathcal{F}^{(p)}_{L}$, where $\mathcal{F}^{(p)}_{l}$ is a family of functions spanned by a B-spline basis function of order $p$ at a chosen scale and its integer shifts. The projected low-pass coefficients $F_l^*$ are computed by variable-complexity unrolling of a rate-distortion (RD) optimization algorithm into a feed-forward network, where the rate term is the sparsity-promoting $\ell_1$-norm. Thus, the projection operation is end-to-end differentiable. For a chosen coarse-to-fine predictor, the coefficients are then adjusted to account for the prediction from a lower resolution to a higher resolution, which is also optimized in a data-driven manner.
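To make the idea of unrolling an $\ell_1$-regularized optimization into a fixed-depth feed-forward computation concrete, the following is a minimal sketch and not the method of this work. It assumes the generic objective $\tfrac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1$ and uses ISTA (iterative soft-thresholding) as a stand-in solver; the abstract does not specify the RD algorithm that is actually unrolled. The names `unrolled_ista`, `A`, `y`, `lam`, `step`, and `num_layers` are hypothetical.

```python
# Minimal sketch: unrolled ISTA for  min_x 0.5*||y - A x||^2 + lam*||x||_1.
# ISTA is an assumed stand-in for the RD optimizer; all names are hypothetical.

def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def unrolled_ista(A, y, lam, step, num_layers):
    """Run a fixed number of ISTA iterations; each iteration is one 'layer'."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(num_layers):
        residual = [p - q for p, q in zip(matvec(A, x), y)]  # A x - y
        grad = matvec(At, residual)                          # gradient of the quadratic term
        x = [soft_threshold(xi - step * gi, step * lam)      # gradient step, then shrinkage
             for xi, gi in zip(x, grad)]
    return x

def objective(A, y, x, lam):
    """Quadratic distortion plus l1 rate proxy."""
    r = [p - q for p, q in zip(matvec(A, x), y)]
    return 0.5 * sum(v * v for v in r) + lam * sum(abs(v) for v in x)

if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0]]
    y = [3.0, 0.5]
    # With A = I and step = 1, one layer already soft-thresholds y by lam.
    print(unrolled_ista(A, y, lam=1.0, step=1.0, num_layers=1))
```

In this sketch, `num_layers` plays the role of the variable-complexity knob: more layers mean more iterations of the same update. Every operation is differentiable almost everywhere, so in a learned variant quantities such as `step` and `lam` could be made trainable per layer; that extension is an assumption here and is not shown.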