CV | Feb 19, 2025

Point Cloud Geometry Scalable Coding Using a Resolution and Quality-conditioned Latents Probability Estimator

Daniele Mari, André F. R. Guarda, Nuno M. M. Rodrigues, Simone Milani, Fernando Pereira

arXiv:2502.14099v1 | 3.6 | h-index: 11 | IEEE Access

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient multimedia consumption in heterogeneous scenarios by offering a scalable solution for deep learning-based point cloud coding, though it appears incremental because it builds on the JPEG Pleno learning-based PC coding standard.

The paper tackles the problem of scalable coding for point clouds, in which a single bitstream serves multiple quality and resolution requirements, and shows that its SRQH method achieves this with only a limited rate-distortion penalty compared to non-scalable methods.

In the current age, users consume multimedia content in very heterogeneous scenarios in terms of network, hardware, and display capabilities. A naive solution to this problem is to encode multiple independent streams, each covering a different possible client requirement, with an obvious negative impact on both storage and computational requirements. These drawbacks can be avoided by using codecs that enable scalability, i.e., the ability to generate a progressive bitstream, containing a base layer followed by multiple enhancement layers, so that a single bitstream can be decoded into multiple reconstructions that meet different visualization specifications. While scalable coding is a well-known and well-addressed feature in conventional image and video codecs, this paper focuses on a new and very different problem, notably the development of scalable coding solutions for deep learning-based Point Cloud (PC) coding. The peculiarities of this 3D representation make it hard to implement flexible solutions that do not compromise the other functionalities of the codec. This paper proposes a joint quality and resolution scalability scheme, named Scalable Resolution and Quality Hyperprior (SRQH), that, contrary to previous solutions, can model the relationship between latents obtained with models trained for different rate-distortion (RD) tradeoffs and/or at different resolutions. Experimental results obtained by integrating SRQH into the emerging JPEG Pleno learning-based PC coding standard show that SRQH allows decoding the PC at different qualities and resolutions from a single bitstream while incurring only a limited RD penalty and a limited increase in complexity with respect to non-scalable JPEG PCC, which would require one bitstream per coding configuration.
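
The intuition behind conditioning a probability estimator on already-decoded latents can be illustrated with a toy sketch. This is not the SRQH architecture: the synthetic integer "latents", the helper names `entropy_bits` and `conditional_entropy_bits`, and the correlation model are all assumptions made for illustration. The sketch compares the ideal code length of enhancement-layer symbols coded on their own with the code length obtained when each symbol is coded using statistics conditioned on the co-located base-layer symbol.

```python
import math
import random
from collections import Counter


def entropy_bits(symbols):
    """Total ideal code length in bits under the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())


def conditional_entropy_bits(symbols, context):
    """Total ideal code length in bits when each symbol is coded with the
    empirical distribution of the symbols that share its context value."""
    groups = {}
    for s, c in zip(symbols, context):
        groups.setdefault(c, []).append(s)
    return sum(entropy_bits(g) for g in groups.values())


# Hypothetical quantized latents: the enhancement layer is a finer version of
# the base layer plus a small refinement, so the two layers are correlated.
random.seed(0)
base = [random.randint(-3, 3) for _ in range(5000)]
enhancement = [2 * b + random.choice([-1, 0, 1]) for b in base]

independent_bits = entropy_bits(enhancement)
conditioned_bits = conditional_entropy_bits(enhancement, base)

print(f"enhancement coded independently:  {independent_bits:.0f} bits")
print(f"enhancement conditioned on base:  {conditioned_bits:.0f} bits")
```

When the layers are correlated, the conditioned code length is smaller, which is the general reason a scalable bitstream can approach the rate of a non-scalable one. In a learned codec the conditional distribution would be predicted by a neural network rather than counted from data.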
