Scalable Human-Machine Point Cloud Compression
This work addresses the challenge of limited computational resources on edge devices for point cloud processing, though the contribution is incremental because it builds on existing methods such as PointNet++.
The paper tackles the problem of compressing point cloud data for efficient deep learning inference on edge devices. It introduces a scalable codec that is specialized for machine classification while also supporting human viewing, and it reports significant improvements over prior non-specialized codecs on the ModelNet40 dataset.
Due to the limited computational capabilities of edge devices, deep learning inference can be quite expensive. One remedy is to compress and transmit point cloud data over the network for server-side processing. Unfortunately, this approach can be sensitive to network factors, including available bitrate. Luckily, the bitrate requirements can be reduced without sacrificing inference accuracy by using a machine task-specialized codec. In this paper, we present a scalable codec for point cloud data that is specialized for the machine task of classification, while also providing a mechanism for human viewing. In the proposed scalable codec, the "base" bitstream supports the machine task, and an optional "enhancement" bitstream can be decoded alongside it to improve input reconstruction quality for human viewing. We base our architecture on PointNet++, and test its efficacy on the ModelNet40 dataset. We show significant improvements over prior non-specialized codecs.
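The base/enhancement split described above can be illustrated with a toy layered coder. This is only a minimal sketch of the scalability idea and not the codec proposed in the paper: fixed scalar quantization stands in for the learned PointNet++-based transforms, the base layer here carries coarse coordinates rather than classifier features, and the names `encode`, `decode_base`, `decode_full`, `COARSE_STEP`, and `FINE_STEP` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Code = Tuple[int, ...]

# Hypothetical quantization steps, chosen only for illustration.
COARSE_STEP = 0.25  # base layer: cheap, always transmitted
FINE_STEP = 0.01    # enhancement layer: optional refinement


def encode(points: List[Point]) -> Tuple[List[Code], List[Code]]:
    """Split a point cloud into (base, enhancement) integer payloads."""
    # Base layer: coarse quantization of each coordinate.
    base = [tuple(round(c / COARSE_STEP) for c in p) for p in points]
    # Enhancement layer: finely quantized residual left over by the base layer.
    enhancement = [
        tuple(round((c - b * COARSE_STEP) / FINE_STEP) for c, b in zip(p, q))
        for p, q in zip(points, base)
    ]
    return base, enhancement


def decode_base(base: List[Code]) -> List[Point]:
    """Decode the base layer alone (the input a machine task would consume)."""
    return [tuple(b * COARSE_STEP for b in q) for q in base]


def decode_full(base: List[Code], enhancement: List[Code]) -> List[Point]:
    """Decode base plus enhancement for a higher-fidelity reconstruction."""
    return [
        tuple(b * COARSE_STEP + e * FINE_STEP for b, e in zip(q, r))
        for q, r in zip(base, enhancement)
    ]


def max_error(a: List[Point], b: List[Point]) -> float:
    """Largest per-coordinate absolute difference between two point lists."""
    return max(abs(x - y) for p, q in zip(a, b) for x, y in zip(p, q))
```

In this sketch a receiver that only needs the machine task would call `decode_base` and skip the enhancement payload entirely, while a viewer would request both payloads and call `decode_full`. The design point it shares with the abstract is that the enhancement layer refines the base layer instead of duplicating it, so the machine-only path never pays for reconstruction detail.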