CV | Nov 14, 2022

Learning Latent Part-Whole Hierarchies for Point Clouds

arXiv:2211.07082v1 | h-index: 40
Originality: Incremental advance
AI Analysis

This work addresses the need for more expressive and interpretable 3D vision models, particularly for point cloud segmentation. The advance is incremental because it builds on existing latent variable and segmentation methods.

The paper tackles the explicit modeling of part-whole hierarchies in point clouds, a capability that standard deep neural networks lack. It proposes a weakly supervised latent variable model and reports state-of-the-art performance on the PartNet dataset in both top-level part segmentation and middle-level latent subpart segmentation.

Strong evidence suggests that humans perceive the 3D world by parsing visual scenes and objects into part-whole hierarchies. Although deep neural networks are capable of learning powerful multi-level representations, they cannot explicitly model part-whole hierarchies, which limits their expressiveness and interpretability in processing 3D vision data such as point clouds. To this end, we propose an encoder-decoder style latent variable model that explicitly learns part-whole hierarchies for multi-level point cloud segmentation. Specifically, the encoder takes a point cloud as input and predicts the per-point latent subpart distribution at the middle level. The decoder takes the latent variable and the feature from the encoder as input and predicts the per-point part distribution at the top level. During training, only annotated part labels at the top level are provided, which makes the whole framework weakly supervised. We explore two kinds of approximate inference algorithms, i.e., most-probable-latent and Monte Carlo methods, and three stochastic gradient estimators for learning discrete latent variables, i.e., straight-through, REINFORCE, and pathwise estimators. Experimental results on the PartNet dataset show that the proposed method achieves state-of-the-art performance in not only top-level part segmentation but also middle-level latent subpart segmentation.
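
To make the two approximate inference schemes named in the abstract concrete, here is a minimal sketch for a single point. It assumes the encoder output is a categorical distribution q(z|x) over latent subparts and the decoder output is a table p(y|z,x) over top-level parts. The function names, the toy probabilities, and the table form of the decoder are illustrative assumptions, not the paper's implementation.

```python
import random


def exact_marginal(q_z, p_y_given_z):
    """Exact p(y|x) = sum_z q(z|x) * p(y|z,x); feasible only for small toy cases."""
    num_parts = len(p_y_given_z[0])
    return [
        sum(q_z[z] * p_y_given_z[z][y] for z in range(len(q_z)))
        for y in range(num_parts)
    ]


def most_probable_latent(q_z, p_y_given_z):
    """Approximate p(y|x) by plugging in the single most likely subpart z*."""
    z_star = max(range(len(q_z)), key=lambda k: q_z[k])
    return list(p_y_given_z[z_star])


def monte_carlo(q_z, p_y_given_z, num_samples, rng):
    """Approximate p(y|x) by averaging p(y|z,x) over samples z ~ q(z|x)."""
    num_parts = len(p_y_given_z[0])
    total = [0.0] * num_parts
    samples = rng.choices(range(len(q_z)), weights=q_z, k=num_samples)
    for z in samples:
        for y in range(num_parts):
            total[y] += p_y_given_z[z][y]
    return [t / num_samples for t in total]


# Hypothetical toy values: 3 latent subparts, 2 top-level parts, one point.
q_z = [0.6, 0.3, 0.1]            # assumed encoder output q(z|x)
p_y_given_z = [                  # assumed decoder output p(y|z,x), one row per z
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]

rng = random.Random(0)
exact = exact_marginal(q_z, p_y_given_z)
mpl = most_probable_latent(q_z, p_y_given_z)
mc = monte_carlo(q_z, p_y_given_z, num_samples=10000, rng=rng)

print("exact marginal       :", exact)
print("most-probable-latent :", mpl)
print("Monte Carlo estimate :", mc)
```

In this toy setting the most-probable-latent approximation costs a single decoder evaluation but ignores the probability mass on the other subparts, while the Monte Carlo estimate approaches the exact marginal as the number of samples grows.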

