CV · May 25, 2022

Primitive3D: 3D Object Dataset Synthesis from Randomly Assembled Primitives

arXiv:2205.12627v1 · 6 citations · h-index: 40
Originality: Highly original
AI Analysis

The method offers 3D computer vision researchers and practitioners a cost-effective way around data scarcity, though its contribution, automating dataset creation, is incremental.

The paper tackles the high cost of 3D object datasets by proposing a method that automatically generates large numbers of annotated 3D objects through random assembly of primitives. Pretraining on the generated dataset outperforms pretraining on other commonly used datasets in downstream 3D object classification, and the accompanying dataset distillation strategy saves 86% of pretraining time with negligible performance degradation.

Numerous advancements in deep learning can be attributed to access to large-scale, well-annotated datasets. However, such datasets are prohibitively expensive in 3D computer vision due to the substantial collection cost. To alleviate this issue, we propose a cost-effective method for automatically generating a large number of 3D objects with annotations. In particular, we synthesize objects simply by assembling multiple random primitives. These objects are thus auto-annotated with part labels originating from the primitives. This allows us to perform multi-task learning by combining supervised segmentation with unsupervised reconstruction. Considering the large overhead of learning on the generated dataset, we further propose a dataset distillation strategy to remove samples that are redundant with respect to a target dataset. We conduct extensive experiments on the downstream task of 3D object classification. The results indicate that our dataset, together with multi-task pretraining on its annotations, achieves the best performance compared to other commonly used datasets. Further study suggests that our strategy can improve model performance through a pretraining and fine-tuning scheme, especially for small-scale datasets. In addition, pretraining with the proposed dataset distillation method can save 86% of the pretraining time with negligible performance degradation. We expect that our attempt provides a new data-centric perspective for training 3D deep models.
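To make the idea of assembling random primitives concrete, the following is a minimal sketch, not the authors' implementation. It assumes three primitive types (sphere, box, cylinder), a random per-axis scale, a rotation about the z-axis, and a random translation; none of these choices is stated in the abstract. All function names (`sample_sphere`, `random_transform`, `assemble_object`, and so on) are hypothetical. Each point is labeled with the index of the primitive it was sampled from, which is one plausible reading of "part labels originating from the primitives".

```python
import math
import random

def sample_sphere(rng, n):
    """Sample n points on the surface of a unit sphere."""
    pts = []
    for _ in range(n):
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(phi), r * math.sin(phi), z))
    return pts

def sample_box(rng, n):
    """Sample n points on the surface of the cube [-1, 1]^3."""
    pts = []
    for _ in range(n):
        p = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        p[rng.randrange(3)] = rng.choice((-1.0, 1.0))  # snap one axis to a face
        pts.append(tuple(p))
    return pts

def sample_cylinder(rng, n):
    """Sample n points on the side wall of a unit-radius cylinder, z in [-1, 1]."""
    pts = []
    for _ in range(n):
        phi = rng.uniform(0.0, 2.0 * math.pi)
        pts.append((math.cos(phi), math.sin(phi), rng.uniform(-1.0, 1.0)))
    return pts

# Assumed primitive set; the paper's actual set may differ.
PRIMITIVES = (sample_sphere, sample_box, sample_cylinder)

def random_transform(rng, pts):
    """Apply a random per-axis scale, a rotation about z, and a translation."""
    sx, sy, sz = (rng.uniform(0.3, 1.0) for _ in range(3))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    tx, ty, tz = (rng.uniform(-0.5, 0.5) for _ in range(3))
    out = []
    for x, y, z in pts:
        x, y, z = x * sx, y * sy, z * sz
        out.append((c * x - s * y + tx, s * x + c * y + ty, z + tz))
    return out

def assemble_object(num_parts=3, points_per_part=64, seed=0):
    """Build one labeled point cloud from randomly chosen, randomly placed primitives."""
    rng = random.Random(seed)
    points, labels = [], []
    for part_id in range(num_parts):
        sampler = rng.choice(PRIMITIVES)
        points.extend(random_transform(rng, sampler(rng, points_per_part)))
        labels.extend([part_id] * points_per_part)  # label = source primitive index
    return points, labels

points, labels = assemble_object(num_parts=4, points_per_part=10, seed=1)
```

Because the labels fall out of the construction, no manual annotation is needed, which is what makes the segmentation task in the multi-task pretraining supervised at no extra cost. This sketch does not remove points that end up inside another primitive, a simplification that a real pipeline would likely need to address.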
