PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction
The approach enables interpretable and editable 3D reconstruction for computer vision applications, though the contribution is incremental in that it builds on existing shape program concepts.
The paper tackles 3D shape reconstruction from images by introducing PyTorchGeoNodes, a differentiable module that parses shape programs into PyTorch code for gradient-based optimization, achieving accurate reconstructions on the ScanNet dataset.
We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects and their parameters from images using interpretable shape programs. Unlike traditional CAD model retrieval, shape programs allow reasoning about semantic parameters, support editing, and have a low memory footprint. Despite their potential, shape programs for 3D scene understanding have been largely overlooked. Our key contribution is enabling gradient-based optimization by parsing shape programs, or more precisely procedural models designed in Blender, into efficient PyTorch code. While PyTorchGeoNodes has many possible applications, we show that combining it with a genetic algorithm is a method of choice for optimizing both the discrete and the continuous parameters of a shape program for 3D reconstruction and understanding of 3D object parameters. Our modular framework can be further integrated with other reconstruction algorithms, and we demonstrate one such integration to enable procedural Gaussian splatting. Our experiments on the ScanNet dataset show that our method achieves accurate reconstructions while enabling a previously unseen level of 3D scene understanding.
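To make the split between discrete and continuous parameters concrete, the following is a minimal toy sketch, not the PyTorchGeoNodes API. It assumes a hypothetical 2D "shelf" shape program (`shelf_program`) with two continuous parameters (width, height) and one discrete parameter (number of boards), a squared Chamfer-style loss (`chamfer`), Adam for the continuous parameters (`fit_continuous`), and a very small evolutionary loop over the discrete parameter (`genetic_search`). All names, the loss, and the hyperparameters are illustrative assumptions.

```python
import random
import torch


def shelf_program(width, height, n_boards):
    """Hypothetical toy shape program: end points of n_boards horizontal boards.

    width, height are 0-dim tensors (continuous, differentiable);
    n_boards is a Python int (discrete, not differentiable).
    """
    ys = torch.linspace(0.0, 1.0, n_boards) * height
    xs = width * torch.ones_like(ys)
    left = torch.stack([torch.zeros_like(ys), ys], dim=1)
    right = torch.stack([xs, ys], dim=1)
    return torch.cat([left, right], dim=0)  # shape (2 * n_boards, 2)


def chamfer(a, b):
    """Symmetric Chamfer-style loss using squared Euclidean distances."""
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(dim=-1)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()


def fit_continuous(n_boards, target, steps=300, lr=0.05):
    """Gradient-based fit of width and height for a fixed discrete choice."""
    width = torch.tensor(1.0, requires_grad=True)
    height = torch.tensor(1.0, requires_grad=True)
    opt = torch.optim.Adam([width, height], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = chamfer(shelf_program(width, height, n_boards), target)
        loss.backward()
        opt.step()
    with torch.no_grad():
        final = chamfer(shelf_program(width, height, n_boards), target)
    return final.item(), width.item(), height.item()


def genetic_search(target, choices=range(2, 8), pop_size=4, generations=4, seed=0):
    """Tiny evolutionary loop over the discrete parameter.

    Fitness of a candidate is the loss reached after the continuous fit.
    pop_size is assumed to be even.
    """
    rng = random.Random(seed)
    choices = list(choices)
    cache = {}  # avoid refitting the same discrete candidate

    def fitness(n):
        if n not in cache:
            cache[n] = fit_continuous(n, target)
        return cache[n][0]

    population = [rng.choice(choices) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]
        # Mutate each survivor by +/-1 and clamp to the allowed range.
        children = [
            min(max(p + rng.choice([-1, 1]), choices[0]), choices[-1])
            for p in survivors
        ]
        population = survivors + children
    best = min(population, key=fitness)
    return (best, *cache[best])  # (n_boards, loss, width, height)


if __name__ == "__main__":
    # Synthetic target generated by the same toy program.
    target = shelf_program(torch.tensor(0.8), torch.tensor(1.5), 4)
    print(genetic_search(target))
```

In this sketch the gradient step only ever sees the continuous parameters, while the evolutionary loop proposes discrete candidates and scores each one by the loss reached after fitting. A real pipeline would replace the toy program with the parsed procedural model and the point-set loss with an image- or depth-based objective.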