An Efficient End-to-End 3D Voxel Reconstruction based on Neural Architecture Search
This work addresses the need for efficient and accurate 3D reconstruction in computer vision, though it appears incremental because it combines two existing techniques, neural architecture search and binary classification.
The paper tackles inefficient 3D voxel reconstruction by using neural architecture search to tailor a network architecture to each object, achieving higher accuracy with fewer parameters than existing signed distance field (SDF) or binary classification networks.
Using neural networks to represent 3D objects has become popular. However, many previous works employ neural networks with a fixed architecture and size to represent different 3D objects, which leads to excessive network parameters for simple objects and limited reconstruction accuracy for complex objects. For each 3D model, it is desirable to have an end-to-end neural network that achieves high-fidelity reconstruction with as few parameters as possible. In this paper, we propose an efficient voxel reconstruction method that combines neural architecture search (NAS) with binary classification. Taking the number of layers, the number of nodes in each layer, and the activation function of each layer as the search space, we obtain an object-specific network architecture using reinforcement learning. Furthermore, to eliminate the traditional surface reconstruction algorithms (e.g., marching cubes) applied after network inference, we make the network end-to-end by having it classify binary voxels directly. Compared to other signed distance field (SDF) prediction or binary classification networks, our method achieves significantly higher reconstruction accuracy using fewer network parameters.
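The search space and the voxel classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate values in `SEARCH_SPACE`, the helper names, and the choice of a small MLP that maps a 3D coordinate to an occupancy logit are all assumptions made for illustration. Uniform random sampling stands in for the reinforcement learning controller, and the weights are untrained, so the resulting grid is meaningless as a shape.

```python
import math
import random

# Hypothetical candidate values: the abstract names the three search
# dimensions (depth, width per layer, activation per layer) but not their ranges.
SEARCH_SPACE = {
    "num_layers": [2, 3, 4, 5],
    "width": [8, 16, 32, 64],
    "activation": ["relu", "tanh", "sine"],
}

ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "tanh": math.tanh,
    "sine": math.sin,
}


def sample_architecture(rng):
    """Pick a depth, then a (width, activation) pair for each hidden layer.

    Uniform random sampling is a stand-in for the RL controller.
    """
    depth = rng.choice(SEARCH_SPACE["num_layers"])
    return [
        (rng.choice(SEARCH_SPACE["width"]), rng.choice(SEARCH_SPACE["activation"]))
        for _ in range(depth)
    ]


def count_parameters(arch, in_dim=3, out_dim=1):
    """Count weights and biases of a fully connected network with this shape."""
    total, prev = 0, in_dim
    for width, _ in arch:
        total += prev * width + width
        prev = width
    return total + prev * out_dim + out_dim


def init_weights(arch, rng, in_dim=3, out_dim=1):
    """Create random (untrained) weights, one (matrix, bias) pair per layer."""
    dims = [in_dim] + [width for width, _ in arch] + [out_dim]
    layers = []
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def forward(arch, layers, point):
    """Return the occupancy logit for one 3D point."""
    x = list(point)
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(arch):  # hidden layers only; the output layer stays linear
            act = ACTIVATIONS[arch[i][1]]
            x = [act(v) for v in x]
    return x[0]


def classify_voxels(arch, layers, resolution):
    """Label every voxel center in [-1, 1]^3 as occupied (1) or empty (0).

    A logit above 0 is the same as a sigmoid probability above 0.5, so the
    network output is already the voxel grid and no surface extraction
    step such as marching cubes is needed afterwards.
    """
    centers = [2.0 * (k + 0.5) / resolution - 1.0 for k in range(resolution)]
    return [[[1 if forward(arch, layers, (x, y, z)) > 0.0 else 0
              for z in centers] for y in centers] for x in centers]


if __name__ == "__main__":
    rng = random.Random(0)
    arch = sample_architecture(rng)
    layers = init_weights(arch, rng)
    grid = classify_voxels(arch, layers, resolution=8)
    print("architecture:", arch)
    print("parameters:", count_parameters(arch))
    print("occupied voxels:", sum(v for plane in grid for row in plane for v in row))
```

In a full search, a controller would use each candidate's reconstruction accuracy, weighed against `count_parameters`, as the reward signal, so that simple objects end up with small networks and complex objects with larger ones.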