LC-NAS: Latency Constrained Neural Architecture Search for Point Cloud Networks
This work addresses the need for efficient point cloud networks in applications such as self-driving cars and robotics. It offers a latency-constrained NAS approach that builds on prior NAS methods by incorporating hardware-aware constraints.
The authors address the problem of designing point cloud neural architectures that meet specific latency constraints for time-critical applications. They introduce LC-NAS, a framework that searches for architectures under a target latency. Reported results include state-of-the-art performance on ModelNet40 classification with minimal computational cost, and a 10x latency reduction on PartNet part segmentation with state-of-the-art results.
Point cloud architecture design has become a crucial problem for 3D deep learning. Several efforts exist to manually design architectures with high accuracy in point cloud tasks such as classification, segmentation, and detection. Recent progress in automatic Neural Architecture Search (NAS) minimizes the human effort in network design and optimizes high-performing architectures. However, these efforts fail to consider important factors such as latency during inference. Latency is of high importance in time-critical applications such as self-driving cars, robot navigation, and mobile applications, which are generally bound by the available hardware. In this paper, we introduce a new NAS framework, dubbed LC-NAS, in which we search for point cloud architectures that are constrained to a target latency. We implement a novel latency constraint formulation to trade off accuracy and latency in our architecture search. Contrary to previous works, our latency loss guarantees that the final network achieves latency under a specified target value. This is crucial when the end task is to be deployed in a hardware-limited setting. Extensive experiments show that LC-NAS is able to find state-of-the-art architectures for point cloud classification on ModelNet40 with minimal computational cost. We also show how our searched architectures achieve any desired latency with a reasonably low drop in accuracy. Finally, we show how our searched architectures easily transfer to a different task, part segmentation on PartNet, where we achieve state-of-the-art results while lowering latency by a factor of 10.
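To make the idea of a target-latency term concrete, the following is a minimal sketch of one common way to add a latency penalty to a differentiable architecture search objective. It is not the LC-NAS formulation, which the abstract does not spell out. The per-operation latency table, the softmax relaxation over candidate operations, the hinge-style penalty, and all function names (`expected_latency`, `latency_penalty`, `total_loss`) are assumptions made for illustration. Note also that a soft penalty of this kind only discourages architectures above the target; on its own it does not give the guarantee claimed for the LC-NAS latency loss.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_latency(arch_logits, op_latencies_ms):
    """Expected latency of a relaxed architecture.

    arch_logits: one list of logits per layer, one logit per candidate op.
    op_latencies_ms: hypothetical pre-measured latency of each candidate op,
        with the same shape as arch_logits.
    """
    total = 0.0
    for logits, latencies in zip(arch_logits, op_latencies_ms):
        probs = softmax(logits)
        total += sum(p * lat for p, lat in zip(probs, latencies))
    return total


def latency_penalty(arch_logits, op_latencies_ms, target_ms):
    """Hinge penalty: zero at or below the target, linear above it."""
    ratio = expected_latency(arch_logits, op_latencies_ms) / target_ms
    return max(0.0, ratio - 1.0)


def total_loss(task_loss, arch_logits, op_latencies_ms, target_ms, weight=1.0):
    """Task loss plus a weighted latency penalty (assumed combination)."""
    penalty = latency_penalty(arch_logits, op_latencies_ms, target_ms)
    return task_loss + weight * penalty


if __name__ == "__main__":
    # Two layers, two candidate ops each, uniform architecture weights.
    logits = [[0.0, 0.0], [0.0, 0.0]]
    latencies = [[2.0, 4.0], [1.0, 3.0]]  # illustrative values in ms
    print(expected_latency(logits, latencies))
    print(total_loss(1.0, logits, latencies, target_ms=4.0, weight=2.0))
```

In this sketch the `weight` argument controls the trade-off between accuracy and latency: a larger value pushes the search toward faster candidate operations whenever the expected latency exceeds `target_ms`.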