FlowNAS: Neural Architecture Search for Optical Flow Estimation
This work addresses inefficient encoder design for optical flow estimation in computer vision, replacing handcrafted encoders with automatically searched, task-specific ones.
The paper addresses the sub-optimal use of image classification architectures as encoders for optical flow estimation by proposing FlowNAS, a neural architecture search method that automatically finds better encoder architectures. The discovered architecture achieves a 4.67% F1-all error on KITTI, an 8.4% relative reduction from the RAFT baseline, while reducing model complexity and latency.
Existing optical flow estimators usually employ network architectures designed for image classification as the encoder to extract per-pixel features. However, due to the inherent differences between the two tasks, architectures designed for image classification may be sub-optimal for flow estimation. To address this issue, we propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task. We first design a suitable search space that includes various convolutional operators and construct a weight-sharing super-network for efficiently evaluating the candidate architectures. Then, to train the super-network more effectively, we propose Feature Alignment Distillation, which uses a well-trained flow estimator to guide the training of the super-network. Finally, a resource-constrained evolutionary algorithm is used to find an optimal architecture (i.e., sub-network). Experimental results show that the discovered architecture, with weights inherited from the super-network, achieves a 4.67% F1-all error on KITTI, an 8.4% reduction relative to the RAFT baseline, surpassing the state-of-the-art handcrafted models GMA and AGFlow while reducing model complexity and latency. The source code and trained models will be released at https://github.com/VDIGPKU/FlowNAS.
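
To make the final search step more concrete, the following is a minimal sketch of a resource-constrained evolutionary search over encoder configurations. It is not the FlowNAS implementation: the search space (`SEARCH_SPACE`, `NUM_STAGES`), the resource proxy `cost`, the fitness function `proxy_error`, and all hyperparameters are hypothetical stand-ins chosen only for illustration. Candidates that exceed the resource budget are rejected, and the best candidates of each generation are kept as parents for mutation and crossover.

```python
import random

# Hypothetical search space (an assumption, not taken from the paper):
# each encoder stage picks a kernel size, an expansion ratio, and a depth.
SEARCH_SPACE = {"kernel": [3, 5, 7], "expand": [1, 2, 4], "depth": [1, 2, 3]}
GENES = ("kernel", "expand", "depth")
NUM_STAGES = 4


def sample_arch(rng):
    """Draw one random architecture: a tuple of (kernel, expand, depth) per stage."""
    return tuple(
        tuple(rng.choice(SEARCH_SPACE[g]) for g in GENES) for _ in range(NUM_STAGES)
    )


def cost(arch):
    """Toy resource proxy standing in for measured FLOPs or latency."""
    return sum(k * k * e * d for k, e, d in arch)


def proxy_error(arch):
    """Toy fitness (lower is better). A real search would instead evaluate the
    sub-network with weights inherited from the super-network."""
    return sum(1.0 / (k * e * d) for k, e, d in arch)


def mutate(arch, rng, prob=0.2):
    """Resample each gene independently with probability `prob`."""
    return tuple(
        tuple(
            rng.choice(SEARCH_SPACE[g]) if rng.random() < prob else value
            for g, value in zip(GENES, stage)
        )
        for stage in arch
    )


def crossover(a, b, rng):
    """Build a child by taking each stage from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def evolutionary_search(budget, population_size=20, generations=10, topk=5, seed=0):
    """Return the lowest-error architecture found whose cost stays within `budget`."""
    min_cost = NUM_STAGES * min(SEARCH_SPACE["kernel"]) ** 2
    if budget < min_cost:
        raise ValueError("budget is below the cheapest architecture in the space")
    rng = random.Random(seed)

    # Initial population: rejection-sample architectures that satisfy the budget.
    population = []
    while len(population) < population_size:
        cand = sample_arch(rng)
        if cost(cand) <= budget:
            population.append(cand)

    for _ in range(generations):
        # Keep the top-k candidates as parents (elitism).
        parents = sorted(population, key=proxy_error)[:topk]
        children = []
        while len(children) < population_size - topk:
            if rng.random() < 0.5:
                child = mutate(rng.choice(parents), rng)
            else:
                child = crossover(rng.choice(parents), rng.choice(parents), rng)
            # Resource constraint: discard children that exceed the budget.
            if cost(child) <= budget:
                children.append(child)
        population = parents + children

    return min(population, key=proxy_error)


if __name__ == "__main__":
    best = evolutionary_search(budget=400)
    print("best architecture:", best)
    print("cost:", cost(best), "proxy error:", round(proxy_error(best), 4))
```

In an actual setup, `proxy_error` would be replaced by a validation metric computed on the candidate sub-network, and `cost` by a measured or estimated complexity or latency figure; the overall loop structure would stay the same under these assumptions.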