FusionNet and AugmentedFlowNet: Selective Proxy Ground Truth for Training on Unlabeled Images
This work addresses data scarcity in optical flow estimation for real-world applications by offering a way to make effective use of unlabeled data.
The paper tackles the problem of training optical flow CNNs without large labeled datasets by introducing a selection mechanism to create proxy ground truth from multiple estimates, enabling training on unlabeled images. This approach improves network performance, achieving state-of-the-art results on the KITTI benchmarks.
Recent work has shown that convolutional neural networks (CNNs) can estimate optical flow with high quality and fast runtime, which makes them attractive for real-world applications. However, such networks require very large training datasets, and engineering this training data is difficult and laborious. This paper shows how to augment a network trained on an existing synthetic dataset with large amounts of additional unlabeled data. In particular, we introduce a selection mechanism that assembles a joint optical flow field from multiple estimates; this joint field outperforms each of the input methods. The joint flow field can be used as proxy ground truth to train a network on real-world data and to adapt it to specific domains of interest. Our experimental results show that the performance of networks improves considerably in both cross-domain and domain-specific scenarios. As a consequence, we obtain state-of-the-art results on the KITTI benchmarks.
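To make the idea of assembling a joint flow field from multiple estimates concrete, the following is a minimal sketch of per-pixel selection, not the paper's actual mechanism. It assumes that each candidate estimate comes with a per-pixel score where lower means more trusted; how such scores would be obtained is not specified here. The names `fuse_flows`, `endpoint_error`, and `scores` are hypothetical. In the demo, the scores are the true endpoint errors against a synthetic ground truth, an oracle that is unavailable on unlabeled data and is used only to illustrate why a per-pixel selection can do better than every individual input.

```python
import numpy as np


def endpoint_error(flow, reference):
    """Per-pixel Euclidean distance between two (H, W, 2) flow fields."""
    return np.linalg.norm(flow - reference, axis=-1)


def fuse_flows(candidates, scores):
    """Select, per pixel, the candidate flow vector with the lowest score.

    candidates: (K, H, W, 2) array holding K flow estimates.
    scores:     (K, H, W) array; lower means more trusted (hypothetical input).
    Returns the fused (H, W, 2) flow and the (H, W) map of chosen indices.
    """
    candidates = np.asarray(candidates, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    choice = np.argmin(scores, axis=0)                  # (H, W)
    index = choice[None, :, :, None]                    # (1, H, W, 1)
    fused = np.take_along_axis(candidates, index, axis=0)[0]
    return fused, choice


# Synthetic demo: three noisy estimates of a random "true" flow field.
rng = np.random.default_rng(0)
true_flow = rng.normal(size=(4, 5, 2))
candidates = np.stack(
    [true_flow + rng.normal(scale=s, size=true_flow.shape) for s in (0.5, 1.0, 2.0)]
)

# Oracle scores (assumption for illustration only): the true per-pixel error.
scores = np.stack([endpoint_error(c, true_flow) for c in candidates])

fused, choice = fuse_flows(candidates, scores)
fused_epe = endpoint_error(fused, true_flow).mean()
candidate_epes = [endpoint_error(c, true_flow).mean() for c in candidates]
```

With oracle scores, the mean endpoint error of the fused field can never exceed that of the best single candidate, because each pixel takes the smallest of the candidate errors. With scores that are only estimated, that guarantee no longer holds and the gain depends on how well the scores rank the candidates.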