DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework
This work addresses the lack of methods and benchmarks for high-resolution optical flow, which is essential for video processing tasks such as restoration and action recognition. It offers a novel method for a known bottleneck.
The paper tackles optical flow estimation for high-resolution videos up to 8K. It proposes DPFlow, an adaptive architecture that generalizes from low-resolution training to high-resolution inputs, and introduces Kubric-NK, a new benchmark for evaluation. DPFlow achieves state-of-the-art results on multiple benchmarks.
Optical flow estimation is essential for video processing tasks, such as restoration and action recognition. The quality of videos is constantly increasing, with current standards reaching 8K resolution. However, optical flow methods are usually designed for low resolution and do not generalize to large inputs due to their rigid architectures. They adopt downscaling or input tiling to reduce the input size, causing a loss of details and global information. There is also a lack of optical flow benchmarks to judge the actual performance of existing methods on high-resolution samples. Previous works only conducted qualitative high-resolution evaluations on hand-picked samples. This paper fills this gap in optical flow estimation in two ways. We propose DPFlow, an adaptive optical flow architecture capable of generalizing up to 8K resolution inputs while trained with only low-resolution samples. We also introduce Kubric-NK, a new benchmark for evaluating optical flow methods with input resolutions ranging from 1K to 8K. Our high-resolution evaluation pushes the boundaries of existing methods and reveals new insights about their generalization capabilities. Extensive experimental results show that DPFlow achieves state-of-the-art results on the MPI-Sintel, KITTI 2015, Spring, and other high-resolution benchmarks.
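To make the downscaling workaround mentioned above concrete, the following is a minimal sketch of its final step: a flow field estimated at reduced resolution is resized back to the original size. It is not part of DPFlow. The function name `upscale_flow`, the `(H, W, 2)` layout with channels ordered as horizontal then vertical displacement, and the use of nearest-neighbour resizing are all assumptions made for illustration. Because flow vectors are measured in pixels, they have to be multiplied by the resize ratio, and any motion detail smaller than one low-resolution pixel cannot be recovered by this step.

```python
import numpy as np


def upscale_flow(flow, target_hw):
    """Nearest-neighbour upsampling of an (H, W, 2) flow field.

    Hypothetical helper for illustration only. Channel 0 is assumed to be
    the horizontal displacement (u) and channel 1 the vertical one (v),
    both expressed in pixels of the low-resolution grid.
    """
    h, w = flow.shape[:2]
    th, tw = target_hw

    # For every target pixel, pick the index of the source pixel it falls in.
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    up = flow[rows[:, None], cols[None, :]].astype(np.float64)

    # Rescale displacements so they are expressed in target-resolution pixels.
    up[..., 0] *= tw / w  # horizontal component
    up[..., 1] *= th / h  # vertical component
    return up


# A constant 1-pixel motion at low resolution, resized by 2x in width and 4x in height.
low_res_flow = np.ones((2, 3, 2))
high_res_flow = upscale_flow(low_res_flow, (8, 6))
```

In this usage example the resized field has shape `(8, 6, 2)`, with the horizontal component scaled by 2 and the vertical component scaled by 4.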