GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network
GOCor addresses a key bottleneck in dense matching: point-to-point feature correlation cannot reliably disambiguate similar regions in an image. Replacing that correlation improves accuracy in tasks such as optical flow and semantic matching.
The paper tackles ambiguous dense correspondences by replacing the standard feature correlation layer with GOCor, a differentiable module whose correspondence volume is optimized to account for similar regions in the scene. Integrated into existing networks, GOCor yields significant performance improvements in geometric matching, optical flow, and dense semantic matching.
The feature correlation layer serves as a key neural network module in numerous computer vision problems that involve dense correspondences between image pairs. It predicts a correspondence volume by evaluating dense scalar products between feature vectors extracted from pairs of locations in two images. However, this point-to-point feature comparison is insufficient when disambiguating multiple similar regions in an image, severely affecting the performance of the end task. We propose GOCor, a fully differentiable dense matching module, acting as a direct replacement for the feature correlation layer. The correspondence volume generated by our module is the result of an internal optimization procedure that explicitly accounts for similar regions in the scene. Moreover, our approach is capable of effectively learning spatial matching priors to resolve further matching ambiguities. We analyze our GOCor module in extensive ablative experiments. When integrated into state-of-the-art networks, our approach significantly outperforms the feature correlation layer for the tasks of geometric matching, optical flow, and dense semantic matching. The code and trained models will be made available at github.com/PruneTruong/GOCor.
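To make the limitation concrete, the following is a minimal sketch of the standard feature correlation layer that the abstract describes, not of GOCor itself. It assumes features are given as plain lists of vectors, one per flattened image location. The names `dot`, `correlation_volume`, `ref_feats`, and `query_feats` are illustrative and do not come from the paper or its code. A practical implementation would operate on batched tensors in a deep learning framework.

```python
def dot(u, v):
    """Scalar product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def correlation_volume(query_feats, ref_feats):
    """Dense scalar products between every query/reference location pair.

    Both inputs are lists of feature vectors, one per flattened image
    location (an assumption made to keep this sketch short). Entry [i][j]
    scores query location i against reference location j.
    """
    return [[dot(q, r) for r in ref_feats] for q in query_feats]


# Toy reference image with four locations; locations 1 and 3 share the
# same feature vector, mimicking a repeated pattern in the scene.
ref_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 1.0]]

# A single query location whose feature matches the repeated pattern.
query_feats = [[0.0, 1.0]]

volume = correlation_volume(query_feats, ref_feats)
scores = volume[0]

# Collect every reference location that attains the maximum score.
best = max(scores)
candidates = [j for j, s in enumerate(scores) if s == best]

print(scores)      # [0.0, 1.0, 0.5, 1.0]
print(candidates)  # [1, 3]
```

The query scores equally against reference locations 1 and 3, so the scalar products alone cannot decide between them. This is the ambiguity that, according to the abstract, GOCor's internal optimization and learned spatial matching priors are designed to resolve.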