Leveraging AV1 motion vectors for Fast and Dense Feature Matching
The work provides a resource-efficient front end for structure-from-motion pipelines, though the contribution is incremental, since it adapts existing compression techniques.
The paper tackles dense feature matching in video by repurposing AV1 motion vectors as correspondences. On short videos it runs comparably to sequential SIFT while using far less CPU, and in a small SfM demo on a 117-frame clip it reconstructs 0.46-0.62 million points at 0.51-0.53 px reprojection error.
We repurpose AV1 motion vectors to produce dense sub-pixel correspondences and short tracks filtered by cosine consistency. On short videos, this compressed-domain front end runs comparably to sequential SIFT while using far less CPU, and yields denser matches with competitive pairwise geometry. As a small SfM demo on a 117-frame clip, MV matches register all images and reconstruct 0.46-0.62M points at 0.51-0.53 px reprojection error; BA time grows with match density. These results show compressed-domain correspondences are a practical, resource-efficient front end with clear paths to scaling in full pipelines.
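
To make the abstract's core idea concrete, the following is a minimal sketch of how a block motion vector might be turned into a sub-pixel correspondence, and how a short track might be checked for cosine consistency. It is not the authors' implementation. The function names, the 1/8-pixel motion vector units, the use of block centers as keypoints, the current-to-reference direction, and the 0.9 threshold are all assumptions made for illustration; the abstract does not specify them.

```python
import math

# Assumption: motion vectors arrive as integers in 1/8-pixel units.
MV_UNITS_PER_PIXEL = 8


def mv_to_correspondence(block_x, block_y, block_w, block_h, mv_dx, mv_dy):
    """Map one block and its motion vector to a (current, reference) point pair.

    The block center is used as the point in the current frame (an assumption),
    and the motion vector is taken to point from there into the reference frame.
    """
    cx = block_x + block_w / 2.0
    cy = block_y + block_h / 2.0
    rx = cx + mv_dx / MV_UNITS_PER_PIXEL
    ry = cy + mv_dy / MV_UNITS_PER_PIXEL
    return (cx, cy), (rx, ry)


def cosine_similarity(u, v):
    """Cosine of the angle between two 2D vectors; 0.0 if either has zero length."""
    norm_u = math.hypot(u[0], u[1])
    norm_v = math.hypot(v[0], v[1])
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return (u[0] * v[0] + u[1] * v[1]) / (norm_u * norm_v)


def is_consistent_track(points, min_cosine=0.9):
    """Keep a track only if consecutive displacement steps point the same way.

    `points` is the track's position in successive frames. Zero-length steps
    count as inconsistent here, which is a simplification of this sketch.
    """
    steps = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]
    return all(
        cosine_similarity(a, b) >= min_cosine for a, b in zip(steps, steps[1:])
    )


if __name__ == "__main__":
    print(mv_to_correspondence(16, 32, 8, 8, 12, -4))
    print(is_consistent_track([(0, 0), (1, 0), (2, 0.1)]))
    print(is_consistent_track([(0, 0), (1, 0), (1, 1)]))
```

In this sketch, a track whose direction turns sharply between frames is discarded, which is one plausible reading of "filtered by cosine consistency". A real front end would also need to handle intra-coded blocks, which carry no motion vector, and references to non-adjacent frames.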