Regression-Based Image Alignment for General Object Categories
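This work extends efficient gradient-descent image alignment, which has so far been used mainly in the facial domain, to diverse object categories. The contribution is incremental in that it adapts an existing framework (Lucas Kanade) to new feature representations rather than proposing a new alignment paradigm.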
The paper tackles the problem of extending gradient-descent image alignment methods, such as Lucas Kanade, to general object categories. It incorporates non-linear feature transforms such as Dense SIFT by predicting descent directions via regression. This enables robust matching while retaining fast convergence and the capacity to handle high-dimensional warp parametrizations. Results are demonstrated on objects from ImageNet, together with an extension to unsupervised joint alignment.
Gradient-descent methods have exhibited fast and reliable performance for image alignment in the facial domain, but have largely been ignored by the broader vision community. They require the image function be smooth and (numerically) differentiable -- properties that hold for pixel-based representations obeying natural image statistics, but not for more general classes of non-linear feature transforms. We show that transforms such as Dense SIFT can be incorporated into a Lucas Kanade alignment framework by predicting descent directions via regression. This enables robust matching of instances from general object categories whilst maintaining desirable properties of Lucas Kanade such as the capacity to handle high-dimensional warp parametrizations and a fast rate of convergence. We present alignment results on a number of objects from ImageNet, and an extension of the method to unsupervised joint alignment of objects from a corpus of images.
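The idea of predicting descent directions via regression, rather than differentiating the feature function, can be illustrated with a toy sketch. The following is a minimal illustration under loud assumptions and not the authors' implementation: it uses a 1-D signal instead of an image, a single translation parameter instead of a high-dimensional warp, and a hypothetical half-wave rectified difference descriptor standing in for Dense SIFT. The names `sample`, `features`, `learn_regressor` and `align` are invented for this example. A ridge regressor is fitted on synthetic shifts of the template and then applied iteratively, in the style of a Lucas Kanade update loop.

```python
import numpy as np

def sample(signal, start, width):
    """Linearly interpolate `width` samples of `signal` from real-valued `start`."""
    xs = start + np.arange(width)
    return np.interp(xs, np.arange(signal.size), signal)

def features(patch):
    """Toy non-linear descriptor (an assumption, standing in for Dense SIFT):
    finite differences split into positive and negative half-wave channels."""
    d = np.diff(patch)
    return np.concatenate([np.maximum(d, 0.0), np.maximum(-d, 0.0)])

def learn_regressor(signal, start, width, max_shift=2.0, n=500, ridge=1e-3, seed=0):
    """Fit a linear map from feature residuals to the shift that undoes them."""
    rng = np.random.default_rng(seed)
    target = features(sample(signal, start, width))
    shifts = rng.uniform(-max_shift, max_shift, size=n)
    # Residuals observed when the window is displaced by each synthetic shift.
    D = np.stack([features(sample(signal, start + s, width)) - target
                  for s in shifts])
    # Ridge regression: find r such that D @ r is approximately -shifts.
    A = D.T @ D + ridge * np.eye(D.shape[1])
    r = np.linalg.solve(A, D.T @ (-shifts))
    return target, r

def align(signal, target, r, start, width, iters=20):
    """Iteratively update the translation using the learned descent direction."""
    p = float(start)
    for _ in range(iters):
        residual = features(sample(signal, p, width)) - target
        p += residual @ r  # predicted update; no feature gradient is computed
    return p

# Usage: the template sits at position 40; alignment starts 1.5 samples away.
x = np.arange(120)
signal = np.sin(0.15 * x) + 0.5 * np.sin(0.37 * x + 1.0) + 0.3 * np.cos(0.6 * x)
target, r = learn_regressor(signal, start=40, width=30)
estimate = align(signal, target, r, start=41.5, width=30)
```

In this sketch the regressor is trained and tested on the same signal, so it shows only the mechanics of the update. Matching different instances of an object category, as the abstract describes, would depend on the robustness of the feature transform, which this toy descriptor does not attempt to reproduce.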