LIFT: Learned Invariant Feature Transform
This work addresses the challenge of integrating the separate feature point processing steps used in computer vision (detection, orientation estimation, and description) into one unified pipeline that outperforms state-of-the-art methods on several benchmark datasets.
The authors tackled the problem of feature point handling by introducing a deep network that unifies detection, orientation estimation, and description into a single end-to-end differentiable pipeline, and demonstrated that it outperforms state-of-the-art methods on multiple benchmark datasets without retraining.
We introduce a novel deep network architecture that implements the full feature point handling pipeline, that is, detection, orientation estimation, and feature description. While previous works have successfully tackled each of these problems individually, we show how to learn to do all three in a unified manner while preserving end-to-end differentiability. We then demonstrate that our deep pipeline outperforms state-of-the-art methods on a number of benchmark datasets, without the need for retraining.
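
To illustrate what end-to-end differentiability asks of a detection step, the following is a minimal Python sketch of a soft-argmax over a score map. The abstract does not say how the pipeline stays differentiable, so the soft-argmax itself, the function name `soft_argmax_2d`, and the sharpness parameter `beta` are all assumptions made for illustration, not a description of the authors' implementation.

```python
import math


def soft_argmax_2d(score_map, beta=10.0):
    """Return the softmax-weighted mean (row, col) of a 2D score map.

    Hypothetical helper: a hard argmax has zero gradient almost
    everywhere, whereas this weighted mean is a smooth function of the
    scores, so gradients could flow back to whatever produced them.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in score_map)
    weights = [[math.exp(beta * (s - peak)) for s in row] for row in score_map]
    total = sum(sum(row) for row in weights)
    # Expected row and column index under the softmax weights.
    y = sum(r * w for r, row in enumerate(weights) for w in row) / total
    x = sum(c * w for row in weights for c, w in enumerate(row)) / total
    return y, x


# Toy score map with a single strong response at row 1, column 2.
scores = [
    [0.0, 0.1, 0.0],
    [0.1, 0.2, 0.9],
    [0.0, 0.1, 0.0],
]
y, x = soft_argmax_2d(scores, beta=50.0)
print(f"estimated keypoint location: row={y:.3f}, col={x:.3f}")
```

A larger `beta` pulls the estimate toward the hard argmax, while a smaller one averages over more of the map. In a unified pipeline of the kind described above, the resulting location would then feed the orientation and description stages, which is why a smooth location estimate matters for joint training.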