MaskFlow: Object-Aware Motion Estimation
This work addresses motion estimation for computer vision applications, particularly in complex scenes. The contribution is incremental: it builds on existing DNN-based methods and adds object-aware enhancements.
The authors tackle motion estimation in challenging scenarios involving small objects, large displacements, and appearance changes. Their method, MaskFlow, uses object-level features and segmentations to approximate a translation motion field for the objects, which a subsequent network then refines and completes. MaskFlow outperforms state-of-the-art methods on the authors' new synthetic dataset and achieves comparable results on FlyingThings3D.
We introduce a novel motion estimation method, MaskFlow, that is capable of estimating accurate motion fields, even in very challenging cases with small objects, large displacements and drastic appearance changes. In addition to the lower-level features that are used in other Deep Neural Network (DNN)-based motion estimation methods, MaskFlow draws on object-level features and segmentations. These features and segmentations are used to approximate the objects' translation motion field. We propose a novel and effective way of incorporating the incomplete translation motion field into a subsequent motion estimation network for refinement and completion. We also introduce a new, challenging synthetic dataset with motion field ground truth, together with additional ground truth for the object-instance matchings and the corresponding segmentation masks. We demonstrate that MaskFlow outperforms state-of-the-art methods when evaluated on our new dataset, whilst still producing comparable results on the popular FlyingThings3D benchmark dataset.
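To make the idea of an incomplete translation motion field concrete, the following is a minimal sketch and not the authors' implementation. It assumes that matched instance masks for the two frames are already available, and it approximates each object's translation by the displacement of its mask centroid. The centroid rule, the function name `translation_flow`, and the `valid` mask are all illustrative assumptions; the abstract does not state how MaskFlow derives the translation from object-level features.

```python
import numpy as np

def translation_flow(masks_a, masks_b, shape):
    """Build an incomplete translation flow field from matched instance masks.

    masks_a, masks_b: sequences of boolean arrays of shape (H, W), where
    masks_a[i] and masks_b[i] are assumed to show the same object in the
    first and second frame respectively.
    shape: (H, W) of the image.

    Returns (flow, valid). flow has shape (H, W, 2) and stores (dx, dy) per
    pixel; valid is a boolean (H, W) array marking pixels of the first frame
    that belong to a matched object. Pixels with valid == False carry no
    estimate, which is what makes the field incomplete.
    """
    h, w = shape
    flow = np.zeros((h, w, 2), dtype=np.float32)
    valid = np.zeros((h, w), dtype=bool)
    for m_a, m_b in zip(masks_a, masks_b):
        if not m_a.any() or not m_b.any():
            continue  # skip objects missing from either frame
        ys_a, xs_a = np.nonzero(m_a)
        ys_b, xs_b = np.nonzero(m_b)
        # Assumption: object translation ~ centroid displacement.
        dx = xs_b.mean() - xs_a.mean()
        dy = ys_b.mean() - ys_a.mean()
        flow[m_a] = (dx, dy)  # same translation for every pixel of the object
        valid |= m_a
    return flow, valid

# Toy example: a 2x2 object moves 3 pixels right and 1 pixel down.
a = np.zeros((6, 8), dtype=bool)
a[1:3, 1:3] = True
b = np.zeros((6, 8), dtype=bool)
b[2:4, 4:6] = True
flow, valid = translation_flow([a], [b], (6, 8))
```

In this toy example every pixel of the object receives the same vector and all other pixels remain unset. Filling in those pixels and correcting non-translational motion is the role the abstract assigns to the subsequent refinement network.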