Improving Weakly-Supervised Object Localization By Micro-Annotation
This work addresses a specific failure case in weakly-supervised object localization and is relevant to computer vision researchers, but the contribution is incremental: it builds on existing methods and adds only a small amount of extra annotation.
The paper tackles the problem that weakly-supervised object localization fails for object classes that consistently co-occur with the same background elements, such as trains on tracks. It proposes adding a small amount of model-specific annotation: a deep network's mid-level representations are clustered, and each cluster is labeled as object or distractor. Experiments show substantially improved localization results on the ILSVRC2014 dataset for bounding box detection and on the PASCAL VOC2012 dataset for semantic segmentation.
Weakly-supervised object localization methods tend to fail for object classes that consistently co-occur with the same background elements, e.g. trains on tracks. We propose a method to overcome these failures by adding a very small amount of model-specific additional annotation. The main idea is to cluster a deep network's mid-level representations and assign object or distractor labels to each cluster. Experiments show substantially improved localization results on the challenging ILSVRC2014 dataset for bounding box detection and the PASCAL VOC2012 dataset for semantic segmentation.
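The cluster-then-label idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of plain k-means with farthest-point initialization, the toy 2-D "mid-level features", and the choice to simply zero out scores at distractor positions are all assumptions made for illustration. The per-cluster labels in `is_object` stand in for the small amount of human annotation.

```python
import numpy as np


def nearest_cluster(features, centroids):
    """Index of the closest centroid for each feature vector (rows of `features`)."""
    diff = features[:, None, :] - centroids[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def kmeans(features, k, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [features[0]]
    for _ in range(1, k):
        # Squared distance of every point to its closest already-chosen centroid.
        dist = np.min([((features - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(features[dist.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        assign = nearest_cluster(features, centroids)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids


def suppress_distractors(score_map, feature_map, centroids, is_object):
    """Zero the localization score wherever the nearest cluster is a distractor."""
    h, w, c = feature_map.shape
    labels = nearest_cluster(feature_map.reshape(-1, c), centroids).reshape(h, w)
    return score_map * is_object[labels]


# Toy data (hypothetical): the top two rows carry "train"-like features,
# the bottom two rows carry "track"-like features.
rng = np.random.default_rng(0)
feature_map = np.zeros((4, 4, 2))
feature_map[:2] = [1.0, 0.0]
feature_map[2:] = [0.0, 1.0]
feature_map += 0.05 * rng.standard_normal(feature_map.shape)

# A weakly-supervised localizer that fires on train and tracks alike.
score_map = np.ones((4, 4))

centroids = kmeans(feature_map.reshape(-1, 2), k=2)

# Stand-in for the annotator's per-cluster decision: cluster 0 is seeded from
# the top-left (train) position, so it is marked as object, cluster 1 as distractor.
is_object = np.array([True, False])

cleaned = suppress_distractors(score_map, feature_map, centroids, is_object)
print(cleaned)
```

In this toy setting the scores in the top two rows are kept and those in the bottom two rows are set to zero. The point of the sketch is that the annotation cost scales with the number of clusters rather than with the number of images or pixels.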