A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
This work addresses the challenge of handling long-tail occlusions in object detection, offering a data-augmentation alternative to collecting ever-larger datasets.
The paper tackles the problem of learning object detectors that are invariant to rare occlusions and deformations. It proposes an adversarial network that generates hard positive examples and is trained jointly with the detector, yielding mAP gains of 2.3% on VOC07 and 2.6% on VOC2012 over the Fast-RCNN pipeline.
How do we learn an object detector that is invariant to occlusions and deformations? Our current solution is to use a data-driven strategy -- collect large-scale datasets which have object instances under different conditions. The hope is that the final classifier can use these examples to learn invariances. But is it really possible to see all the occlusions in a dataset? We argue that like categories, occlusions and object deformations also follow a long-tail. Some occlusions and deformations are so rare that they hardly happen; yet we want to learn a model invariant to such occurrences. In this paper, we propose an alternative solution. We propose to learn an adversarial network that generates examples with occlusions and deformations. The goal of the adversary is to generate examples that are difficult for the object detector to classify. In our framework both the original detector and adversary are learned in a joint manner. Our experimental results indicate a 2.3% mAP boost on VOC07 and a 2.6% mAP boost on VOC2012 object detection challenge compared to the Fast-RCNN pipeline. We also release the code for this paper.
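To make the idea of a "hard positive" concrete, here is a minimal sketch of the adversary's objective: find the occlusion of a positive example that most increases the detector's loss. Everything in it is an illustrative assumption rather than the paper's implementation. The 4x4 feature grid, the linear `toy_detector_loss`, and the function names are hypothetical, and the exhaustive window search stands in for the learned adversarial network described in the abstract.

```python
import math


def toy_detector_loss(features, weights):
    """Hypothetical stand-in for a detector's loss on one positive region.

    The score is a weighted sum of the features; the loss is the
    negative log-sigmoid of that score (low when the detector is confident).
    """
    score = sum(w * f
                for w_row, f_row in zip(weights, features)
                for w, f in zip(w_row, f_row))
    return math.log1p(math.exp(-score))


def occlude(features, top, left, size):
    """Return a copy of the feature grid with a size x size window zeroed out."""
    out = [row[:] for row in features]
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = 0.0
    return out


def hardest_occlusion(features, weights, size):
    """Search all window positions for the occlusion that maximizes the loss.

    Assumption: exhaustive search replaces the learned adversary here.
    """
    height, width = len(features), len(features[0])
    best = None
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            loss = toy_detector_loss(occlude(features, top, left, size), weights)
            if best is None or loss > best[0]:
                best = (loss, top, left)
    return best


# A positive example whose evidence is spread evenly over a 4x4 grid.
features = [[1.0] * 4 for _ in range(4)]

# A toy detector that relies mostly on the four central cells.
weights = [[0.1] * 4 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2):
        weights[r][c] = 1.0

clean_loss = toy_detector_loss(features, weights)
hard_loss, top, left = hardest_occlusion(features, weights, size=2)
print(f"clean loss: {clean_loss:.4f}")
print(f"hardest 2x2 occlusion at ({top}, {left}), loss: {hard_loss:.4f}")
```

In this toy setup the search picks the central window, because that is where the detector's weights are concentrated. Training the detector on such occluded positives is what would push it to rely on evidence from the rest of the object as well.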