Interactive Instance Annotation with Siamese Networks
This work addresses the labor-intensive task of instance annotation in cross-domain scenarios, using a Siamese network to predict object boundaries from a bounding box and reporting state-of-the-art results without fine-tuning.
The paper tackles the problem of time-consuming instance mask annotation by proposing SiamAnno, a framework that uses Siamese networks for one-shot learning to predict object boundaries from a bounding box, achieving state-of-the-art performance across multiple datasets without fine-tuning.
Annotating instance masks is time-consuming and labor-intensive. A promising solution is to predict contours using a deep learning model and then allow users to refine them. However, most existing methods focus on in-domain scenarios, limiting their effectiveness for cross-domain annotation tasks. In this paper, we propose SiamAnno, a framework inspired by the use of Siamese networks in object tracking. SiamAnno leverages one-shot learning to annotate previously unseen objects by taking a bounding box as input and predicting object boundaries, which can then be adjusted by annotators. Trained on one dataset and tested on another without fine-tuning, SiamAnno achieves state-of-the-art (SOTA) performance across multiple datasets, demonstrating its ability to handle domain and environment shifts in cross-domain tasks. We also provide more comprehensive results than previous work, establishing a strong baseline for future research. To our knowledge, SiamAnno is the first model to explore a Siamese architecture for instance annotation.
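To make the tracking analogy concrete, the following is a minimal sketch of the general Siamese matching idea as it is commonly used in object tracking: one shared embedding is applied to both a template (the box-cropped object) and every window of a search image, and the two are compared by correlation. It is not SiamAnno's architecture, which the abstract does not specify. The toy `embed` function stands in for a learned network, and all names here (`embed`, `crop`, `response_map`, `best_match`) are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]


def embed(patch: Grid) -> Grid:
    """Shared branch: zero-mean, unit-norm features.

    Assumption: a toy stand-in for a learned CNN embedding.
    """
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    # Fall back to 1.0 for constant patches to avoid division by zero.
    norm = sum((v - mean) ** 2 for v in values) ** 0.5 or 1.0
    return [[(v - mean) / norm for v in row] for row in patch]


def crop(image: Grid, box: Tuple[int, int, int, int]) -> Grid:
    """Cut out box = (top, left, height, width) from a 2D image."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def response_map(template: Grid, search: Grid) -> Grid:
    """Correlate the embedded template with every same-sized window
    of the search image, using the same embed() for both inputs."""
    t = embed(template)
    th, tw = len(t), len(t[0])
    sh, sw = len(search), len(search[0])
    scores = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            w = embed(crop(search, (i, j, th, tw)))
            row.append(sum(t[a][b] * w[a][b]
                           for a in range(th) for b in range(tw)))
        scores.append(row)
    return scores


def best_match(scores: Grid) -> Tuple[int, int]:
    """Return the (row, col) of the highest correlation score."""
    best_pos, best_val = (0, 0), scores[0][0]
    for i, row in enumerate(scores):
        for j, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (i, j), value
    return best_pos


# One-shot usage: a single template locates the object in a new image.
template = [[1.0, 2.0], [3.0, 4.0]]
search = [[0.0] * 6 for _ in range(5)]
search[2][3:5] = [1.0, 2.0]
search[3][3:5] = [3.0, 4.0]
print(best_match(response_map(template, search)))
```

Because both inputs pass through the same embedding, no weights are specific to the object class, which is what lets a single example generalize to unseen objects in this family of methods. Predicting a contour rather than a location would require an additional head that this sketch does not attempt.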