ZeroSCD: Zero-Shot Street Scene Change Detection
ZeroSCD addresses the cost of data annotation for street scene analysis in computer vision and robotics, though the contribution is incremental because it builds on existing models.
The paper tackles scene change detection without requiring training data. It proposes ZeroSCD, a zero-shot framework that uses pre-existing place recognition and semantic segmentation models to identify changes, and it reports higher accuracy on benchmark datasets than several state-of-the-art methods.
Scene Change Detection is a challenging task in computer vision and robotics that aims to identify differences between two images of the same scene captured at different times. Traditional change detection methods rely on training models that take these image pairs as input and estimate the changes, which requires large amounts of annotated data, a costly and time-consuming process. To overcome this, we propose ZeroSCD, a zero-shot scene change detection framework that eliminates the need for training. ZeroSCD leverages pre-existing models for place recognition and semantic segmentation, utilizing their features and outputs to perform change detection. In this framework, features extracted from the place recognition model are used to estimate correspondences and detect changes between the two images. These are then combined with segmentation results from the semantic segmentation model to precisely delineate the boundaries of the detected changes. Extensive experiments on benchmark datasets demonstrate that ZeroSCD outperforms several state-of-the-art methods in change detection accuracy, despite not being trained on any of the benchmark datasets, proving its effectiveness and adaptability across different scenarios.
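The two-stage idea in the abstract (feature correspondences flag changed regions, then segmentation masks sharpen their boundaries) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that dense features of shape (H, W, C) have already been extracted from some place recognition model, and that the segmentation model yields boolean masks at the same resolution. The function names `coarse_change_map` and `refine_with_segments`, the cosine-similarity matching, and both thresholds are hypothetical choices made only for illustration.

```python
import numpy as np


def coarse_change_map(feat_a, feat_b, threshold=0.5):
    """Flag locations in image A that have no similar feature in image B.

    feat_a, feat_b: float arrays of shape (H, W, C), assumed to come from a
    pre-trained place recognition model (hypothetical input format).
    Returns a boolean (H, W) array where True marks a likely change.
    """
    h, w, c = feat_a.shape
    a = feat_a.reshape(-1, c)
    b = feat_b.reshape(-1, c)
    # L2-normalise so the dot product equals cosine similarity.
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    # For each location in A, take its best correspondence anywhere in B.
    best_match = (a @ b.T).max(axis=1)
    # A weak best match suggests the content is absent from B.
    return (best_match < threshold).reshape(h, w)


def refine_with_segments(coarse, segment_masks, min_overlap=0.5):
    """Keep whole segments that are mostly covered by the coarse change map.

    coarse: boolean (H, W) array from coarse_change_map.
    segment_masks: iterable of boolean (H, W) arrays, assumed to come from a
    semantic segmentation model (hypothetical input format).
    """
    refined = np.zeros(coarse.shape, dtype=bool)
    for mask in segment_masks:
        area = mask.sum()
        if area == 0:
            continue
        overlap = (coarse & mask).sum() / area
        if overlap >= min_overlap:
            # Adopt the segment boundary instead of the coarse patch boundary.
            refined |= mask
    return refined
```

Matching every location against all locations in the other image keeps the sketch tolerant of small viewpoint shifts, at the cost of an (HW x HW) similarity matrix. The overlap test lets the segmentation mask, rather than the coarse feature grid, define the final boundary of each detected change.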