Learning to Detect Instantaneous Changes with Retrospective Convolution and Static Sample Synthesis
This paper addresses a specific challenge in computer vision for applications requiring quick change detection, but the contribution is incremental because it builds on existing spatio-temporal convolutional networks.
The paper tackles the problem of detecting instantaneous changes in video with only a few preceding frames, proposing a retrospective convolution and static sample synthesis method that significantly outperforms existing methods in accuracy and robustness.
Change detection has been a challenging visual task due to the dynamic nature of real-world scenes. Existing methods perform well largely because they rely on prior background images or long-term observation. These methods, however, degrade severely when they are applied to changes that occur instantaneously, with only a few preceding frames available. In this paper, we exploit spatio-temporal convolutional networks to address this challenge and propose a novel retrospective convolution, which efficiently extracts change information between the current frame and frames from historical observation. To address foreground-specific over-fitting in learning-based methods, we further propose a data augmentation method, named static sample synthesis, which guides the network to learn change-cued information rather than specific spatial features of the foreground. Trained end-to-end on complex scenarios, our framework proves accurate in detecting instantaneous changes and robust against diverse noise. Extensive experiments demonstrate that our proposed method significantly outperforms existing methods.
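As an illustration of the static sample synthesis idea, the following is a minimal sketch, not the authors' implementation. It assumes that a static sample is a clip built by repeating a single frame, so that foreground objects are present but nothing changes, and that the target for such a clip is an all-zero change mask. The function name `synthesize_static_sample`, its parameters, and the optional Gaussian noise are hypothetical.

```python
import numpy as np


def synthesize_static_sample(frame, num_frames=4, noise_std=0.0, rng=None):
    """Build a static clip from one frame, paired with an all-zero change mask.

    Assumption (not from the paper): a "static sample" is a clip in which
    every frame is the same image, so any foreground object is visible but
    no change occurs between the historical frames and the current frame.
    """
    if rng is None:
        rng = np.random.default_rng()

    # Repeat the frame along a new leading time axis: shape (T, H, W, C).
    clip = np.repeat(frame[np.newaxis, ...].astype(np.float32), num_frames, axis=0)

    # Optional per-frame noise (hypothetical) so the frames are not bit-identical.
    if noise_std > 0:
        clip += rng.normal(0.0, noise_std, size=clip.shape).astype(np.float32)

    # Nothing changes across the clip, so the target mask is all zeros.
    mask = np.zeros(frame.shape[:2], dtype=np.uint8)
    return clip, mask


# Usage: a tiny 2x3 RGB "image" repeated over 5 frames.
frame = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)
clip, mask = synthesize_static_sample(frame, num_frames=5)
```

Under these assumptions, mixing such clips into training would penalize a network that responds to foreground appearance alone, which is the kind of over-fitting the abstract describes.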