Siamese Infrared and Visible Light Fusion Network for RGB-T Tracking
The work addresses robust object tracking under varying light conditions for computer vision applications, but it is incremental: it builds on existing siamese and fusion methods.
The paper tackles tracking with RGB-T (infrared and visible light) image pairs by proposing SiamIVFN, a siamese network with complementary-feature-fusion and contribution-aggregation subnetworks, and reports state-of-the-art performance at a tracking speed of 147.6 FPS.
Because infrared and visible light sensors have different photosensitive properties, registered RGB-T image pairs captured in the same scene exhibit quite different characteristics. This paper proposes a siamese infrared and visible light fusion network (SiamIVFN) for RGB-T image-based tracking. SiamIVFN contains two main subnetworks: a complementary-feature-fusion network (CFFN) and a contribution-aggregation network (CAN). CFFN uses a two-stream multilayer convolutional structure in which the filters of each layer are partially coupled, fusing the features extracted from infrared images and visible light images. Because CFFN fuses at the feature level, it can cope with misalignment of the RGB-T image pairs. CAN adaptively calculates the contributions of the infrared and visible light features obtained from CFFN, which makes the tracker robust under various light conditions. Experiments on two RGB-T tracking benchmark datasets demonstrate that the proposed SiamIVFN achieves state-of-the-art performance. SiamIVFN tracks at 147.6 FPS, making it currently the fastest RGB-T fusion tracker.
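
The two ideas in the abstract, partially coupled filters and adaptive contribution weighting, can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not say how filters are coupled or how contributions are computed. The sketch assumes that "partially coupled" means some filters are shared between the two streams while the rest are stream-specific, and it substitutes a softmax over mean feature energy for the contribution estimate. All names (coupled_filters, conv1x1, contribution_weights, fuse) and all sizes are hypothetical.

```python
import numpy as np

def coupled_filters(shared, private_ir, private_vis):
    """Build one filter bank per stream: shared filters first, then private ones.

    Assumption: 'partially coupled' is modelled as weight sharing on a subset
    of the output channels. The paper may define coupling differently.
    """
    w_ir = np.concatenate([shared, private_ir], axis=0)
    w_vis = np.concatenate([shared, private_vis], axis=0)
    return w_ir, w_vis

def conv1x1(x, w):
    """Apply a 1x1 convolution followed by ReLU.

    x has shape (C_in, H, W), w has shape (C_out, C_in).
    """
    return np.maximum(np.einsum("oc,chw->ohw", w, x), 0.0)

def contribution_weights(f_ir, f_vis):
    """Hypothetical contribution estimate: softmax over mean feature energy."""
    scores = np.array([np.mean(f_ir ** 2), np.mean(f_vis ** 2)])
    e = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return e / e.sum()

def fuse(f_ir, f_vis):
    """Weighted sum of the two feature maps using the estimated contributions."""
    a_ir, a_vis = contribution_weights(f_ir, f_vis)
    return a_ir * f_ir + a_vis * f_vis, (a_ir, a_vis)

# Toy inputs: an informative infrared frame and an all-black visible frame,
# standing in for a scene with no usable visible light.
rng = np.random.default_rng(0)
x_ir = rng.standard_normal((3, 8, 8))
x_vis = np.zeros((3, 8, 8))

# Two shared filters plus two private filters per stream (sizes are arbitrary).
shared = rng.standard_normal((2, 3))
private_ir = rng.standard_normal((2, 3))
private_vis = rng.standard_normal((2, 3))

w_ir, w_vis = coupled_filters(shared, private_ir, private_vis)
f_ir = conv1x1(x_ir, w_ir)
f_vis = conv1x1(x_vis, w_vis)
fused, (a_ir, a_vis) = fuse(f_ir, f_vis)
```

In this toy case the visible stream carries no signal, so the infrared weight exceeds the visible weight and the fused map is dominated by the infrared features. A real CAN would presumably learn its contribution estimate rather than use a fixed energy heuristic.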