CV · AI · Jan 25, 2024

Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition

arXiv:2401.14336v1 · 5 citations · Has Code · IEEE Transactions on Intelligent Transportation Systems (Print)

Originality: Incremental advance

AI Analysis

This work addresses image noise as a source of intra-class variation in FGVR for intelligent transportation systems. It is an incremental advance that integrates image denoising as an auxiliary task during training.

The paper tackles fine-grained vehicle recognition (FGVR) by addressing intra-class variation caused by image noise, which previous studies have largely overlooked. It proposes a progressive multi-task anti-noise learning (PMAL) framework and a progressive multi-task distilling (PMD) framework. PMAL trains the model on denoising alongside recognition, and PMD transfers the resulting knowledge back into the original backbone network. The authors report state-of-the-art recognition accuracy on Stanford Cars, CompCars, and three surveillance-image datasets, with no additional overhead over the original backbone network.

Fine-grained vehicle recognition (FGVR) is an essential technology for intelligent transportation systems, but it is very difficult because of its inherent intra-class variation. Most previous FGVR studies focus only on the intra-class variation caused by different shooting angles, positions, etc., while the intra-class variation caused by image noise has received little attention. This paper proposes a progressive multi-task anti-noise learning (PMAL) framework and a progressive multi-task distilling (PMD) framework to solve the intra-class variation problem in FGVR due to image noise. The PMAL framework achieves high recognition accuracy by treating image denoising as an additional task in image recognition and progressively forcing a model to learn noise invariance. The PMD framework transfers the knowledge of the PMAL-trained model into the original backbone network, which produces a model with about the same recognition accuracy as the PMAL-trained model, but without any additional overheads over the original backbone network. Combining the two frameworks, we obtain models that significantly exceed previous state-of-the-art methods in recognition accuracy on two widely used, standard FGVR datasets, namely Stanford Cars and CompCars, as well as three additional surveillance image-based vehicle-type classification datasets, namely Beijing Institute of Technology (BIT)-Vehicle, Vehicle Type Image Data 2 (VTID2), and Vehicle Images Dataset for Make Model Recognition (VIDMMR), without any additional overheads over the original backbone networks. The source code is available at https://github.com/Dichao-Liu/Anti-noise_FGVR
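The two ideas in the abstract, denoising as an auxiliary task and distilling a trained model into the plain backbone, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `denoise_weight` and `temperature` parameters, the mean-squared-error denoising term, and the temperature-softened KL distillation term are all generic assumptions chosen for illustration, and the progressive (stage-wise) schedule described in the paper is not modeled here. The linked repository has the actual method.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Recognition loss: negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])

def mse(pred, target):
    """Denoising loss: mean squared error between two flat pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def multitask_loss(logits, label, denoised, clean, denoise_weight=0.5):
    """Hypothetical joint objective: recognition plus weighted denoising.

    Assumption: a simple weighted sum. The paper's actual weighting and
    progressive training scheme are not specified in the abstract.
    """
    return cross_entropy(logits, label) + denoise_weight * mse(denoised, clean)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic KL(teacher || student) on temperature-softened outputs.

    Assumption: standard logit distillation, used here as a stand-in
    for the PMD framework, whose details the abstract does not give.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In this sketch the denoising branch only contributes to the training loss, so it could be dropped at inference time. The distillation term then pushes a plain backbone (the student) toward the outputs of the multi-task model (the teacher), which is one way to read the abstract's claim of no additional overhead over the original backbone.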
