CV · AI · May 23, 2023

Deep Transductive Transfer Learning for Automatic Target Recognition

arXiv:2305.13886v1 · 5 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of building robust ATR classifiers in new domains that have no annotations. The advance is incremental: it combines existing methods, namely CycleGAN and transfer learning, for a specific application.

The paper tackles automatic target recognition (ATR) when labeled data is available only in a source domain (e.g., infrared) and not in a target domain (e.g., visible). It proposes an unpaired transductive transfer learning framework that uses CycleGAN and a pretrained source-domain classifier to adapt to the target domain without labeled target data, achieving 71.56% accuracy on the DSIAC ATR dataset.

One of the major obstacles in designing an automatic target recognition (ATR) algorithm is that there are often labeled images in one domain (e.g., the infrared source domain) but no annotated images in the other target domains (e.g., visible, SAR, LIDAR). Therefore, automatically annotating these images is essential to build a robust classifier in the target domain based on the labeled images of the source domain. Transductive transfer learning is an effective way to adapt a network to a new target domain by utilizing a pretrained ATR network in the source domain. We propose an unpaired transductive transfer learning framework where a CycleGAN model and a well-trained ATR classifier in the source domain are used to construct an ATR classifier in the target domain without having any labeled data in the target domain. We employ a CycleGAN model to translate mid-wave infrared (MWIR) images into visible (VIS) domain images (or VIS images into the MWIR domain). To train the transductive CycleGAN, we optimize a cost function consisting of the adversarial, identity, cycle-consistency, and categorical cross-entropy losses for both the source and target classifiers. In this paper, we perform a detailed experimental analysis on the challenging DSIAC ATR dataset. The dataset consists of ten classes of vehicles at different poses and at distances ranging from 1 to 5 kilometers in both the MWIR and VIS domains. In our experiment, we assume that the images in the VIS domain are the unlabeled target dataset. We first detect and crop the vehicles from the raw images and then project them to a common distance of 2 kilometers. Our proposed transductive CycleGAN achieves 71.56% accuracy in classifying the visible domain vehicles in the DSIAC ATR dataset.
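
The cost function named in the abstract can be illustrated with a minimal sketch for one translation direction (source to target). The abstract lists the four loss terms but not their exact forms or weights, so everything specific here is an assumption: the least-squares adversarial term, the L1 cycle and identity terms, the weights `lambda_cyc`, `lambda_idt`, and `lambda_cls`, and the idea that a translated source image keeps its source label for the classification term. All function and parameter names are hypothetical, and images are represented as flat lists of floats to keep the example dependency-free.

```python
import math


def l1_distance(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cross_entropy(class_probs, label):
    """Categorical cross-entropy for one sample, given class probabilities."""
    return -math.log(max(class_probs[label], 1e-12))


def adversarial_loss(disc_score_on_fake):
    """Assumed least-squares generator loss: push the score toward 1."""
    return (disc_score_on_fake - 1.0) ** 2


def generator_loss(x_src, x_cycled, x_identity, disc_score_on_fake,
                   class_probs_on_fake, src_label,
                   lambda_cyc=10.0, lambda_idt=5.0, lambda_cls=1.0):
    """Weighted sum of the four terms for the source-to-target direction.

    x_src               : source-domain image (e.g., an MWIR crop)
    x_cycled            : x_src translated to the target domain and back
    x_identity          : source-bound generator applied to x_src itself
    disc_score_on_fake  : target discriminator score on the translated image
    class_probs_on_fake : classifier probabilities on the translated image
    src_label           : class label of x_src (assumed to carry over)
    The lambda weights are illustrative, not values from the paper.
    """
    adv = adversarial_loss(disc_score_on_fake)
    cyc = l1_distance(x_src, x_cycled)
    idt = l1_distance(x_src, x_identity)
    cls = cross_entropy(class_probs_on_fake, src_label)
    return adv + lambda_cyc * cyc + lambda_idt * idt + lambda_cls * cls


# Toy values: only the cycle-consistency term is non-zero here.
loss = generator_loss(
    x_src=[0.0, 1.0],
    x_cycled=[0.1, 0.9],
    x_identity=[0.0, 1.0],
    disc_score_on_fake=1.0,
    class_probs_on_fake=[1.0, 0.0],
    src_label=0,
)
print(round(loss, 6))
```

A full objective would presumably add the symmetric target-to-source terms and the discriminator losses; in practice the terms would be computed on tensors in a deep learning framework rather than on Python lists.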

