CV · AI · LG · IV · Mar 29, 2022

Treatment Learning Causal Transformer for Noisy Image Classification

NVIDIA
arXiv:2203.15529v2 · 10 citations · h-index: 22
AI Analysis

This paper addresses noisy data in image classification for computer vision applications. It offers an incremental improvement through a hybrid method that combines causal treatment-effect estimation with a transformer-based architecture.

The paper tackles the degraded performance of deep learning vision models on noisy images. It treats the binary existence of noise as a treatment factor and estimates its effect jointly with the classification task to improve prediction accuracy. The reported gains are validated by refutation metrics, and the method also improves visual salience methods.

Current top-notch deep learning (DL) based vision models are primarily based on exploring and exploiting the inherent correlations between training data samples and their associated labels. However, a known practical challenge is their degraded performance against "noisy" data, induced by different circumstances such as spurious correlations, irrelevant contexts, domain shift, and adversarial attacks. In this work, we incorporate this binary information of "existence of noise" as a treatment into image classification tasks to improve prediction accuracy by jointly estimating the treatment effects. Motivated by causal variational inference, we propose a transformer-based architecture, Treatment Learning Causal Transformer (TLT), that uses a latent generative model to estimate robust feature representations from the current observational input for noisy image classification. Depending on the estimated noise level (modeled as a binary treatment factor), TLT assigns the corresponding inference network, trained with the designed causal loss, for prediction. We also create new noisy image datasets incorporating a wide range of noise factors (e.g., object masking, style transfer, and adversarial perturbation) for performance benchmarking. The superior performance of TLT in noisy image classification is further validated by several refutation evaluation metrics. As a by-product, TLT also improves visual salience methods for perceiving noisy images.
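The routing step described in the abstract (estimate a binary noise treatment, then hand the input to the matching inference network) can be illustrated with a toy sketch. This is not the authors' implementation: the function name `treatment_routed_predict`, the linear treatment scorer, the two toy heads, and the 0.5 threshold are all hypothetical choices made for illustration, and the latent vector `z` stands in for whatever representation the latent generative model would produce. The "effect" returned here is simply the difference between the two heads' class probabilities, a common way to express an individual treatment effect in two-headed causal models, assumed here rather than taken from the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def sigmoid(x: float) -> float:
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits: Vector) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def treatment_routed_predict(
    z: Vector,
    treat_w: Vector,
    treat_b: float,
    head_clean: Callable[[Vector], List[float]],
    head_noisy: Callable[[Vector], List[float]],
    threshold: float = 0.5,  # assumption: hard threshold on the noise probability
) -> Tuple[int, List[float], List[float]]:
    """Pick a prediction head based on an estimated binary treatment.

    z          : latent feature vector (hypothetical stand-in for the encoder output)
    treat_w/b  : parameters of a hypothetical linear treatment (noise) scorer
    head_clean : returns class logits for the t=0 (no noise) branch
    head_noisy : returns class logits for the t=1 (noise present) branch
    """
    # Estimate the probability that the input is "treated" (noisy).
    p_noise = sigmoid(sum(w * x for w, x in zip(treat_w, z)) + treat_b)
    t = 1 if p_noise >= threshold else 0

    # Evaluate both branches so a per-class effect estimate is available.
    probs_clean = softmax(head_clean(z))
    probs_noisy = softmax(head_noisy(z))

    # Route the final prediction through the branch matching the treatment.
    chosen = probs_noisy if t == 1 else probs_clean

    # Per-class difference between treated and untreated predictions.
    effect = [a - b for a, b in zip(probs_noisy, probs_clean)]
    return t, chosen, effect


# Toy two-class heads (hypothetical weights, for demonstration only).
def toy_head_clean(z: Vector) -> List[float]:
    return [z[0] + z[1], -z[0]]


def toy_head_noisy(z: Vector) -> List[float]:
    return [0.5 * z[0], z[1]]


t_hat, probs, effect = treatment_routed_predict(
    z=[2.0, -1.0],
    treat_w=[1.0, 0.0],
    treat_b=0.0,
    head_clean=toy_head_clean,
    head_noisy=toy_head_noisy,
)
print(t_hat, probs, effect)
```

In a trained model the scorer and both heads would be learned jointly; the sketch only shows how a binary treatment estimate can select between two inference branches at prediction time.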

