Disen Hu

11.1 | CV | May 5 | Code

Multimodal Learning on Low-Quality Data with Conformal Predictive Self-Calibration

Xun Jiang, Yufan Gu, Disen Hu et al.

Multimodal learning often has to contend with low-quality data, which mainly takes two forms: modality imbalance and noisy corruption. While these issues are usually studied in isolation, we argue that they share a common root: predictive uncertainty about the reliability of individual modalities and instances during learning. In this paper, we propose a unified framework, termed Conformal Predictive Self-Calibration (CPSC), which leverages conformal prediction to let the model calibrate itself on the fly. The core of CPSC is a novel self-calibrating training loop that integrates two key modules: (1) Representation Self-Calibration, which decomposes unimodal features into components and selectively fuses the most robust ones, as identified by a conformal predictor, to enhance feature resilience; and (2) Gradient Self-Calibration, which recalibrates the gradient flow during backpropagation based on instance-wise reliability scores, steering the optimization towards more trustworthy directions. We also devise a self-update strategy for the conformal predictor so that the entire system co-evolves consistently throughout training. Extensive experiments on six benchmark datasets under both imbalanced and noisy settings demonstrate that CPSC consistently outperforms existing state-of-the-art methods. Our code is available at https://github.com/XunCHN/CPSC.
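For readers unfamiliar with conformal prediction, the sketch below shows generic split conformal prediction for classification. It is not the CPSC implementation. The function names, the nonconformity score (one minus the probability of the true class), and the toy calibration data are all assumptions made for illustration. The idea is that a smaller prediction set signals a more reliable prediction, which is the kind of signal a reliability score could be built from.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns infinity when the calibration set is too small for the
    requested coverage level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

def prediction_set(probs, qhat):
    """Keep every class whose nonconformity score 1 - p is at most qhat."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= qhat]

# Hypothetical calibration data: predicted class probabilities and true labels.
cal_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]]
cal_labels = [0, 0, 1, 1]

# Nonconformity score: one minus the probability given to the true class.
cal_scores = [1.0 - p[y] for p, y in zip(cal_probs, cal_labels)]
qhat = conformal_quantile(cal_scores, alpha=0.25)

confident = prediction_set([0.95, 0.05], qhat)  # small set: more reliable
uncertain = prediction_set([0.5, 0.5], qhat)    # larger set: less reliable
```

In this toy setup a confident prediction yields a single-class set while an ambiguous one keeps both classes, so set size can serve as a rough per-instance reliability proxy.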

5.7 | CV | Oct 25, 2022 | Code

Object recognition in atmospheric turbulence scenes

Disen Hu, Nantheera Anantrasirichai

The influence of atmospheric turbulence on acquired surveillance imagery poses significant challenges in image interpretation and scene analysis. Conventional approaches for target classification and tracking are less effective under such conditions. While deep-learning-based object detection methods have shown great success in normal conditions, they cannot be directly applied to atmospheric turbulence sequences. In this paper, we propose a novel framework that learns distorted features to detect and classify object types in turbulent environments. Specifically, we utilise deformable convolutions to handle spatial turbulent displacement. Features are extracted using a feature pyramid network, and Faster R-CNN is employed as the object detector. Experimental results on a synthetic VOC dataset demonstrate that the proposed framework outperforms the benchmark with a mean Average Precision (mAP) score exceeding 30%. Additionally, subjective results on real data show significant improvement in performance.
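As a rough illustration of why deformable convolutions suit spatial displacement, the sketch below computes one output value of a 3x3 deformable convolution in plain Python. It is a generic sketch, not the paper's code. The helper names are hypothetical, and the offsets are passed in by hand, whereas in a real deformable convolution layer they are predicted by a learned branch. Each kernel tap samples the input at its regular grid position plus a fractional offset, using bilinear interpolation.

```python
import math

def bilinear_sample(img, y, x):
    """Sample a 2D list at fractional (y, x); out-of-range pixels count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Bilinear weight falls off linearly with distance to the pixel.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * img[yy][xx]
    return total

def deform_conv_at(img, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets holds nine (dy, dx) pairs, one per kernel tap in row-major order.
    """
    out = 0.0
    idx = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            oy, ox = offsets[idx]
            out += kernel[dy + 1][dx + 1] * bilinear_sample(
                img, cy + dy + oy, cx + dx + ox
            )
            idx += 1
    return out

img = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
centre_only = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

no_shift = [(0.0, 0.0)] * 9
plain = deform_conv_at(img, centre_only, no_shift, 1, 1)

# Shift only the centre tap half a pixel to the right.
shifted_offsets = [(0.0, 0.0)] * 4 + [(0.0, 0.5)] + [(0.0, 0.0)] * 4
shifted = deform_conv_at(img, centre_only, shifted_offsets, 1, 1)
```

With all offsets at zero this reduces to an ordinary convolution; non-zero offsets let each tap follow locally displaced image content.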

2 Papers