Tao Bai

CV
h-index13
17papers
1,141citations
Novelty50%
AI Score32

17 Papers

CVDec 19, 2022
AI Security for Geoscience and Remote Sensing: Challenges and Future Trends

Yonghao Xu, Tao Bai, Weikang Yu et al.

Recent advances in artificial intelligence (AI) have significantly intensified research in the geoscience and remote sensing (RS) field. AI algorithms, especially deep learning-based ones, have been developed and applied widely to RS data analysis. The successful application of AI covers almost all aspects of Earth observation (EO) missions, from low-level vision tasks like super-resolution, denoising and inpainting, to high-level vision tasks like scene classification, object detection and semantic segmentation. While AI techniques enable researchers to observe and understand the Earth more accurately, the vulnerability and uncertainty of AI models deserve further attention, considering that many geoscience and RS tasks are highly safety-critical. This paper reviews the current development of AI security in the geoscience and RS field, covering the following five important aspects: adversarial attack, backdoor attack, federated learning, uncertainty and explainability. Moreover, the potential opportunities and trends are discussed to provide insights for future research. To the best of the authors' knowledge, this paper is the first attempt to provide a systematic review of AI security-related research in the geoscience and RS community. Available code and datasets are also listed in the paper to move this vibrant field of research forward.

LGMar 31, 2023
Towards Adversarially Robust Continual Learning

Tao Bai, Chen Chen, Lingjuan Lyu et al.

Recent studies show that models trained by continual learning can achieve the comparable performances as the standard supervised learning and the learning flexibility of continual learning models enables their wide applications in the real world. Deep learning models, however, are shown to be vulnerable to adversarial attacks. Though there are many studies on the model robustness in the context of standard supervised learning, protecting continual learning from adversarial attacks has not yet been investigated. To fill in this research gap, we are the first to study adversarial robustness in continual learning and propose a novel method called \textbf{T}ask-\textbf{A}ware \textbf{B}oundary \textbf{A}ugmentation (TABA) to boost the robustness of continual learning models. With extensive experiments on CIFAR-10 and CIFAR-100, we show the efficacy of adversarial training and TABA in defending adversarial attacks.

CVJul 3, 2022
Memory-Based Label-Text Tuning for Few-Shot Class-Incremental Learning

Jinze Li, Yan Bai, Yihang Lou et al.

Few-shot class-incremental learning(FSCIL) focuses on designing learning algorithms that can continually learn a sequence of new tasks from a few samples without forgetting old ones. The difficulties are that training on a sequence of limited data from new tasks leads to severe overfitting issues and causes the well-known catastrophic forgetting problem. Existing researches mainly utilize the image information, such as storing the image knowledge of previous tasks or limiting classifiers updating. However, they ignore analyzing the informative and less noisy text information of class labels. In this work, we propose leveraging the label-text information by adopting the memory prompt. The memory prompt can learn new data sequentially, and meanwhile store the previous knowledge. Furthermore, to optimize the memory prompt without undermining the stored knowledge, we propose a stimulation-based training strategy. It optimizes the memory prompt depending on the image embedding stimulation, which is the distribution of the image embedding elements. Experiments show that our proposed method outperforms all prior state-of-the-art approaches, significantly mitigating the catastrophic forgetting and overfitting problems.

CVJul 19, 2022
Bayesian Evidential Learning for Few-Shot Classification

Xiongkun Linghu, Yan Bai, Yihang Lou et al.

Few-Shot Classification(FSC) aims to generalize from base classes to novel classes given very limited labeled samples, which is an important step on the path toward human-like machine learning. State-of-the-art solutions involve learning to find a good metric and representation space to compute the distance between samples. Despite the promising accuracy performance, how to model uncertainty for metric-based FSC methods effectively is still a challenge. To model uncertainty, We place a distribution over class probability based on the theory of evidence. As a result, uncertainty modeling and metric learning can be decoupled. To reduce the uncertainty of classification, we propose a Bayesian evidence fusion theorem. Given observed samples, the network learns to get posterior distribution parameters given the prior parameters produced by the pre-trained network. Detailed gradient analysis shows that our method provides a smooth optimization target and can capture the uncertainty. The proposed method is agnostic to metric learning strategies and can be implemented as a plug-and-play module. We integrate our method into several newest FSC methods and demonstrate the improved accuracy and uncertainty quantification on standard FSC benchmarks.

CVJan 20, 2025
On the Adversarial Vulnerabilities of Transfer Learning in Remote Sensing

Tao Bai, Xingjian Tian, Yonghao Xu et al.

The use of pretrained models from general computer vision tasks is widespread in remote sensing, significantly reducing training costs and improving performance. However, this practice also introduces vulnerabilities to downstream tasks, where publicly available pretrained models can be used as a proxy to compromise downstream models. This paper presents a novel Adversarial Neuron Manipulation method, which generates transferable perturbations by selectively manipulating single or multiple neurons in pretrained models. Unlike existing attacks, this method eliminates the need for domain-specific information, making it more broadly applicable and efficient. By targeting multiple fragile neurons, the perturbations achieve superior attack performance, revealing critical vulnerabilities in deep learning models. Experiments on diverse models and remote sensing datasets validate the effectiveness of the proposed method. This low-access adversarial neuron manipulation technique highlights a significant security risk in transfer learning models, emphasizing the urgent need for more robust defenses in their design when addressing the safety-critical remote sensing tasks.

LGNov 19, 2021
Towards Efficiently Evaluating the Robustness of Deep Neural Networks in IoT Systems: A GAN-based Method

Tao Bai, Jun Zhao, Jinlin Zhu et al.

Intelligent Internet of Things (IoT) systems based on deep neural networks (DNNs) have been widely deployed in the real world. However, DNNs are found to be vulnerable to adversarial examples, which raises people's concerns about intelligent IoT systems' reliability and security. Testing and evaluating the robustness of IoT systems becomes necessary and essential. Recently various attacks and strategies have been proposed, but the efficiency problem remains unsolved properly. Existing methods are either computationally extensive or time-consuming, which is not applicable in practice. In this paper, we propose a novel framework called Attack-Inspired GAN (AI-GAN) to generate adversarial examples conditionally. Once trained, it can generate adversarial perturbations efficiently given input images and target classes. We apply AI-GAN on different datasets in white-box settings, black-box settings and targeted models protected by state-of-the-art defenses. Through extensive experiments, AI-GAN achieves high attack success rates, outperforming existing methods, and reduces generation time significantly. Moreover, for the first time, AI-GAN successfully scales to complex datasets e.g. CIFAR-100 and ImageNet, with about $90\%$ success rates among all classes.

CVOct 15, 2021
Adversarial Purification through Representation Disentanglement

Tao Bai, Jun Zhao, Lanqing Guo et al.

Deep learning models are vulnerable to adversarial examples and make incomprehensible mistakes, which puts a threat on their real-world deployment. Combined with the idea of adversarial training, preprocessing-based defenses are popular and convenient to use because of their task independence and good generalizability. Current defense methods, especially purification, tend to remove ``noise" by learning and recovering the natural images. However, different from random noise, the adversarial patterns are much easier to be overfitted during model training due to their strong correlation to the images. In this work, we propose a novel adversarial purification scheme by presenting disentanglement of natural images and adversarial perturbations as a preprocessing defense. With extensive experiments, our defense is shown to be generalizable and make significant protection against unseen strong adversarial attacks. It reduces the success rates of state-of-the-art \textbf{ensemble} attacks from \textbf{61.7\%} to \textbf{14.9\%} on average, superior to a number of existing methods. Notably, our defense restores the perturbed images perfectly and does not hurt the clean accuracy of backbone models, which is highly desirable in practice.

CVAug 7, 2021
Neighborhood Consensus Contrastive Learning for Backward-Compatible Representation

Shengsen Wu, Liang Chen, Yihang Lou et al.

In object re-identification (ReID), the development of deep learning techniques often involves model updates and deployment. It is unbearable to re-embedding and re-index with the system suspended when deploying new models. Therefore, backward-compatible representation is proposed to enable "new" features to be compared with "old" features directly, which means that the database is active when there are both "new" and "old" features in it. Thus we can scroll-refresh the database or even do nothing on the database to update. The existing backward-compatible methods either require a strong overlap between old and new training data or simply conduct constraints at the instance level. Thus they are difficult in handling complicated cluster structures and are limited in eliminating the impact of outliers in old embeddings, resulting in a risk of damaging the discriminative capability of new features. In this work, we propose a Neighborhood Consensus Contrastive Learning (NCCL) method. With no assumptions about the new training data, we estimate the sub-cluster structures of old embeddings. A new embedding is constrained with multiple old embeddings in both embedding space and discrimination space at the sub-class level. The effect of outliers diminished, as the multiple samples serve as "mean teachers". Besides, we also propose a scheme to filter the old embeddings with low credibility, further improving the compatibility robustness. Our method ensures backward compatibility without impairing the accuracy of the new model. And it can even improve the new model's accuracy in most scenarios.

CVJun 29, 2021
Inconspicuous Adversarial Patches for Fooling Image Recognition Systems on Mobile Devices

Tao Bai, Jinqi Luo, Jun Zhao

Deep learning based image recognition systems have been widely deployed on mobile devices in today's world. In recent studies, however, deep learning models are shown vulnerable to adversarial examples. One variant of adversarial examples, called adversarial patch, draws researchers' attention due to its strong attack abilities. Though adversarial patches achieve high attack success rates, they are easily being detected because of the visual inconsistency between the patches and the original images. Besides, it usually requires a large amount of data for adversarial patch generation in the literature, which is computationally expensive and time-consuming. To tackle these challenges, we propose an approach to generate inconspicuous adversarial patches with one single image. In our approach, we first decide the patch locations basing on the perceptual sensitivity of victim models, then produce adversarial patches in a coarse-to-fine way by utilizing multiple-scale generators and discriminators. The patches are encouraged to be consistent with the background images with adversarial training while preserving strong attack abilities. Our approach shows the strong attack abilities in white-box settings and the excellent transferability in black-box settings through extensive experiments on various models with different architectures and training methods. Compared to other adversarial patches, our adversarial patches hold the most negligible risks to be detected and can evade human observations, which is supported by the illustrations of saliency maps and results of user evaluations. Lastly, we show that our adversarial patches can be applied in the physical world.

LGFeb 2, 2021
Recent Advances in Adversarial Training for Adversarial Robustness

Tao Bai, Jinqi Luo, Jun Zhao et al.

Adversarial training is one of the most effective approaches defending against adversarial examples for deep learning models. Unlike other defense strategies, adversarial training aims to promote the robustness of models intrinsically. During the last few years, adversarial training has been studied and discussed from various aspects. A variety of improvements and developments of adversarial training are proposed, which were, however, neglected in existing surveys. For the first time in this survey, we systematically review the recent progress on adversarial training for adversarial robustness with a novel taxonomy. Then we discuss the generalization problems in adversarial training from three perspectives. Finally, we highlight the challenges which are not fully tackled and present potential future directions.

LGNov 3, 2020
Recent Advances in Understanding Adversarial Robustness of Deep Neural Networks

Tao Bai, Jinqi Luo, Jun Zhao

Adversarial examples are inevitable on the road of pervasive applications of deep neural networks (DNN). Imperceptible perturbations applied on natural samples can lead DNN-based classifiers to output wrong prediction with fair confidence score. It is increasingly important to obtain models with high robustness that are resistant to adversarial examples. In this paper, we survey recent advances in how to understand such intriguing property, i.e. adversarial robustness, from different perspectives. We give preliminary definitions on what adversarial attacks and robustness are. After that, we study frequently-used benchmarks and mention theoretically-proved bounds for adversarial robustness. We then provide an overview on analyzing correlations among adversarial robustness and other critical indicators of DNN models. Lastly, we introduce recent arguments on potential costs of adversarial training which have attracted wide attention from the research community.

LGSep 21, 2020
Feature Distillation With Guided Adversarial Contrastive Learning

Tao Bai, Jinnan Chen, Jun Zhao et al.

Deep learning models are shown to be vulnerable to adversarial examples. Though adversarial training can enhance model robustness, typical approaches are computationally expensive. Recent works proposed to transfer the robustness to adversarial attacks across different tasks or models with soft labels.Compared to soft labels, feature contains rich semantic information and holds the potential to be applied to different downstream tasks. In this paper, we propose a novel approach called Guided Adversarial Contrastive Distillation (GACD), to effectively transfer adversarial robustness from teacher to student with features. We first formulate this objective as contrastive learning and connect it with mutual information. With a well-trained teacher model as an anchor, students are expected to extract features similar to the teacher. Then considering the potential errors made by teachers, we propose sample reweighted estimation to eliminate the negative effects from teachers. With GACD, the student not only learns to extract robust features, but also captures structural knowledge from the teacher. By extensive experiments evaluating over popular datasets such as CIFAR-10, CIFAR-100 and STL-10, we demonstrate that our approach can effectively transfer robustness across different models and even different tasks, and achieve comparable or better results than existing methods. Besides, we provide a detailed analysis of various methods, showing that students produced by our approach capture more structural knowledge from teachers and learn more robust features under adversarial attacks.

CVSep 21, 2020
Generating Adversarial yet Inconspicuous Patches with a Single Image

Jinqi Luo, Tao Bai, Jun Zhao

Deep neural networks have been shown vulnerable toadversarial patches, where exotic patterns can resultin models wrong prediction. Nevertheless, existing ap-proaches to adversarial patch generation hardly con-sider the contextual consistency between patches andthe image background, causing such patches to be eas-ily detected and adversarial attacks to fail. On the otherhand, these methods require a large amount of data fortraining, which is computationally expensive. To over-come these challenges, we propose an approach to gen-erate adversarial yet inconspicuous patches with onesingle image. In our approach, adversarial patches areproduced in a coarse-to-fine way with multiple scalesof generators and discriminators. Contextual informa-tion is encoded during the Min-Max training to makepatches consistent with surroundings. The selection ofpatch location is based on the perceptual sensitivity ofvictim models. Through extensive experiments, our ap-proach shows strong attacking ability in both the white-box and black-box setting. Experiments on saliency de-tection and user evaluation indicate that our adversar-ial patches can evade human observations, demonstratethe inconspicuousness of our approach. Lastly, we showthat our approach preserves the attack ability in thephysical world.

LGFeb 6, 2020
AI-GAN: Attack-Inspired Generation of Adversarial Examples

Tao Bai, Jun Zhao, Jinlin Zhu et al.

Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding imperceptible perturbations to inputs. Recently different attacks and strategies have been proposed, but how to generate adversarial examples perceptually realistic and more efficiently remains unsolved. This paper proposes a novel framework called Attack-Inspired GAN (AI-GAN), where a generator, a discriminator, and an attacker are trained jointly. Once trained, it can generate adversarial perturbations efficiently given input images and target classes. Through extensive experiments on several popular datasets \eg MNIST and CIFAR-10, AI-GAN achieves high attack success rates and reduces generation time significantly in various settings. Moreover, for the first time, AI-GAN successfully scales to complicated datasets \eg CIFAR-100 with around $90\%$ success rates among all classes.

CRNov 27, 2019
Reviewing and Improving the Gaussian Mechanism for Differential Privacy

Jun Zhao, Teng Wang, Tao Bai et al.

Differential privacy provides a rigorous framework to quantify data privacy, and has received considerable interest recently. A randomized mechanism satisfying $(ε, δ)$-differential privacy (DP) roughly means that, except with a small probability $δ$, altering a record in a dataset cannot change the probability that an output is seen by more than a multiplicative factor $e^ε $. A well-known solution to $(ε, δ)$-DP is the Gaussian mechanism initiated by Dwork et al. [1] in 2006 with an improvement by Dwork and Roth [2] in 2014, where a Gaussian noise amount $\sqrt{2\ln \frac{2}δ} \times \fracΔε$ of [1] or $\sqrt{2\ln \frac{1.25}δ} \times \fracΔε$ of [2] is added independently to each dimension of the query result, for a query with $\ell_2$-sensitivity $Δ$. Although both classical Gaussian mechanisms [1,2] assume $0 < ε\leq 1$, our review finds that many studies in the literature have used the classical Gaussian mechanisms under values of $ε$ and $δ$ where the added noise amounts of [1,2] do not achieve $(ε,δ)$-DP. We obtain such result by analyzing the optimal noise amount $σ_{DP-OPT}$ for $(ε,δ)$-DP and identifying $ε$ and $δ$ where the noise amounts of classical mechanisms are even less than $σ_{DP-OPT}$. Since $σ_{DP-OPT}$ has no closed-form expression and needs to be approximated in an iterative manner, we propose Gaussian mechanisms by deriving closed-form upper bounds for $σ_{DP-OPT}$. Our mechanisms achieve $(ε,δ)$-DP for any $ε$, while the classical mechanisms [1,2] do not achieve $(ε,δ)$-DP for large $ε$ given $δ$. Moreover, the utilities of our mechanisms improve those of [1,2] and are close to that of the optimal yet more computationally expensive Gaussian mechanism.

CRJul 11, 2019
Conditional Analysis for Key-Value Data with Local Differential Privacy

Lin Sun, Jun Zhao, Xiaojun Ye et al.

Local differential privacy (LDP) has been deemed as the de facto measure for privacy-preserving distributed data collection and analysis. Recently, researchers have extended LDP to the basic data type in NoSQL systems: the key-value data, and show its feasibilities in mean estimation and frequency estimation. In this paper, we develop a set of new perturbation mechanisms for key-value data collection and analysis under the strong model of local differential privacy. Since many modern machine learning tasks rely on the availability of conditional probability or the marginal statistics, we then propose the conditional frequency estimation method for key analysis and the conditional mean estimation for value analysis in key-value data. The released statistics with conditions can further be used in learning tasks. Extensive experiments of frequency and mean estimation on both synthetic and real-world datasets validate the effectiveness and accuracy of the proposed key-value perturbation mechanisms against the state-of-art competitors.

CVJun 24, 2015
Targeting Ultimate Accuracy: Face Recognition via Deep Embedding

Jingtuo Liu, Yafeng Deng, Tao Bai et al.

Face Recognition has been studied for many decades. As opposed to traditional hand-crafted features such as LBP and HOG, much more sophisticated features can be learned automatically by deep learning methods in a data-driven way. In this paper, we propose a two-stage approach that combines a multi-patch deep CNN and deep metric learning, which extracts low dimensional but very discriminative features for face verification and recognition. Experiments show that this method outperforms other state-of-the-art methods on LFW dataset, achieving 99.77% pair-wise verification accuracy and significantly better accuracy under other two more practical protocols. This paper also discusses the importance of data size and the number of patches, showing a clear path to practical high-performance face recognition systems in real world.