AdCorDA: Classifier Refinement via Adversarial Correction and Domain Adaptation
This work addresses the need for improved classifier performance and robustness, particularly for weight-quantized networks, though it appears incremental because it builds on existing adversarial-attack and domain-adaptation techniques.
The paper refines pretrained classifiers with AdCorDA, a two-stage method that applies adversarial correction followed by domain adaptation. It reports an accuracy boost of over 5% on CIFAR-100 and enhanced robustness to adversarial attacks.
This paper describes a simple yet effective technique for refining a pretrained classifier network. The proposed AdCorDA method modifies the training set and exploits the duality between network weights and layer inputs, an approach we call input space training. The method consists of two stages: adversarial correction followed by domain adaptation. In the first stage, adversarial attacks are used to correct the training-set samples that the network classifies incorrectly. These misclassified samples are removed from the training set and replaced with their adversarially corrected versions to form a new training set. In the second stage, domain adaptation is performed from this new training set back to the original training set. Extensive experiments show significant accuracy boosts of over 5% on the CIFAR-100 dataset. The technique can be applied straightforwardly to the refinement of weight-quantized neural networks, where experiments show substantial improvement over the baseline. The adversarial correction technique also enhances robustness to adversarial attacks.
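The adversarial correction stage can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the classifier is a toy linear softmax model with fixed weights `W`, the attack is an iterative signed-gradient step on the input that lowers the cross-entropy loss for the true label, and all function names (`adversarially_correct`, `build_corrected_set`, and so on) are hypothetical. The abstract does not say which attack or which domain adaptation method is used, so the second stage is not shown.

```python
import math

def scores(W, x):
    # Class scores of a linear model: one dot product per weight row.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def predict(W, x):
    s = scores(W, x)
    return max(range(len(s)), key=s.__getitem__)

def softmax(s):
    m = max(s)
    e = [math.exp(v - m) for v in s]
    z = sum(e)
    return [v / z for v in e]

def input_gradient(W, x, y):
    # Gradient of the cross-entropy loss with respect to the input x.
    # For a linear softmax model this is W^T (p - onehot(y)).
    p = softmax(scores(W, x))
    p[y] -= 1.0
    return [sum(p[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def adversarially_correct(W, x, y, step=0.05, max_iters=100):
    # Assumed attack: signed-gradient steps on the input (weights stay fixed)
    # until the model assigns the true label y or the budget runs out.
    x = list(x)
    for _ in range(max_iters):
        if predict(W, x) == y:
            break
        g = input_gradient(W, x, y)
        x = [xi - step * sign(gi) for xi, gi in zip(x, g)]
    return x

def build_corrected_set(W, data):
    # Keep correctly classified samples; replace misclassified ones
    # with their adversarially corrected versions.
    corrected = []
    for x, y in data:
        if predict(W, x) == y:
            corrected.append((list(x), y))
        else:
            corrected.append((adversarially_correct(W, x, y), y))
    return corrected

# Toy data: the last two samples are misclassified by W.
W = [[1.0, 0.0], [0.0, 1.0]]
data = [([2.0, 0.5], 0), ([0.2, 1.0], 1), ([0.4, 1.0], 0), ([1.0, 0.7], 1)]
new_data = build_corrected_set(W, data)
```

In this sketch every sample in `new_data` is classified correctly by the unchanged model, which is the property the new training set is meant to have before domain adaptation back to the original set. A real network would need an attack suited to its architecture and a bound on the perturbation size.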