Robustness Evaluation and Adversarial Training of an Instance Segmentation Model
This work addresses the robustness of non-classifier models such as instance segmentation networks, which matters for safety-critical applications such as autonomous driving. The contribution is incremental, however, because it adapts existing adversarial training methods to a new model type.
The paper tackles the problem of evaluating and improving the robustness of non-classifier models, specifically an instance segmentation network. It proposes probabilistic local equivalence for robustness evaluation and applies adversarial training with the TRADES loss. The adversarially trained network achieves a symmetric best dice score of 0.85 on the TuSimple lane detection challenge, compared to 0.82 for standard training, and an F-measure of 0.49 on manipulated inputs, compared to 0 for standard training.
To evaluate the robustness of non-classifier models, we propose probabilistic local equivalence, based on the notion of randomized smoothing, as a way to quantitatively evaluate the robustness of an arbitrary function. In addition, to understand the effect of adversarial training on non-classifiers and to investigate the level of robustness that can be obtained without degrading performance on the training distribution, we apply "Fast is Better than Free" adversarial training together with the TRADES robust loss to the training of an instance segmentation network. With this approach, we achieve a symmetric best dice score of 0.85 on the TuSimple lane detection challenge, outperforming the standardly trained network's score of 0.82. Additionally, we obtain an F-measure of 0.49 on manipulated inputs, in contrast to the standardly trained network's score of 0. We show that probabilistic local equivalence successfully distinguishes between standardly trained and adversarially trained models, providing another view of the improved robustness of the adversarially trained models.
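For readers unfamiliar with the symmetric best dice score reported above, the following is a minimal sketch of the metric as it is commonly defined in the instance segmentation literature. It assumes that definition applies here; the abstract itself does not spell it out. The names `dice`, `best_dice`, and `symmetric_best_dice`, and the representation of each instance as a set of pixel coordinates, are illustrative choices and are not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

# Assumption: each instance mask is represented as a set of (row, col) pixels.
Pixel = Tuple[int, int]
Mask = FrozenSet[Pixel]


def dice(a: Mask, b: Mask) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two instance masks."""
    if not a and not b:
        return 1.0  # convention: two empty masks are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def best_dice(source: List[Mask], target: List[Mask]) -> float:
    """Average, over source instances, of the best Dice match in target."""
    if not source:
        return 0.0  # convention chosen for this sketch
    total = 0.0
    for s in source:
        # An empty target list contributes a best match of 0 for every source mask.
        total += max((dice(s, t) for t in target), default=0.0)
    return total / len(source)


def symmetric_best_dice(pred: List[Mask], truth: List[Mask]) -> float:
    """Minimum of the two directed best-dice scores."""
    return min(best_dice(pred, truth), best_dice(truth, pred))


# Toy example: one predicted lane, two ground-truth lanes (one is missed).
pred = [frozenset({(0, 0), (0, 1)})]
truth = [frozenset({(0, 0), (0, 1)}), frozenset({(1, 0)})]
print(symmetric_best_dice(pred, truth))
```

Taking the minimum of the two directions penalizes both missed instances and spurious ones: in the toy example the prediction matches one ground-truth lane perfectly, but the unmatched second lane pulls the ground-truth-to-prediction direction down.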