Omer Faruk Tuna

LG
3papers
47citations
Novelty52%
AI Score24

3 Papers

LGFeb 15, 2022
Unreasonable Effectiveness of Last Hidden Layer Activations for Adversarial Robustness

Omer Faruk Tuna, Ferhat Ozgur Catak, M. Taner Eskil

In standard Deep Neural Network (DNN) based classifiers, the general convention is to omit the activation function in the last (output) layer and directly apply the softmax function on the logits to get the probability scores of each class. In this type of architectures, the loss value of the classifier against any output class is directly proportional to the difference between the final probability score and the label value of the associated class. Standard White-box adversarial evasion attacks, whether targeted or untargeted, mainly try to exploit the gradient of the model loss function to craft adversarial samples and fool the model. In this study, we show both mathematically and experimentally that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases, preventing attackers from exploiting the model's loss function to craft adversarial samples. We've experimentally verified the efficacy of our approach on MNIST (Digit), CIFAR10 datasets. Detailed experiments confirmed that our approach substantially improves robustness against gradient-based targeted and untargeted attack threats. And, we showed that the increased non-linearity at the output layer has some additional benefits against some other attack methods like Deepfool attack.

LGFeb 8, 2021
Exploiting epistemic uncertainty of the deep learning models to generate adversarial samples

Omer Faruk Tuna, Ferhat Ozgur Catak, M. Taner Eskil

Deep neural network architectures are considered to be robust to random perturbations. Nevertheless, it was shown that they could be severely vulnerable to slight but carefully crafted perturbations of the input, termed as adversarial samples. In recent years, numerous studies have been conducted in this new area called "Adversarial Machine Learning" to devise new adversarial attacks and to defend against these attacks with more robust DNN architectures. However, almost all the research work so far has been concentrated on utilising model loss function to craft adversarial examples or create robust models. This study explores the usage of quantified epistemic uncertainty obtained from Monte-Carlo Dropout Sampling for adversarial attack purposes by which we perturb the input to the areas where the model has not seen before. We proposed new attack ideas based on the epistemic uncertainty of the model. Our results show that our proposed hybrid attack approach increases the attack success rates from 82.59% to 85.40%, 82.86% to 89.92% and 88.06% to 90.03% on MNIST Digit, MNIST Fashion and CIFAR-10 datasets, respectively.

LGDec 11, 2020
Closeness and Uncertainty Aware Adversarial Examples Detection in Adversarial Machine Learning

Omer Faruk Tuna, Ferhat Ozgur Catak, M. Taner Eskil

While state-of-the-art Deep Neural Network (DNN) models are considered to be robust to random perturbations, it was shown that these architectures are highly vulnerable to deliberately crafted perturbations, albeit being quasi-imperceptible. These vulnerabilities make it challenging to deploy DNN models in security-critical areas. In recent years, many research studies have been conducted to develop new attack methods and come up with new defense techniques that enable more robust and reliable models. In this work, we explore and assess the usage of different type of metrics for detecting adversarial samples. We first leverage the usage of moment-based predictive uncertainty estimates of a DNN classifier obtained using Monte-Carlo Dropout Sampling. And we also introduce a new method that operates in the subspace of deep features extracted by the model. We verified the effectiveness of our approach on a range of standard datasets like MNIST (Digit), MNIST (Fashion) and CIFAR-10. Our experiments show that these two different approaches complement each other, and the combined usage of all the proposed metrics yields up to 99 \% ROC-AUC scores regardless of the attack algorithm.