Hyun Kwon, Yongchul Kim, H. Yoon
Deep neural networks (DNNs) have useful applications in machine learning tasks involving recognition and pattern analysis. Despite the favorable applications of DNNs, these systems can be exploited by adversarial examples. An adversarial example, which is created by adding a small amount of noise to an original sample, can cause misclassification by the DNN. Under specific circumstances, it may be necessary to create a selective untargeted adversarial example that will not be classified as certain avoided classes. Such is the case, for example, if a modified tank cover can cause misclassification by a DNN, but the bandit equipped with the DNN must misclassify the modified tank as a class other than certain avoided classes, such as a tank, armored vehicle, or self-propelled gun. That is, selective untargeted adversarial examples are needed that will not be perceived as certain classes, such as tanks, armored vehicles, or self-propelled guns. In this study, we propose a selective untargeted adversarial example that exhibits 100% attack success with minimum distortions. The proposed scheme creates a selective untargeted adversarial example that will not be classified as certain avoided classes while minimizing distortions in the original sample. To generate untargeted adversarial examples, a transformation is performed to minimize the probability of certain avoided classes and distortions in the original sample. As experimental datasets, we used MNIST and CIFAR-10, including the Tensorflow library. The experimental results demonstrate that the proposed scheme creates a selective untargeted adversarial example that exhibits 100% attack success with minimum distortions (1.325 and 34.762 for MNIST and CIFAR-10, respectively).