A useful taxonomy for adversarial robustness of Neural Networks
Abstract
Adversarial attacks and defenses are currently active areas of research for the deep learning community. A recent review paper divided the defense approaches into three categories: gradient masking, robust optimization, and adversarial example detection. We divide gradient masking and robust optimization differently: (1) increasing intra-class compactness and inter-class separation of the feature vectors improves adversarial robustness, and (2) marginalization or removal of non-robust image features also improves adversarial robustness. By reframing these topics, we provide a fresh perspective that gives insight into the underlying factors that enable the training of more robust networks and that can help inspire novel solutions. In addition, several papers in the adversarial defense literature claim that there is a cost to adversarial robustness, or a trade-off between robustness and accuracy; under the proposed taxonomy, we hypothesize that this is not universal. We follow this with several challenges to the deep learning research community that build on the connections and insights in this paper.
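To make the two categories concrete, the following is a minimal, hypothetical sketch, not the method proposed in the paper: a center-loss style regularizer in the spirit of Wen et al. (2016) and Qi & Su (2017), cited in the reference list, where a learnable per-class center pulls each feature vector toward its own class (intra-class compactness) while the usual cross-entropy term pushes classes apart (inter-class separation). The small helper afterwards crudely illustrates category (2) by blurring the input to discard some high-frequency content, in the spirit of Zhang et al. (2019). PyTorch is assumed as the framework, and names such as CompactnessLoss and suppress_high_frequencies are illustrative only.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CompactnessLoss(nn.Module):
    # Cross-entropy plus a center-loss style term (illustrative sketch only).
    def __init__(self, num_classes: int, feat_dim: int, lam: float = 0.1):
        super().__init__()
        # One learnable center per class in feature space.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.lam = lam  # weight of the intra-class compactness term

    def forward(self, features, logits, labels):
        # Cross-entropy encourages inter-class separation of the logits.
        ce = F.cross_entropy(logits, labels)
        # Squared distance of each feature vector to its own class center
        # encourages intra-class compactness.
        centers_batch = self.centers[labels]            # (batch, feat_dim)
        compact = ((features - centers_batch) ** 2).sum(dim=1).mean()
        return ce + self.lam * compact

def suppress_high_frequencies(images, k: int = 3):
    # Crude stand-in for category (2): a simple blur that removes some of the
    # high-frequency content an attacker can exploit as a non-robust feature.
    return F.avg_pool2d(images, kernel_size=k, stride=1, padding=k // 2)

In training, the combined loss would replace plain cross-entropy, e.g. loss = CompactnessLoss(10, 128)(feats, logits, labels), and the blur could be applied to inputs before the forward pass. Both pieces are only meant to ground the two categories in code, not to reproduce any specific defense from the references.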
Article Details
Copyright (c) 2020 Smith LN.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Licensing and protecting author rights are the central aim and core of the publishing business. Peertechz is dedicated to making it easier for people to share and build upon the work of others while remaining consistent with the rules of copyright. Peertechz licensing terms are formulated to facilitate the reuse of manuscripts published in its journals, so as to take maximum advantage of Open Access publication for the purpose of disseminating knowledge.
We support 'libre' open access, which defines Open Access in true terms as free-of-charge online access along with usage rights. The usage rights are granted through the use of a specific Creative Commons license.
Peertechz complies with [CC BY 4.0].
Explanation
'CC' stands for Creative Commons license. 'BY' indicates that users must give attribution to the creator when the published manuscripts are used or shared. This license allows for redistribution, commercial and non-commercial, as long as the work is passed along unchanged and in whole, with credit to the author.
Please note that Creative Commons user licenses are non-revocable. We recommend that authors check whether their funding body requires a specific license.
Under this license, authors who publish with Peertechz may share their research by posting a free draft copy of their article to any repository or website.
'CC BY' license observance:
License Name | Permission to read and download | Permission to display in a repository | Permission to translate | Commercial uses of manuscript
CC BY 4.0    | Yes                             | Yes                                   | Yes                      | Yes
Authors should note that the Creative Commons license is focused on making creative works available for discovery and reuse. Creative Commons licenses provide an alternative to standard copyright, allowing authors to specify the ways in which their works can be used without having to grant permission for each individual request. Authors who want to reserve all of their rights under copyright law should not use CC licenses.
References
Xu H, Ma Y, Liu H, Deb D, Liu H, et al. (2019) Adversarial attacks and defenses in images, graphs and text: A review. Link: https://bit.ly/31kxaQv
Guo C, Rana M, Cisse M, van der Maaten L (2017) Countering adversarial images using input transformations. Link: https://bit.ly/31hR7re
Buckman J, Roy A, Raffel C, Goodfellow I (2018) Thermometer encoding: One hot way to resist adversarial examples. Link: https://bit.ly/3i5QQ1t
Engstrom L, Ilyas A, Athalye A (2018) Evaluating and understanding the robustness of adversarial logit pairing. Link: https://bit.ly/3fBm5jq
Papernot N, McDaniel P, Wu X, Jha S, Swami A (2016) Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP). 582-597. IEEE. Link: https://bit.ly/31oZeSO
Tramer F, Kurakin A, Papernot N, Goodfellow I, Boneh D, et al. (2017) Ensemble adversarial training: Attacks and defenses. Link: https://bit.ly/2BXdWrw
Dhillon GS, Azizzadenesheli K, Lipton ZC, Bernstein J, Kossaifi J, et al. (2018) Stochastic activation pruning for robust adversarial defense. Link: https://bit.ly/2DjwhzG
Feinman R, Curtin RR, Shintre S, Gardner AB (2017) Detecting adversarial samples from artifacts. Link: https://bit.ly/31lFVd3
Song Y, Kim T, Nowozin S, Ermon S, Kushman N (2017) PixelDefend: Leveraging generative models to understand and defend against adversarial examples. Link: https://bit.ly/3i8bw8X
Samangouei P, Kabkab M, Chellappa R (2018) Defense-GAN: Protecting classifiers against adversarial attacks using generative models. Link: https://bit.ly/33yM6gV
Athalye A, Carlini N, Wagner D (2018) Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. Link: https://bit.ly/2Dl8Gie
Goodfellow IJ, Shlens J, Szegedy C (2013) Explaining and harnessing adversarial examples. Link: https://bit.ly/2Xu8y74
Jakubovitz D, Giryes R (2018) Improving DNN robustness to adversarial attacks using Jacobian regularization. In Proceedings of the European Conference on Computer Vision (ECCV). 514-529. Link: https://bit.ly/2Xv0eDV
Carlini N, Katz G, Barrett C, Dill DL (2017) Provably minimally-distorted adversarial examples. Link: https://bit.ly/2DoHbUO
Carlini N, Wagner D (2017) Adversarial examples are not easily detected: Bypassing ten detection methods. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. 3-14. Link: https://bit.ly/2XuZi2n
Schmidt L, Santurkar S, Tsipras D, Talwar K, Madry A (2018) Adversarially robust generalization requires more data. In Advances in Neural Information Processing Systems. 5014-5026. Link: https://bit.ly/33yMfkt
Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A (2017) Towards deep learning models resistant to adversarial attacks. Link: https://bit.ly/3i8Vhsb
Tsipras D, Santurkar S, Engstrom L, Turner A, Madry A (2018) Robustness may be at odds with accuracy. Link: https://bit.ly/30rMifK
Kurakin A, Goodfellow I, Bengio S (2016) Adversarial examples in the physical world. Link: https://bit.ly/3k71wP4
Nakkiran P (2019) Adversarial robustness may be at odds with simplicity. Link: https://bit.ly/2XpZJLi
Zhang Z, Jung C, Liang X (2019) Adversarial defense by suppressing high-frequency components. Link: https://bit.ly/3i6cXoo
Pang T, Xu K, Dong Y, Du C, Chen N, et al. (2019) Rethinking softmax cross-entropy loss for adversarial robustness. Link: https://bit.ly/2DzZWEM
Mustafa A, Khan S, Hayat M, Goecke R, Shen J, et al. (2019) Adversarial defense by restricting the hidden space of deep neural networks. Link: https://bit.ly/30tu1Pj
Mao C, Zhong Z, Yang J, Vondrick C, Ray B (2019) Metric learning for adversarial robustness. Link: https://bit.ly/2Xwl85s
Qi C, Su F (2017) Contrastive-center loss for deep neural networks. In 2017 IEEE International Conference on Image Processing (ICIP). 2851-2855. Link: https://bit.ly/2XvmddY
Wen Y, Zhang K, Li Z, Qiao Y (2016) A discriminative feature learning approach for deep face recognition. In European Conference on Computer Vision. 499-515. Link: https://bit.ly/2PswWS1
Wu K, Yu Y (2019) Understanding adversarial robustness: The trade-off between minimum and average margin. Link: https://bit.ly/2Xu5ebW
Galloway A, Golubeva A, Tanay T, Moussa M, Taylor GW (2019) Batch normalization is a cause of adversarial vulnerability. Link: https://bit.ly/31hR2Us
Carlini N, Wagner D (2016) Defensive distillation is not robust to adversarial examples. Link: https://bit.ly/3kdrdh3
Sabour S, Cao Y, Faghri F, Fleet DJ (2015) Adversarial manipulation of deep representations. Link: https://bit.ly/2XrWbYW
Oh Song H, Xiang Y, Jegelka S, Savarese S (2016) Deep metric learning via lifted structured feature embedding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4004-4012. Link: https://bit.ly/31jHf0e
Luo Y, Wong Y, Kankanhalli M, Zhao Q (2019) G-softmax: Improving intraclass compactness and interclass separability of features. IEEE Transactions on Neural Networks and Learning Systems. Link: https://bit.ly/3icnui6
He L, Wang Z, Li Y, Wang S (2019) Softmax dissection: Towards understanding intra- and inter-class objective for embedding learning. Link: https://bit.ly/3gvAj6A
Ilyas A, Santurkar S, Tsipras D, Engstrom L, Tran B, et al. (2019) Adversarial examples are not bugs, they are features. Link: https://bit.ly/3ic7QTw
Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, et al. (2013) Intriguing properties of neural networks. Link: https://bit.ly/2DdBwBf
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT press. Link: https://bit.ly/3fsgCen
Sun K, Zhu Z, Lin Z (2019) Towards understanding adversarial examples systematically: Exploring data size, task and model factors. Link: https://bit.ly/3fz4WGJ
He W, Wei J, Chen X, Carlini N, Song D (2017) Adversarial example defense: Ensembles of weak defenses are not strong. In 11th USENIX Workshop on Offensive Technologies. Link: https://bit.ly/33tM9dx
Carlini N, Wagner D (2017) Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP). IEEE 39-57. Link: https://bit.ly/31lFwax