Academic Journal Article

Defending Adversarial Attacks via Semantic Feature Manipulation
Document Type
Periodical
Source
IEEE Transactions on Services Computing, 15(6):3184-3197, Jan. 2022
Subject
Computing and Processing
General Topics for Engineers
Resistance
Manifolds
Image reconstruction
Feature extraction
Semantics
Faces
Decoding
Adversarial attacks
artificial intelligence
defense
latent representation
security
Language
English
ISSN
1939-1374
2372-0204
Abstract
Machine learning models have demonstrated vulnerability to adversarial attacks, specifically the misclassification of adversarial examples. In this article, we propose a one-off, attack-agnostic Feature Manipulation (FM)-Defense to detect and purify adversarial examples in an interpretable and efficient manner. The intuition is that the classification result of a normal image is generally resistant to non-significant changes in intrinsic features, e.g., varying the thickness of handwritten digits, whereas adversarial examples are sensitive to such changes because the perturbation lacks transferability. To enable manipulation of features, a Combo-variational autoencoder (Combo-VAE) is applied to learn disentangled latent codes that reveal semantic features. Resistance to classification change over the morphs, which are derived by varying and reconstructing the latent codes, is used to detect suspicious inputs. Furthermore, the Combo-VAE is enhanced to purify adversarial examples with high quality by considering both class-shared and class-unique features. We empirically demonstrate the effectiveness of detection and the quality of the purified instances. Experiments on three datasets show that FM-Defense can detect nearly 100 percent of adversarial examples produced by different state-of-the-art adversarial attacks, and it achieves more than 99 percent overall purification accuracy on suspicious instances that lie close to the manifold of clean examples.
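The detection step described in the abstract can be illustrated with a short sketch: encode an input into latent codes, nudge one code at a time, decode the resulting morphs, and flag the input when its predicted label flips too often. This is a minimal PyTorch sketch of that idea under stated assumptions, not the authors' implementation; the encoder/decoder/classifier stand-ins, the perturbation scale `delta`, and the resistance threshold `tau` are all hypothetical placeholders.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def detect_by_resistance(x, encoder, decoder, classifier,
                         delta=0.5, tau=0.8):
    """Flag `x` as suspicious when its prediction flips under morphs.

    Morphs are produced by nudging one latent code at a time and
    decoding, mimicking non-significant semantic feature changes
    (e.g., stroke thickness). Per the abstract's intuition, normal
    inputs tend to keep their label while adversarial ones flip.
    """
    z = encoder(x)                        # (1, latent_dim); assumed to be the mean code of a VAE
    base_label = classifier(x).argmax(dim=1)

    stable = 0
    n_morphs = 2 * z.shape[1]             # +/- delta for each latent dimension
    for i in range(z.shape[1]):
        for sign in (+1.0, -1.0):
            z_morph = z.clone()
            z_morph[:, i] += sign * delta
            morph = decoder(z_morph)      # reconstructed morph image
            if classifier(morph).argmax(dim=1) == base_label:
                stable += 1

    resistance = stable / n_morphs        # fraction of label-preserving morphs
    return resistance < tau               # True -> suspicious (likely adversarial)

# Toy stand-ins (untrained, for shape-checking only), assuming
# 28x28 grayscale inputs and an 8-dimensional latent space:
encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 8))
decoder = nn.Sequential(nn.Linear(8, 784), nn.Unflatten(1, (1, 28, 28)))
classifier = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
x = torch.rand(1, 1, 28, 28)
print(detect_by_resistance(x, encoder, decoder, classifier))
```

In this reading, the threshold `tau` trades off false positives against missed attacks; the article's reported near-100 percent detection suggests the resistance gap between clean and adversarial inputs is large once the latent codes are disentangled.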