Academic Journal Article

Capacity Abuse Attack of Deep Learning Models Without Need of Label Encodings
Document Type
Periodical
Source
IEEE Transactions on Artificial Intelligence, 5(2):814-826, Feb. 2024
Subject
Computing and Processing
Data models
Training
Encoding
Closed box
Data centers
Deep learning
Training data
Backdoor attack
black-box attack
data privacy
machine learning (ML)
Language
English
ISSN
2691-4581
Abstract
In recent years, machine learning (ML) models, especially deep learning models, have become commodities. In this context, data centers that hold large amounts of data often buy ML models from ML model providers, train them locally on their own data, and use the trained models to provide intelligent services. Existing work has shown that this setting carries a risk of data leakage with potentially incalculable consequences. Even under the black-box condition, some attacks can still steal the private data held by data centers, and the capacity abuse attack (CAA) is the state-of-the-art attack method. CAA attackers steal the training data by labeling malicious samples with the data to be stolen. However, the label encodings are usually mapped into other output forms, such as categories, and the adversary cannot know the mapping relationship between the form output by the trained model and the label encodings. Without this mapping relationship, CAA becomes invalid. To address this limitation of CAA, this study proposes a novel practical attack method, capacity abuse attack II (CAAII), which can find the mapping relationship between the output returned by the trained model, in an arbitrary form, and the values of the stolen data. Experiments are conducted on the MNIST, Fashion-MNIST, and CIFAR10 datasets, and the results show that no matter what output form the model returns, our attack method can always find the mapping relationship and successfully steal the training data.
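The abstract only sketches the capacity-abuse idea, so the snippet below is a minimal, hypothetical illustration of that general principle: private data are encoded into the labels of deterministically generated malicious samples, which the attacker can later query to read the labels back. All names, shapes, and parameters are assumptions made for illustration; this is not the paper's CAAII procedure, which additionally recovers the mapping between arbitrary model outputs and the label encodings.

```python
# Illustrative sketch of the general capacity-abuse idea, NOT the paper's
# exact CAA/CAAII method. All constants and helper names are hypothetical.
# A secret byte string is split into label-sized symbols and attached as
# labels to pseudorandom "malicious" samples; after training, querying the
# model on the regenerated samples reveals the labels and hence the secret.

import numpy as np

NUM_CLASSES = 10        # assumed label space (e.g., a 10-class task like MNIST)
IMG_SHAPE = (28, 28)    # assumed input shape

def secret_to_labels(secret: bytes, num_classes: int = NUM_CLASSES) -> list[int]:
    """Encode the secret as a sequence of symbols that fit the label space."""
    bits_per_label = int(np.floor(np.log2(num_classes)))   # 3 bits per label for 10 classes
    bitstring = ''.join(f'{b:08b}' for b in secret)
    bitstring += '0' * ((-len(bitstring)) % bits_per_label)  # pad to a whole number of chunks
    return [int(bitstring[i:i + bits_per_label], 2)
            for i in range(0, len(bitstring), bits_per_label)]

def make_malicious_samples(labels: list[int], seed: int = 0):
    """Generate deterministic pseudorandom inputs, one per encoded label.

    Deterministic generation lets the attacker regenerate exactly the same
    query inputs at extraction time without storing them."""
    rng = np.random.default_rng(seed)
    xs = rng.random((len(labels), *IMG_SHAPE)).astype(np.float32)
    ys = np.array(labels, dtype=np.int64)
    return xs, ys

if __name__ == "__main__":
    secret = b"4111-1111"                 # toy stand-in for private training data
    labels = secret_to_labels(secret)
    xs, ys = make_malicious_samples(labels)
    # The attacker would mix (xs, ys) into the victim's training set; later,
    # predictions on the regenerated xs reveal ys and thus the encoded secret.
    print(xs.shape, ys[:10])
```

As the abstract notes, this simple scheme breaks down when the trained model returns its predictions in an arbitrary form whose correspondence to the original label encodings is unknown; recovering that correspondence is the problem CAAII targets.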