Academic Paper

Mining Influential Training Data by Tracing Influence on Hard Validation Samples
Document Type
Conference
Source
2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 167-173, Oct. 2022
Subject
Bioengineering
Computing and Processing
Robotics and Control Systems
Training
Deep learning
Training data
Benchmark testing
Data models
Complexity theory
Data mining
training data pruning
hard validation sample
influence value
ISSN
2375-0197
Abstract
The ever-growing size of deep learning models is constantly driven by the ever-growing size of datasets. Mining the influential training data offers a significant payoff: it reduces training time and model complexity, and can potentially increase model accuracy. In this paper, we propose several approaches, such as classifying the validation dataset into easy, medium, and hard levels, and introducing an influence value computed for each training sample with respect to the hard validation data, to co-prune the validation dataset and the training dataset. Empirically, we conclude that the hard portion of the validation data can be used to mine the most influential training data, thereby reducing the training dataset size by 50% without losing accuracy in our experiments.
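The abstract outlines the pipeline but not the exact influence formula. As a minimal sketch of the idea, the following toy example splits validation samples into easy/medium/hard by per-sample loss and scores each training sample with a TracIn-style gradient dot product against the hard validation set; the synthetic data, the logistic-regression model, and the gradient-dot-product proxy are all illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary-classification data (a stand-in for the paper's benchmarks).
n_train, n_val, dim = 200, 60, 5
X_train = rng.normal(size=(n_train, dim))
w_true = rng.normal(size=dim)
y_train = (X_train @ w_true + 0.5 * rng.normal(size=n_train) > 0).astype(float)
X_val = rng.normal(size=(n_val, dim))
y_val = (X_val @ w_true > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train a simple logistic-regression model with gradient descent.
w = np.zeros(dim)
for _ in range(300):
    grad = X_train.T @ (sigmoid(X_train @ w) - y_train) / n_train
    w -= 0.5 * grad

# Step 1: split validation samples into easy / medium / hard by per-sample loss.
p_val = sigmoid(X_val @ w)
val_loss = -(y_val * np.log(p_val + 1e-12)
             + (1 - y_val) * np.log(1 - p_val + 1e-12))
easy_cut, hard_cut = np.quantile(val_loss, [1 / 3, 2 / 3])
hard_idx = np.where(val_loss >= hard_cut)[0]

# Step 2: influence value of each training sample on the hard validation set,
# approximated here as a first-order gradient dot product: a positive score
# means a gradient step on that sample also lowers the hard-validation loss.
train_grads = (sigmoid(X_train @ w) - y_train)[:, None] * X_train
hard_grad = ((sigmoid(X_val[hard_idx] @ w) - y_val[hard_idx])[:, None]
             * X_val[hard_idx]).mean(axis=0)
influence = train_grads @ hard_grad

# Step 3: prune, keeping the 50% of training samples with highest influence.
keep = np.argsort(influence)[::-1][: n_train // 2]
print(f"hard validation samples: {len(hard_idx)}, kept: {len(keep)}")
```

The 50% retention ratio mirrors the reduction reported in the abstract; in practice the cutoff between "influential" and "prunable" training data would be tuned on held-out accuracy.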