Academic Paper

SPY: A Novel Resampling Method for Improving Classification Performance in Imbalanced Data
Document Type
Conference
Source
2015 Seventh International Conference on Knowledge and Systems Engineering (KSE), pp. 280-285, Oct. 2015
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Support vector machines
Training
Bioinformatics
Proteins
Protein engineering
Radio frequency
Sensitivity
Imbalanced dataset
Over-sampling
Under-sampling
SMOTE
borderline-SMOTE
Language
Abstract
In recent years, class-imbalanced datasets have made it difficult to analyze and understand raw data that support decision-making in many domains, especially biomedical data classification. Although a few approaches have achieved promising results by applying class imbalance learning methods, the problem has not yet been fully solved by existing methods. SMOTE is a well-known, general-purpose over-sampling method for this problem; however, in some cases it fails to improve classification performance and sometimes even reduces it. We therefore developed a novel method named SPY. Experimental results on five imbalanced benchmark datasets from the UCI Machine Learning Repository showed that our method achieved better sensitivity and G-mean values than the control method (i.e., no over-sampling), SMOTE, and several modified successors of SMOTE, including safe-level-SMOTE, safe-SMOTE, and borderline-SMOTE.
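The abstract reports results in terms of sensitivity and G-mean without defining them. As a minimal sketch, assuming the usual definitions (sensitivity as the true-positive rate on the minority class, and G-mean as the geometric mean of sensitivity and specificity), the two metrics could be computed as follows. The function name and the toy labels are hypothetical and do not come from the paper.

```python
import math


def sensitivity_specificity_gmean(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, G-mean) for binary labels.

    Assumes the common definition G-mean = sqrt(sensitivity * specificity);
    the record itself does not spell out the formula.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # minority sample correctly detected
            else:
                fn += 1  # minority sample missed
        else:
            if pred == positive:
                fp += 1  # majority sample wrongly flagged
            else:
                tn += 1  # majority sample correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity, math.sqrt(sensitivity * specificity)


# Toy imbalanced example (made-up labels): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec, gmean = sensitivity_specificity_gmean(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(sens, spec, round(gmean, 4), accuracy)
```

On this toy data the accuracy is 0.8 even though only half of the minority samples are detected, which is why imbalanced-learning studies tend to report sensitivity and G-mean rather than accuracy alone.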